Can´t generate file (csv) with special chars

have you tried starting the file with a utf8 BOM? We are using a BOM at the start of the csv export file and excel is correctly recognizing the char set as uft8 and is therefore displaying all umlauts as it should.

see: https://en.wikipedia.org/wiki/Byte_order_mark

This is what I was just coming in to suggest. Excel is stupid and needs the non-standard (for about anything else nowadays) BOM to mark such CSV files.

The stackoverflow link I posted above discussed BOM and different versions of Excel do not behave consistently with it. Perhaps the versions of Excel someone might care about today is smaller and it is a solution though.

2 Likes

Thanks for all your input. Then BOM seems to be the best solution for newer versions of office.

One more thing, then opening the csv file in Excel, you can select code
page - utf-8 is on that list.

Tallak

I came across this thread when I was having similar issues displaying CSV files in Excel. After some trial-and-error with different encodings and BOM chars I found the best solution was to encode the CSV as tab-delimited UTF-16 LE.

I put together a blog post on the topic here in the hope it helps the next time someone has the same problem:

3 Likes

Thanks a lot for doing this, I’m working through it now.

A couple of tiny typos in there btw:

# missing closing "
NimbleCSV.define(CSVParser, separator: "\t", escape: "\")

  # missing ,
  @user_data [
    ~w(Date Name Age City Comment)
    ~w(2018-08-06 Heike 23 Köln Hallo👋),
   ...
1 Like

Excel friendly handling is now available in nimble_csv itself :slight_smile:

2 Likes

I don’t have Excel but my clients do, so I am operating on slow feedback loop, but I’ve tried to use the NimbleCSV.Spreadsheet parser to no effect.

Here’s my code:

# this is our NimbleCSV parser which we define inline for tab-delimited CSV
NimbleCSV.define(NimbleCSV.Spreadsheet, [])

defmodule CSVSpike.Writer do
  @user_data [
    ~w(Date Time Name Age Price City Comment),
    ~w(2018-08-06 11:00 Heike 23 €2.00 Köln Hallo👋),
    ~w(2018-08-07 21:00 Jürgen 44 £2.00 München 😸Tschüß❤️)
  ]

  def write_csv_file do
    File.write!("users.csv", data_to_csv())
  end

  defp data_to_csv() do
    @user_data
    |> NimbleCSV.Spreadsheet.dump_to_iodata()
  end
end

Does anything look wrong with that?


My clients (both windows and mac users) report this as not formatting correctly in their respective versions of Excel.

These are the settings that are passed by default by the Spreadsheet parser:

NimbleCSV.define(NimbleCSV.Spreadsheet,
  separator: "\t",
  escape: "\"",
  encoding: {:utf16, :little},
  trim_bom: true,
  dump_bom: true
)

Firing up an iex -S mix:

CSVSpike.Writer.write_csv_file() ; File.read!("users.csv")
"Date,Time,Name,Age,Price,City,Comment\n2018-08-06,11:00,Heike,23,€2.00,Köln,Hallo👋\n2018-08-07,21:00,Jürgen,44,£2.00,München,😸Tschüß❤️\n"

Am I supposed to see the BOM prepended here?

Ahh I think I have found the issue. Here it works in iex:

iex(1)> "Date" |> IO.inspect(binaries: :as_binaries)
<<68, 97, 116, 101>>
"Date"
iex(2)> :unicode.encoding_to_bom({:utf16, :little})
<<255, 254>>
iex(3)> [["Date"]] |> NimbleCSV.Spreadsheet.dump_to_iodata()
[<<255, 254>>, [<<68, 0, 97, 0, 116, 0, 101, 0>>, <<10, 0>>]]
iex(4)> [["Date"]] |> NimbleCSV.Spreadsheet.dump_to_iodata() |> Enum.map(& IO.inspect(&1, binaries: :as_strings))
"\xFF\xFE"
["D\0a\0t\0e\0", "\n\0"]
[<<255, 254>>, [<<68, 0, 97, 0, 116, 0, 101, 0>>, <<10, 0>>]]
iex(5)>

Defining the parser at the top of the file was the issue. I don’t quite understand this API but incase anyone else has the same issue, just make sure not to run: NimbleCSV.define(NimbleCSV.Spreadsheet, []) at the top of the file.

iex(5)>
nil
iex(6)>
nil
iex(7)> NimbleCSV.define(NimbleCSV.Spreadsheet, [])
warning: redefining module NimbleCSV.Spreadsheet (current version loaded from _build/dev/lib/nimble_csv/ebin/Elixir.NimbleCSV.Spreadsheet.beam)
  deps/nimble_csv/lib/nimble_csv.ex:211

{:module, NimbleCSV.Spreadsheet,
 <<70, 79, 82, 49, 0, 0, 46, 112, 66, 69, 65, 77, 65, 116, 85, 56, 0, 0, 3, 15,
   0, 0, 0, 69, 28, 69, 108, 105, 120, 105, 114, 46, 78, 105, 109, 98, 108, 101,
   67, 83, 86, 46, 83, 112, 114, 101, 97, ...>>, :ok}
iex(8)> [["Date"]] |> NimbleCSV.Spreadsheet.dump_to_iodata()
[["Date", 10]]
1 Like

NimbleCSV already defines the module for you. NimbleCSV.define is only needed if you need a module with custom settings. You basically overrode the predefined module.