Library for exporting CSV

I need to read data from a database, process and transform it, and send it to a user as CSV.

I’ve found 2 libraries so far: csv and nimble_csv. Both look decent.

Among those 2 or among others, what would you recommend?

Or maybe I don’t need one at all?

In my experience nimble_csv should be significantly faster.

I can confirm this, based on bad behaviour I experienced with csv. The time taken to dump a CSV file (600 rows, 574 columns) went from ±210s with csv to ±430ms with nimble_csv. The cause could be that the protocols are not consolidated, or that the default implementation for Any is used instead of specialized ones for the built-in types (Integer, Float, …), which csv does not provide (except for strings). See the closed issues "CSV.encode seems extremely slow" and "Running in production/release slow". I don’t know what the problem is, and I don’t have the time at the moment to explore the matter.

EDIT
I found the time to run some tests :slight_smile: The protocols are consolidated. The problem comes from the default protocol implementation for Any, which uses the implementation for BitString. That implementation does extra work for strings, and that work is not necessary for simpler types. I provided implementations for Integer, Float, Atom (for true and false) and DateTime, and it was quite fast: going from ±210s to ±430ms too :+1:.

# Integers need no escaping, so convert them directly.
defimpl CSV.Encode, for: Integer do
  def encode(data, _env \\ []) do
    Integer.to_string(data)
  end
end

# Same for floats.
defimpl CSV.Encode, for: Float do
  def encode(data, _env \\ []) do
    Float.to_string(data)
  end
end

# Booleans are written as "1"/"0"; any other atom is written by name.
defimpl CSV.Encode, for: Atom do
  def encode(data, env \\ [])
  def encode(true, _env), do: "1"
  def encode(false, _env), do: "0"
  def encode(data, _env) do
    Atom.to_string(data)
  end
end

defimpl CSV.Encode, for: DateTime do
  def encode(data, _env \\ []) do
    DateTime.to_string(data)
  end
end
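
To illustrate why a specialized implementation can skip work that a generic fallback has to do, here is a minimal, self-contained sketch. DemoEncode is a hypothetical protocol made up for this example; it is not csv's actual code, and the escaping rule in the Any implementation is only an assumption about the kind of extra string work involved.

```elixir
# Hypothetical protocol, for illustration only (not part of the csv library).
defprotocol DemoEncode do
  @fallback_to_any true
  @doc "Turns a cell value into its CSV text."
  def encode(value)
end

# Generic fallback: stringify the value, then scan the text for characters
# that would require quoting. Every value without its own impl pays for this.
defimpl DemoEncode, for: Any do
  def encode(value) do
    text = to_string(value)

    if String.contains?(text, [",", "\"", "\n", "\r"]) do
      "\"" <> String.replace(text, "\"", "\"\"") <> "\""
    else
      text
    end
  end
end

# Specialized impl: the digits of an integer never contain a separator or a
# quote, so the scan can be skipped entirely.
defimpl DemoEncode, for: Integer do
  def encode(value), do: Integer.to_string(value)
end

IO.puts(DemoEncode.encode(42))
IO.puts(DemoEncode.encode("a,b"))
```

With this in place, integers are dispatched to the cheap implementation, while strings and any other type still go through the fallback.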

NOTE
I should have mentioned that the times given in the answer include both producing the data using :ets.select (and its continuation technique) and exporting it to CSV.
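
For readers unfamiliar with the continuation technique, here is a rough sketch of how a table can be read in chunks with :ets.select/3 and :ets.select/1. The module name EtsPaging, the chunk size and the demo table are made up for this example; this is not the code that was measured above.

```elixir
defmodule EtsPaging do
  # Lazily streams every object of `table`, fetching `chunk_size` rows per call.
  def stream(table, chunk_size \\ 100) do
    # Match spec that returns each stored object unchanged.
    match_all = [{:_, [], [:"$_"]}]

    Stream.resource(
      # First chunk: returns {rows, continuation} or :"$end_of_table".
      fn -> :ets.select(table, match_all, chunk_size) end,
      fn
        :"$end_of_table" -> {:halt, :done}
        # Emit the current chunk and ask for the next one via the continuation.
        {rows, continuation} -> {rows, :ets.select(continuation)}
      end,
      fn _ -> :ok end
    )
  end
end

table = :ets.new(:demo, [:set])
for i <- 1..250, do: :ets.insert(table, {i, "row #{i}"})

table |> EtsPaging.stream(100) |> Enum.count() |> IO.inspect()
```

Each chunk could then be handed to the CSV encoder as it arrives, so the whole table never has to be held in one list.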

Sounds like that would make a great PR :slight_smile:

Since issue 10, "Provide default implementation of CSV.Encode where appropriate", was closed with a reference to issue 9, "CSV.encode seems extremely slow", I added a comment to that issue. I guess the owner has deliberately left the protocol implementation open like that to force us to implement what we want. IMHO that is counter-intuitive for someone using the library for the first time.