Phoenix JSON API extremely slow when serving CSV files

The stream can be chunked in order to send larger pieces.
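For example, something along these lines in the controller (a rough sketch: `rows/0` and `to_csv_line/1` are hypothetical stand-ins for wherever the data comes from and however a row gets encoded):

def download(conn, _params) do
  conn =
    conn
    |> put_resp_content_type("text/csv")
    |> send_chunked(200)

  rows()
  |> Stream.map(&to_csv_line/1)
  |> Stream.chunk_every(500)
  |> Enum.reduce_while(conn, fn lines, conn ->
    # Each HTTP chunk now carries 500 lines instead of one.
    case chunk(conn, lines) do
      {:ok, conn} -> {:cont, conn}
      {:error, _reason} -> {:halt, conn}
    end
  end)
end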

Yes, I thought about it when I saw the interspersed commas were sent on their own. A quick benchmark could tell how many entries should be buffered before sending each chunk. But our solution is already good enough for @quda, I guess :slight_smile:
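If it helps, a rough benchmark sketch (assuming Benchee is available; it only measures batching and flattening the iodata, not the actual socket writes, so treat the numbers as a first approximation):

rows = Enum.map(1..100_000, fn i -> "#{i},some,csv,fields\n" end)

Benchee.run(%{
  "100 rows per chunk" => fn ->
    rows |> Stream.chunk_every(100) |> Enum.each(&IO.iodata_to_binary/1)
  end,
  "1000 rows per chunk" => fn ->
    rows |> Stream.chunk_every(1000) |> Enum.each(&IO.iodata_to_binary/1)
  end
})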

Thanks @mmmrrr. That's clearer now.

I tried it with Stream.intersperse/2. For very big files it is about 30% slower than the initial solution.

How can I make chunks of, let's say, 50 kB each? Is that possible?


Something like this, untested:

defp delay_chunk(conn, data, buffer, byte_counter) when byte_counter >= 65536 do
  # Buffer reached 64 KiB: flush it, then start a new buffer with `data`.
  {:ok, conn} = chunk(conn, Enum.reverse(buffer))
  {conn, [data], IO.iodata_length(data)}
end

defp delay_chunk(conn, data, buffer, byte_counter) do
  # Below the threshold: accumulate `data` and keep counting bytes.
  {conn, [data | buffer], byte_counter + IO.iodata_length(data)}
end

You may need to add error handling and a final flush at the end, but you get the idea.
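For example, one way to drive it, equally untested: thread the {conn, buffer, byte_counter} triple through the stream with Enum.reduce/3 (`csv_stream` is a placeholder for the actual data stream) and flush whatever is left at the end:

{conn, buffer, _bytes} =
  Enum.reduce(csv_stream, {conn, [], 0}, fn data, {conn, buffer, bytes} ->
    delay_chunk(conn, data, buffer, bytes)
  end)

# Final flush of whatever is still buffered.
{:ok, conn} = chunk(conn, Enum.reverse(buffer))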


My small contribution to @lud's solution with Stream.intersperse/2:
file
...
|> Stream.intersperse(",")
|> Stream.chunk_every(2)

By applying Stream.chunk_every/2 with 2 as the argument, new chunks are created in the stream, each combining a record with the comma from intersperse.
Proud of my finding! :relaxed:
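
Putting it together, a minimal sketch under those assumptions (`entries` stands in for the stream of already-encoded records, and `conn` has already gone through send_chunked/2). Stream.chunk_every(2) pairs each record with the comma that Stream.intersperse/2 placed after it, halving the number of chunk/2 calls:

entries
|> Stream.intersperse(",")
|> Stream.chunk_every(2)
|> Enum.reduce_while(conn, fn pair, conn ->
  # `pair` is [record, ","], or just [record] for the last element.
  case chunk(conn, pair) do
    {:ok, conn} -> {:cont, conn}
    {:error, _reason} -> {:halt, conn}
  end
end)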


What version of Elixir/Erlang are you running? I recently had an issue with File.stream being very slow, and upgrading Erlang from 24.0 to 24.1 fixed it.


I've just upgraded it to 24.1 (from 22). Same speed for the stream; I don't see any difference. But it's quite fast IMO: 60 MB in 4.5 s (roughly 13 MB/s).
