Phoenix JSON API extremely slow when serving CSV files

The stream can be chunked in order to send larger pieces.
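For example, something along these lines in the controller (a rough sketch: `rows/0` and `to_csv_line/1` are hypothetical stand-ins for wherever the data comes from and however a row gets encoded):

def download(conn, _params) do
  conn =
    conn
    |> put_resp_content_type("text/csv")
    |> send_chunked(200)

  rows()
  |> Stream.map(&to_csv_line/1)
  |> Stream.chunk_every(500)
  |> Enum.reduce_while(conn, fn lines, conn ->
    # Each HTTP chunk now carries 500 lines instead of one.
    case chunk(conn, lines) do
      {:ok, conn} -> {:cont, conn}
      {:error, _reason} -> {:halt, conn}
    end
  end)
end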

Yes, I thought about it when I saw the interspersed commas were sent on their own. A quick benchmark could tell how many entries should be buffered before sending each chunk. But our solution is already good enough for @quda, I guess :slight_smile:
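If it helps, a rough benchmark sketch (assuming Benchee is available; it only measures batching and flattening the iodata, not the actual socket writes, so treat the numbers as a first approximation):

rows = Enum.map(1..100_000, fn i -> "#{i},some,csv,fields\n" end)

Benchee.run(%{
  "100 rows per chunk" => fn ->
    rows |> Stream.chunk_every(100) |> Enum.each(&IO.iodata_to_binary/1)
  end,
  "1000 rows per chunk" => fn ->
    rows |> Stream.chunk_every(1000) |> Enum.each(&IO.iodata_to_binary/1)
  end
})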

Thanks @mmmrrr. That's clearer now.

I tried it with Stream.intersperse/2. For very big files it is about 30% slower than the initial solution.

How can I make chunks of, let's say, 50 kB each? Is that possible?


Something like this, untested:

defp delay_chunk(conn, data, buffer, byte_counter) when byte_counter >= 65536 do
  # Buffer reached 64 KiB: flush it, then start a new buffer with `data`.
  {:ok, conn} = chunk(conn, Enum.reverse(buffer))
  {conn, [data], IO.iodata_length(data)}
end

defp delay_chunk(conn, data, buffer, byte_counter) do
  # Below the threshold: accumulate `data` and keep counting bytes.
  {conn, [data | buffer], byte_counter + IO.iodata_length(data)}
end

You may need to add error handling and a final flush at the end, but you get the idea.
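For example, one way to drive it, equally untested: thread the {conn, buffer, byte_counter} triple through the stream with Enum.reduce/3 (`csv_stream` is a placeholder for the actual data stream) and flush whatever is left at the end:

{conn, buffer, _bytes} =
  Enum.reduce(csv_stream, {conn, [], 0}, fn data, {conn, buffer, bytes} ->
    delay_chunk(conn, data, buffer, bytes)
  end)

# Final flush of whatever is still buffered.
{:ok, conn} = chunk(conn, Enum.reverse(buffer))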


My small contribution to @lud's solution with Stream.intersperse/2:
file
...
|> Stream.intersperse(",")
|> Stream.chunk_every(2)

By applying Stream.chunk_every/2 with 2 as the argument, new chunks are created in the stream, each combining a record with the comma from intersperse.
Proud of my finding! :relaxed:
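
Putting it together, a minimal sketch under those assumptions (`entries` stands in for the stream of already-encoded records, and `conn` has already gone through send_chunked/2). Stream.chunk_every(2) pairs each record with the comma that Stream.intersperse/2 placed after it, halving the number of chunk/2 calls:

entries
|> Stream.intersperse(",")
|> Stream.chunk_every(2)
|> Enum.reduce_while(conn, fn pair, conn ->
  # `pair` is [record, ","], or just [record] for the last element.
  case chunk(conn, pair) do
    {:ok, conn} -> {:cont, conn}
    {:error, _reason} -> {:halt, conn}
  end
end)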


What version of Elixir/Erlang are you running? I recently had an issue with File.stream being very slow, and upgrading Erlang from 24.0 to 24.1 fixed it.


I've just upgraded it to 24.1 (from 22). Same speed for the stream; I don't see any difference. But it's quite fast IMO: 60 MB in 4.5 s (roughly 13 MB/s).
