I am streaming a very large table from Postgres with Postgrex.stream, converting it to CSV, and sending it to the browser over a socket, where the user downloads it as a file.
I have two issues:
1. If I do this with Enum, I can calculate the checksum and send the data. With a stream I seem to be able to do only one of those: I can calculate the checksum or send the data, but not both, without creating something intermediate (like a file). Is there a way to do both simultaneously? Could I split this into two processes that share the same stream, with one calculating and storing the checksum and the other sending the data to the browser?
2. I need to put headers at the front of the output.
My code:
def stream(channel, type) do
  Repo.transaction(fn ->
    Sql.get_all_by_timestamp()
    |> Stream.map(&(transform_rows(&1)))
    |> Enum.chunk_every(500)
    |> Enum.concat
    |> Broadcaster.Download.send_chunk(channel, type)
  end)
end
My Elixir skills are light, so my apologies if this is an obvious question.
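This might not be a concern for your requirements, but the functions in Enum are eager: each one builds a complete intermediate list before passing it on. In your code, Enum.chunk_every(500) pulls the entire stream into memory before anything is sent, so depending on the size and number of chunks this could use a lot of memory. Stream.chunk_every is the lazy equivalent.
Switching to Stream.transform for the checksum calculation might help, since all the functions in the Stream module are lazy.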
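To make the idea concrete, here is a minimal sketch of doing both jobs in one pass: a running hash is threaded through the chunks while each chunk is sent. It uses Enum.reduce as the final step instead of Stream.transform, because reduce hands back the final accumulator (the checksum); everything before it stays lazy, so only one chunk is in memory at a time. The module name, the `send_fun` callback, and the choice of SHA-256 are assumptions for illustration, not part of your code.

```elixir
defmodule ChecksumSketch do
  # Hypothetical helper. `rows` is any enumerable of CSV lines (iodata),
  # `header` is the header line, and `send_fun` is called once per chunk.
  # Returns the lowercase hex SHA-256 of everything that was sent.
  def send_with_checksum(rows, header, send_fun, chunk_size \\ 500) do
    [header]
    # Lazily put the header in front of the rows.
    |> Stream.concat(rows)
    # Lazy chunking: unlike Enum.chunk_every, this does not read all rows first.
    |> Stream.chunk_every(chunk_size)
    # Single pass: send each chunk and feed the same chunk to the hash.
    |> Enum.reduce(:crypto.hash_init(:sha256), fn chunk, hash ->
      send_fun.(chunk)
      :crypto.hash_update(hash, chunk)
    end)
    |> :crypto.hash_final()
    |> Base.encode16(case: :lower)
  end
end
```

In your function this would presumably be called inside Repo.transaction, with `Sql.get_all_by_timestamp() |> Stream.map(&transform_rows/1)` as `rows` and something like `&Broadcaster.Download.send_chunk(&1, channel, type)` as `send_fun`, assuming send_chunk can be called once per chunk. This also covers the header question, and it avoids the need for a second process.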