I have a single-file data dump (one JSON object per line) that I need to split into multiple files based on a key in each object.
I’ve written it in Python, where I just open file handles dynamically and store them in a dict as I process the file. After iterating over the whole file, I iterate over the dict of handles and write/close them.
I’m trying to do this in Elixir and I’m not sure how to go about it. I was thinking of storing the collected events in a map, but I can’t see how, since there is no mutable state to update as I go. Doing a file append for every line works, but it is very inefficient and slow.
Here’s my file append code:
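stream = File.stream!(filename, [:read, :compressed], :line)

stream
|> Stream.each(fn line ->
  # Decode the JSON line to find out which output file it belongs to
  event = Poison.decode!(line)
  # File.write/3 opens, appends to, and closes the target file on every line
  File.write(Map.get(event, "eventType"), line, [:append])
end)
|> Stream.run()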
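
For comparison, the handle-per-key approach from the Python version could probably be sketched in Elixir by threading the map of handles through `Enum.reduce/3` as the accumulator, so nothing needs to be mutated. This is only a rough sketch under some assumptions: the module name `EventSplitter`, the `split/3` function, and its `out_dir` and `key_fun` parameters are all hypothetical names made up for illustration, and the `:delayed_write` mode is my guess at what would cut down on per-line writes, not something I have measured.

```elixir
defmodule EventSplitter do
  # Hypothetical helper, not from the original post.
  # lines:   any enumerable of newline-terminated lines, e.g. a File.stream!
  # out_dir: directory that receives one file per key (assumed to exist)
  # key_fun: maps a line to the name of its output file
  def split(lines, out_dir, key_fun) do
    handles =
      Enum.reduce(lines, %{}, fn line, handles ->
        key = key_fun.(line)

        # Open each output file once; later lines reuse the cached handle.
        handles =
          Map.put_new_lazy(handles, key, fn ->
            File.open!(Path.join(out_dir, key), [:append, :delayed_write])
          end)

        IO.binwrite(Map.fetch!(handles, key), line)

        # Return the (possibly extended) map as the next accumulator.
        handles
      end)

    # Closing flushes anything still buffered by :delayed_write.
    Enum.each(handles, fn {_key, io} -> File.close(io) end)
    Map.keys(handles)
  end
end
```

With the code from the question, this would presumably be called by piping `File.stream!(filename, [:read, :compressed])` into `EventSplitter.split(".", fn line -> line |> Poison.decode!() |> Map.get("eventType") end)`. The sketch does not close the handles if decoding raises partway through, so real code would likely want a `try`/`after` around the reduce.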