How to process events in batches with Flow

I have a CSV file in which
a.) first, each row needs to be converted to XML, and
b.) second, the converted XML is sent to the Rails side for a database write operation.

Below is my Flow code for this:

flow = csv_rows
  |> Flow.from_enumerable()
  |> Flow.partition()
  |> Flow.map(&CSV.generate_xml/1)
  |> Flow.map(&CSV.save_to_rails_database/1)
  |> Flow.run()

Everything works fine for a small CSV file, but when the file is very large (say, 20,000 records), the second step (writing to the database on the Rails side) ends up inserting too many records at the same time. Elixir sends too many requests to the Rails side at once, so the database hits its peak limit.

Would it be good to process the events in batches of 50, and would min_demand and max_demand be useful in this case?


Yep. Or you could increase max_demand to something much larger and increase the DB timeout too for larger batches; it depends on how your database is tuned.
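For reference, a rough sketch of what capping demand could look like with the pipeline above (the numbers are placeholders and depend on how the database is tuned):

csv_rows
|> Flow.from_enumerable(max_demand: 50)            # producer hands out at most 50 rows per demand
|> Flow.partition(max_demand: 50, min_demand: 25)  # each partition stage buffers roughly 25..50 rows
|> Flow.map(&CSV.generate_xml/1)
|> Flow.map(&CSV.save_to_rails_database/1)
|> Flow.run()

With those options each partition stage asks for at most 50 rows at a time, which indirectly throttles how many writes hit Rails concurrently.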


also tune :stages I would think… but I haven’t got much Flow experience…
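For example, a sketch of the same pipeline with fewer partition stages (the default follows the number of schedulers), so fewer processes call the Rails side at once:

csv_rows
|> Flow.from_enumerable()
|> Flow.partition(stages: 2)   # at most two stages write to Rails concurrently
|> Flow.map(&CSV.generate_xml/1)
|> Flow.map(&CSV.save_to_rails_database/1)
|> Flow.run()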

btw this one might be relevant (although it uses stream), if possible you should use COPY:
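For what it's worth, a hedged sketch of the COPY pattern with Ecto; it only applies if the Elixir app can talk to Postgres directly rather than going through Rails, and Repo, the table and the columns are placeholders:

Repo.transaction(fn ->
  # Ecto.Adapters.SQL.stream/2 is collectable, so CSV lines can be piped
  # straight into COPY ... FROM STDIN; each element should be a complete
  # CSV line ending with a newline.
  copy =
    Ecto.Adapters.SQL.stream(
      Repo,
      "COPY records (name, value) FROM STDIN WITH (FORMAT csv)"
    )

  Enum.into(csv_lines, copy)
end)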



You would want to look at using Flow.reduce with a trigger_every set to whatever batch size you’d like to write in one go.

That way you can collect a given number of events coming through the flow at given checkpoints.
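A sketch of that approach, assuming batches of 50 and a hypothetical CSV.save_batch_to_rails/1 that posts a whole list of XML documents in one request (Flow.on_trigger/2 is the current API; older Flow versions emitted state with Flow.emit/Flow.map_state instead):

window = Flow.Window.global() |> Flow.Window.trigger_every(50)

csv_rows
|> Flow.from_enumerable()
|> Flow.partition(window: window, stages: 4)
|> Flow.map(&CSV.generate_xml/1)
|> Flow.reduce(fn -> [] end, fn xml, acc -> [xml | acc] end)  # collect XML docs until the trigger fires
|> Flow.on_trigger(fn batch ->
  CSV.save_batch_to_rails(batch)  # hypothetical helper: one request per batch
  {[], []}                        # emit nothing downstream, reset the accumulator
end)
|> Flow.run()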
