I am working with an external API (external to the Elixir server, at least) and I want to minimize the number of API calls to a particular endpoint. It feels like GenStage is a good fit for this use case, but I've had a little trouble getting everything to line up.
I want to:
- Collect all events every 100ms
- Batch the events into batches of up to 150
- Have three workers running in parallel to send these batched requests to the remote API
These seem to fit into three GenStage stages:

- `Collector` - a GenStage `:producer` that collects events from other parts of the Elixir system in a FIFO queue (based on `:queue`) and sends the events only in `handle_demand/2`
- `Batcher` - a GenStage `:producer_consumer` that asks the `Collector` for 1000 events every 100ms and batches them into batches of 150 (runs in `:manual` mode)
- `RequestSupervisor` - a GenStage `ConsumerSupervisor` that requests the batched events from the `Batcher` and starts workers that call the external API
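For reference, here is a minimal sketch of what I mean by the `Collector` (the module name, the `add/2` entry point, and the buffering details are simplified stand-ins, not my exact code). It buffers incoming events in a `:queue` and only emits them when there is outstanding demand:

```elixir
defmodule Collector do
  use GenStage

  def start_link(opts \\ []) do
    GenStage.start_link(__MODULE__, :ok, opts)
  end

  # Other parts of the system push events in via this call.
  def add(collector, event), do: GenStage.cast(collector, {:add, event})

  @impl true
  def init(:ok), do: {:producer, {:queue.new(), 0}}

  @impl true
  def handle_cast({:add, event}, {queue, pending_demand}) do
    # Buffer the event; dispatch only covers outstanding demand.
    dispatch({:queue.in(event, queue), pending_demand})
  end

  @impl true
  def handle_demand(demand, {queue, pending_demand}) do
    dispatch({queue, pending_demand + demand})
  end

  defp dispatch({queue, demand}) do
    {events, queue, demand} = take(queue, demand, [])
    {:noreply, events, {queue, demand}}
  end

  defp take(queue, 0, acc), do: {Enum.reverse(acc), queue, 0}

  defp take(queue, demand, acc) do
    case :queue.out(queue) do
      {{:value, event}, queue} -> take(queue, demand - 1, [event | acc])
      {:empty, queue} -> {Enum.reverse(acc), queue, demand}
    end
  end
end
```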
The code I have seems to work, but I feel like I may be going against the ethos of GenStage since I'm not really propagating demand all the way up the chain. Specifically, the `Batcher` and the `Collector` both mostly ignore demand. The `Batcher` is set to `:manual` mode and asks for a static 1000 events every 100ms.
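Roughly, the `Batcher` looks like this sketch (again with assumed names; the upstream subscription is set to `:manual` in `handle_subscribe/4`, and a timer drives a fixed `GenStage.ask/2` of 1000 every 100ms):

```elixir
defmodule Batcher do
  use GenStage

  @interval 100
  @ask 1_000
  @batch_size 150

  def start_link(opts \\ []) do
    GenStage.start_link(__MODULE__, :ok, opts)
  end

  @impl true
  def init(:ok) do
    {:producer_consumer, %{producer: nil}, subscribe_to: [Collector]}
  end

  @impl true
  def handle_subscribe(:producer, _opts, from, state) do
    # Take manual control of demand toward the Collector.
    schedule_tick()
    {:manual, %{state | producer: from}}
  end

  def handle_subscribe(:consumer, _opts, _from, state), do: {:automatic, state}

  @impl true
  def handle_info(:tick, state) do
    # Ask for a static 1000 events every 100ms, ignoring downstream demand.
    GenStage.ask(state.producer, @ask)
    schedule_tick()
    {:noreply, [], state}
  end

  @impl true
  def handle_events(events, _from, state) do
    # Emit each batch of up to 150 events as a single downstream event.
    {:noreply, Enum.chunk_every(events, @batch_size), state}
  end

  defp schedule_tick, do: Process.send_after(self(), :tick, @interval)
end
```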
Another issue is that this setup will always incur a penalty of 100ms on each event, even if more than 150 events are added to the `Collector` at once.
Keep in mind, though, that I only expect about 5-10 events every 100ms (maybe even fewer), but I want a good base for future scaling if necessary.
Any thoughts on this architecture? Is there anything that I haven’t considered that I should consider?