[Question summary] How best to track work being performed, when this info must be looked up by:
- worker pid (workers send messages to a GenServer manager)
- monitor ref (workers are monitored and work needs to be enqueued again on worker crash)
- struct id (multiple work attempts per struct need to be tracked across worker restarts to prevent infinite loops)
I’m sure this is a common occurrence, so I’d love to know how best to address this (both for my limited concurrency case, and in a more general case with “typical” concurrency)…
Let’s say I have a list of `Person` structs, each containing a birth date, and I want to process all of them (e.g. compute their current age). I have a GenServer that monitors and tracks the success/failure of each `Person` that gets processed.
A new worker process is started to handle each `Person` that needs to be processed. The GenServer is notified when the processing of a `Person` starts, as well as when it ends successfully.
The GenServer will `Process.monitor` the pid of the process working on a `Person` struct and, if the process crashes, will enqueue the `Person` again. If a given `Person` fails to be processed multiple times, we log an error and bail.
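To make the monitor-and-retry flow concrete, here is a minimal sketch. It is not the actual manager: the module name `RetrySketch`, the `@max_attempts` limit of 3, and the blocking `receive` are all assumptions made for illustration. In a real GenServer the `:DOWN` message would arrive in `handle_info/2` instead, and `spawn_monitor/1` stands in for starting a worker and then calling `Process.monitor/1` on it.

```elixir
defmodule RetrySketch do
  require Logger

  # Hypothetical limit; the real threshold is not specified in the question.
  @max_attempts 3

  # Runs `fun` in a monitored process and retries it when the process crashes.
  # Returns {:ok, attempts_used} or {:error, :max_attempts}.
  def process(person_id, fun, attempt \\ 1) do
    # spawn_monitor/1 spawns the worker and monitors it in one step.
    {pid, ref} = spawn_monitor(fun)

    receive do
      # A normal exit means the work finished successfully.
      {:DOWN, ^ref, :process, ^pid, :normal} ->
        {:ok, attempt}

      # Abnormal exit with attempts left: "enqueue" the work again.
      {:DOWN, ^ref, :process, ^pid, _reason} when attempt < @max_attempts ->
        process(person_id, fun, attempt + 1)

      # Abnormal exit and no attempts left: log and bail.
      {:DOWN, ^ref, :process, ^pid, reason} ->
        Logger.error(
          "giving up on #{inspect(person_id)} after #{attempt} attempts: #{inspect(reason)}"
        )

        {:error, :max_attempts}
    end
  end
end
```

The attempt counter here lives in the function argument; in the setup described above it has to live in the manager's state, keyed by struct id, which is what motivates the lookup question.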
The above setup means that the “processing state” for a given `Person` struct needs to be accessed by:
- worker pid (so the managing GenServer state can be updated when the worker reports it was successful)
- monitor ref (if a worker crashes, the work needs to be rescheduled)
- struct id (we want to track the number of processing attempts to prevent infinite loops)
What’s the best approach here? Simply use multiple maps (e.g. `struct_id => all_info`, `monitor_ref => struct_id`, and `worker_pid => struct_id`) and make sure to keep them updated in sync?
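As a rough sketch of what the multiple-maps option might look like, the three maps can be wrapped in one struct so that every update goes through a single function and the indexes cannot drift apart. The module name `WorkIndex`, its field names, and the function names are hypothetical, chosen only for this example.

```elixir
defmodule WorkIndex do
  # Hypothetical helper: one primary map plus two reverse indexes.
  #   by_id:  struct_id   => %{pid: pid, ref: ref, attempts: n}
  #   by_ref: monitor_ref => struct_id
  #   by_pid: worker_pid  => struct_id
  defstruct by_id: %{}, by_ref: %{}, by_pid: %{}

  # Record a new attempt for `id`, run by `pid` and monitored via `ref`.
  def track(%__MODULE__{} = idx, id, pid, ref) do
    attempts = get_in(idx.by_id, [id, :attempts]) || 0
    info = %{pid: pid, ref: ref, attempts: attempts + 1}

    %{
      idx
      | by_id: Map.put(idx.by_id, id, info),
        by_ref: Map.put(idx.by_ref, ref, id),
        by_pid: Map.put(idx.by_pid, pid, id)
    }
  end

  # Worker reported success: remove the item from all three maps.
  def succeed(%__MODULE__{} = idx, pid) do
    case Map.pop(idx.by_pid, pid) do
      {nil, _} ->
        {:error, :unknown_pid}

      {id, by_pid} ->
        {info, by_id} = Map.pop(idx.by_id, id)
        by_ref = Map.delete(idx.by_ref, info.ref)
        {:ok, id, %{idx | by_id: by_id, by_pid: by_pid, by_ref: by_ref}}
    end
  end

  # Worker crashed: clear the pid/ref indexes but keep the attempt count.
  def crash(%__MODULE__{} = idx, ref) do
    case Map.pop(idx.by_ref, ref) do
      {nil, _} ->
        {:error, :unknown_ref}

      {id, by_ref} ->
        info = Map.fetch!(idx.by_id, id)

        idx = %{
          idx
          | by_ref: by_ref,
            by_pid: Map.delete(idx.by_pid, info.pid),
            by_id: Map.put(idx.by_id, id, %{info | pid: nil, ref: nil})
        }

        {:ok, id, info.attempts, idx}
    end
  end
end
```

With this shape, the manager would call `WorkIndex.crash/2` from its `:DOWN` handler and compare the returned attempt count against its limit before re-enqueueing.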
For what it’s worth, the concurrency is going to be extremely limited (let’s say a dozen concurrent processes) because it involves third-party resources that shouldn’t be overwhelmed. Given that, should I use ETS with the struct id as the main key, and accept that a full table scan will happen when looking up by worker pid or monitor ref?
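For comparison, the ETS variant could be sketched roughly as follows. The row layout `{struct_id, worker_pid, monitor_ref, attempts}`, the table name, and the module name `EtsTracking` are assumptions for illustration. Lookups by struct id use the key directly, while lookups by pid or ref use `:ets.match_object/2` with the key left unbound, which scans the whole table.

```elixir
defmodule EtsTracking do
  # Assumed row layout: {struct_id, worker_pid, monitor_ref, attempts}

  # A private set table keyed on the first tuple element (the struct id).
  def new, do: :ets.new(:work_tracking, [:set, :private])

  # Insert or overwrite the row for `id`.
  def put(table, id, pid, ref, attempts) do
    :ets.insert(table, {id, pid, ref, attempts})
  end

  # Keyed lookup: no scan needed.
  def by_id(table, id), do: :ets.lookup(table, id)

  # The key is unbound in these patterns, so ETS walks every row.
  def by_pid(table, pid), do: :ets.match_object(table, {:_, pid, :_, :_})
  def by_ref(table, ref), do: :ets.match_object(table, {:_, :_, ref, :_})
end
```

With around a dozen rows the scan cost should be negligible, so the trade-off is mostly between this simplicity and the extra bookkeeping of separate indexes.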
(I’m trying not to muddy the waters too much with specifics, but here’s some more info on what I’ve got going: the structs get processed within a GenStage pipeline with limited concurrency. Each struct gets processed via a `GenStage.ConsumerSupervisor`. If a struct fails to be processed, I want to re-emit it from the GenStage producer so it can be tried again. If the same struct fails multiple attempts, an error is logged and the struct will not be re-emitted by the producer.)