[Question summary] How best to track work being performed when this info must be looked up via:
- worker pid (workers send messages to a GenServer manager)
- monitor ref (workers are monitored and work needs to be enqueued again on worker crash)
- struct id (multiple work attempts per struct need to be tracked across worker restarts to prevent infinite loops)
I’m sure this is a common occurrence, so I’d love to know how best to address this (both for my limited concurrency case, and in a more general case with “typical” concurrency)…
Let’s say I have a list of Person structs containing a birth date, and I want to process all of these (e.g. compute their current age). I have a GenServer that monitors and tracks the success/failure of each Person that gets processed.
A new worker process is started to handle each Person that needs to be processed. The GenServer is notified when the processing of a Person starts, as well as when it ends successfully.
The GenServer will Process.monitor the pid of the process working on a Person struct, and if the process crashes, will enqueue the Person again. If a given Person fails to be processed multiple times, we log an error and bail.
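To make the monitoring part concrete, here is a rough sketch of what I have in mind. The module name MonitorSketch, the started/2 and retry_queue/1 functions, and the limit of 3 attempts are all made up for illustration. To keep it short, it treats a normal worker exit as success instead of handling a separate success message.

```elixir
defmodule MonitorSketch do
  use GenServer
  require Logger

  # Hypothetical cap on attempts per Person id.
  @max_attempts 3

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  # Called by a worker when it starts processing `person_id`.
  def started(server, person_id), do: GenServer.call(server, {:started, person_id, self()})

  # Returns the ids waiting to be retried, oldest first.
  def retry_queue(server), do: GenServer.call(server, :retry_queue)

  @impl true
  def init(:ok), do: {:ok, %{refs: %{}, attempts: %{}, retry_queue: []}}

  @impl true
  def handle_call({:started, person_id, pid}, _from, state) do
    ref = Process.monitor(pid)

    state =
      state
      |> put_in([:refs, ref], person_id)
      |> update_in([:attempts, person_id], &((&1 || 0) + 1))

    {:reply, :ok, state}
  end

  def handle_call(:retry_queue, _from, state) do
    {:reply, Enum.reverse(state.retry_queue), state}
  end

  @impl true
  def handle_info({:DOWN, ref, :process, _pid, reason}, state) do
    {person_id, refs} = Map.pop(state.refs, ref)
    state = %{state | refs: refs}

    cond do
      # Unknown ref, or the worker exited normally: nothing to retry.
      is_nil(person_id) or reason == :normal ->
        {:noreply, state}

      # Too many failed attempts: log and bail.
      state.attempts[person_id] >= @max_attempts ->
        Logger.error("giving up on person #{inspect(person_id)}")
        {:noreply, state}

      # Worker crashed: enqueue the Person again.
      true ->
        {:noreply, %{state | retry_queue: [person_id | state.retry_queue]}}
    end
  end
end
```

Even in this stripped-down version the state is already indexed two ways (by monitor ref and by struct id), which is what prompts the question.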
The above setup means that the “processing state” for a given Person struct needs to be accessed by:
- worker pid (so the managing GenServer state can be updated when the worker reports it was successful)
- monitor ref (if a worker crashes, the work needs to be rescheduled)
- struct id (we want to track the number of processing attempts to prevent infinite loops)
What’s the best approach here? Simply use multiple maps (e.g. struct_id => all_info, monitor_ref => struct_id, and worker_pid => struct_id) and make sure to keep them updated in sync?
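For reference, this is roughly how I picture the multiple-maps option: one small module that owns all three maps so they can only be changed together. The Tracker name, its fields, and its functions are hypothetical, a sketch rather than code I'm committed to.

```elixir
defmodule Tracker do
  # by_id:  struct_id   => %{attempts: n, pid: pid, ref: ref}
  # by_ref: monitor_ref => struct_id
  # by_pid: worker_pid  => struct_id
  defstruct by_id: %{}, by_ref: %{}, by_pid: %{}

  # Record that `pid` (monitored via `ref`) started working on `id`,
  # bumping the attempt count for that id.
  def start(%__MODULE__{} = t, id, pid, ref) do
    info =
      t.by_id
      |> Map.get(id, %{attempts: 0})
      |> Map.merge(%{pid: pid, ref: ref})
      |> Map.update!(:attempts, &(&1 + 1))

    %__MODULE__{
      by_id: Map.put(t.by_id, id, info),
      by_ref: Map.put(t.by_ref, ref, id),
      by_pid: Map.put(t.by_pid, pid, id)
    }
  end

  # Both lookups return {:ok, id, info} or :error.
  def fetch_by_pid(%__MODULE__{} = t, pid), do: lookup(t, t.by_pid, pid)
  def fetch_by_ref(%__MODULE__{} = t, ref), do: lookup(t, t.by_ref, ref)

  # Drop the pid/ref entries for `id` but keep its attempt count,
  # so retries are still counted after the worker is gone.
  def stop(%__MODULE__{} = t, id) do
    case Map.fetch(t.by_id, id) do
      {:ok, info} ->
        %__MODULE__{
          by_id: Map.put(t.by_id, id, %{attempts: info.attempts}),
          by_ref: Map.delete(t.by_ref, info[:ref]),
          by_pid: Map.delete(t.by_pid, info[:pid])
        }

      :error ->
        t
    end
  end

  defp lookup(t, index, key) do
    with {:ok, id} <- Map.fetch(index, key),
         {:ok, info} <- Map.fetch(t.by_id, id) do
      {:ok, id, info}
    end
  end
end
```

The manager would call Tracker.start/4 when a worker reports in, and Tracker.stop/2 on success or on a :DOWN message. All the "keep in sync" logic then lives in two functions.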
For what it’s worth, the concurrency is going to be extremely limited (let’s say a dozen concurrent processes) because it involves third-party resources that shouldn’t be overwhelmed. Given that, should ETS be used with the struct id as the main key, and accept that a full table scan will happen when looking up via worker pid or monitor ref?
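The ETS variant I'm picturing would look something like the sketch below. The EtsTracker module and the {id, pid, ref, attempts} tuple layout are my own assumptions. Because the key (the struct id) is left unbound in the match pattern, :ets.match_object/2 has to scan the whole table for the pid and ref lookups.

```elixir
defmodule EtsTracker do
  # Rows are {struct_id, worker_pid, monitor_ref, attempts}, keyed on struct_id.
  def new, do: :ets.new(:work_in_progress, [:set, :private])

  def put(table, id, pid, ref, attempts) do
    :ets.insert(table, {id, pid, ref, attempts})
  end

  # Keyed lookup: no scan needed.
  def by_id(table, id), do: :ets.lookup(table, id)

  # The key is unbound in these patterns, so each call scans the full table.
  def by_pid(table, pid), do: :ets.match_object(table, {:_, pid, :_, :_})
  def by_ref(table, ref), do: :ets.match_object(table, {:_, :_, ref, :_})
end
```

With around a dozen rows the scan cost should be negligible, but I don't know whether that reasoning holds up in the more general case.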
(I’m trying to not muddy the waters too much with specifics, but here’s some more info on what I’ve got going: the structs get processed within a GenStage pipeline with limited concurrency. Each struct gets processed via a GenStage.ConsumerSupervisor. If a struct fails to be processed, I want to re-emit it from the GenStage producer so it can be tried again. If the same struct fails multiple attempts, an error is logged and the struct will not be re-emitted by the producer.)