I’m weighing my options for handling a workload that I want distributed across a number of nodes and run in parallel. My specific use case is closed-source, but as a thought experiment we can use the ffmpeg example from the FLAME readme. Let’s say I’ve got a workflow that looks something like this:
- Parse a csv containing filenames of video files on my API node
- Stream the filenames to some kind of runner pool to be processed
- Process each filename on a node from the runner pool
- When all files have been processed, do something (e.g. set a `status` column to `complete` in a table)
I was considering using Horde/libcluster to manage a pool of supervised workers, or possibly running a set of pods configured to just be Oban workers and chunking the filename stream into Oban jobs. But now that FLAME is on the scene, I’m wondering if it might be an easier way to accomplish what I’m going for. Does anyone have any thoughts?
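For concreteness, here’s a rough sketch of what I imagine the FLAME version looking like, using `FLAME.call/2` against a `FLAME.Pool`. The pool name `MyApp.FFMpegRunner`, the helpers `process_video/1` and `mark_batch_complete/1`, and all the pool numbers are just placeholders I made up for the example:

```elixir
# In my application's supervision tree (application.ex): a FLAME pool that
# scales from zero up to 10 ephemeral runners (numbers are arbitrary).
children = [
  {FLAME.Pool,
   name: MyApp.FFMpegRunner,
   min: 0,
   max: 10,
   max_concurrency: 5,
   idle_shutdown_after: 30_000}
]

# On the API node: stream filenames out of the CSV, fan each one out to the
# pool with FLAME.call/2, then flip the status once everything has finished.
# MyApp.Media.process_video/1 and mark_batch_complete/1 are made-up helpers.
"videos.csv"
|> File.stream!()
|> Stream.map(&String.trim/1)
|> Task.async_stream(
  fn filename ->
    FLAME.call(MyApp.FFMpegRunner, fn ->
      MyApp.Media.process_video(filename)
    end)
  end,
  max_concurrency: 50,
  timeout: :infinity
)
|> Stream.run()

MyApp.Media.mark_batch_complete("videos.csv")
```

As far as I understand, the pool scales back to zero when idle, so unlike the Horde or Oban-pod setups I wouldn’t need to keep a standing worker cluster around between batches. Is that the right mental model?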