Grokking the options for distributed workloads in 2024

I’m weighing my options for handling a workload that I want distributed across a number of nodes and run in parallel. My specific use case is closed-source, but as a thought experiment we can use the ffmpeg example from the FLAME readme. Let’s say I’ve got a workflow that looks something like this:

  • Parse a CSV containing filenames of video files on my API node
  • Stream the filenames to some kind of runner pool to be processed
  • Process each filename on a node from the runner pool
  • When all files have been processed, do something (e.g. set status to complete in a table)

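To make the thought experiment concrete, here’s a minimal FLAME-flavored sketch of those four steps. Everything here is hypothetical: `MyApp.FFMpegRunner` stands in for a `FLAME.Pool` started in your supervision tree, and `process_video/1` / `mark_complete/0` stand in for the real ffmpeg work and status update.

```elixir
defmodule MyApp.VideoPipeline do
  # Sketch only — assumes a pool like
  # {FLAME.Pool, name: MyApp.FFMpegRunner, min: 0, max: 10}
  # is already running in the supervision tree.
  def run(csv_path) do
    csv_path
    |> File.stream!()
    |> Stream.map(&String.trim/1)
    |> Task.async_stream(
      fn filename ->
        # Each call runs on a node from the FLAME runner pool
        FLAME.call(MyApp.FFMpegRunner, fn -> process_video(filename) end)
      end,
      max_concurrency: 10,
      timeout: :timer.minutes(10)
    )
    |> Stream.run()

    # All files have been processed — e.g. set status to complete
    mark_complete()
  end

  defp process_video(_filename), do: :ok
  defp mark_complete, do: :ok
end
```

Note that this on its own gives you elasticity but nothing durable: if the API node dies mid-stream, the work is lost — which is where the Oban angle below comes in.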
I was considering using Horde/libcluster to manage a pool of supervised workers, or possibly running a set of pods configured to just be Oban workers, and chunk the filename stream into Oban jobs. But now that FLAME is on the scene, I’m wondering if this might be an easier way to accomplish what I’m going for. Does anyone have any thoughts?

Oban and FLAME play very nicely with each other. As demonstrated in Chris’s keynote from ElixirConf EU, there are three elements to asynchrony:

  • Asynchronous — Tasks
  • Elastic — FLAME
  • Persistent — Oban

FLAME helps you scale elastically to multiple nodes, but chances are you also want mechanisms for retries, scheduling, backpressure, instrumentation, etc. That’s the part that Oban provides.
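One common way to combine the two, sketched here with hypothetical names (`MyApp.FFMpegRunner` for the FLAME pool, `process_video/1` for the actual work): each file becomes a persistent Oban job, and `perform/1` hands the heavy lifting to FLAME.

```elixir
defmodule MyApp.ProcessVideo do
  # Oban provides retries, scheduling, and instrumentation;
  # FLAME runs the actual work on an elastic runner node.
  use Oban.Worker, queue: :media, max_attempts: 3

  @impl Oban.Worker
  def perform(%Oban.Job{args: %{"filename" => filename}}) do
    FLAME.call(MyApp.FFMpegRunner, fn -> process_video(filename) end)
  end

  defp process_video(_filename), do: :ok
end
```

Enqueuing from the parsed CSV is then just `filenames |> Enum.map(&MyApp.ProcessVideo.new(%{filename: &1})) |> Oban.insert_all()`, and a failed node’s jobs are simply retried elsewhere.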


If you just need to manually launch a task across several nodes, take a look at rpc :slight_smile:
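For the simplest case, something like this works — a hedged sketch using OTP’s `:erpc.multicall/4`, where `MyApp.Worker.process/1` is a hypothetical function assumed to be loaded on every connected node:

```elixir
# Fan one call out to all connected nodes; each entry in the result
# list is {:ok, value} (or an error tuple) per node.
nodes = Node.list()
results = :erpc.multicall(nodes, MyApp.Worker, :process, ["video.mp4"])
```

No pools, no queues — but also no retries or backpressure, so it fits one-off fan-outs rather than the full pipeline described above.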
