Grokking the options for distributed workloads in 2024

I’m weighing my options for handling a workload that I want distributed across a number of nodes and run in parallel. My specific use case is closed-source, but as a thought experiment we can use the ffmpeg example from the FLAME readme. Let’s say I’ve got a workflow that looks something like this:

  • Parse a CSV containing filenames of video files on my API node
  • Stream the filenames to some kind of runner pool to be processed
  • Process each filename on a node from the runner pool
  • When all files have been processed, do something (e.g. set status to complete in a table)

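To make the thought experiment concrete, here’s a minimal FLAME-flavored sketch of those four steps. Everything here is hypothetical: `MyApp.FFMpegRunner` stands in for a `FLAME.Pool` started in your supervision tree, and `process_video/1` / `mark_complete/0` stand in for the real ffmpeg work and status update.

```elixir
defmodule MyApp.VideoPipeline do
  # Sketch only — assumes a pool like
  # {FLAME.Pool, name: MyApp.FFMpegRunner, min: 0, max: 10}
  # is already running in the supervision tree.
  def run(csv_path) do
    csv_path
    |> File.stream!()
    |> Stream.map(&String.trim/1)
    |> Task.async_stream(
      fn filename ->
        # Each call runs on a node from the FLAME runner pool
        FLAME.call(MyApp.FFMpegRunner, fn -> process_video(filename) end)
      end,
      max_concurrency: 10,
      timeout: :timer.minutes(10)
    )
    |> Stream.run()

    # All files have been processed — e.g. set status to complete
    mark_complete()
  end

  defp process_video(_filename), do: :ok
  defp mark_complete, do: :ok
end
```

Note that this on its own gives you elasticity but nothing durable: if the API node dies mid-stream, the work is lost — which is where the Oban angle below comes in.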
I was considering using Horde/libcluster to manage a pool of supervised workers, or possibly running a set of pods configured to just be Oban workers, and chunk the filename stream into Oban jobs. But now that FLAME is on the scene, I’m wondering if this might be an easier way to accomplish what I’m going for. Does anyone have any thoughts?

Oban and FLAME play very nicely with each other. As demonstrated in Chris’s keynote from ElixirConf EU, there are three elements to asynchrony:

  • Asynchronous — Tasks
  • Elastic — FLAME
  • Persistent — Oban

FLAME helps you scale elastically to multiple nodes, but chances are you also want mechanisms for retries, scheduling, backpressure, instrumentation, etc. That’s the part that Oban provides.
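One common way to combine the two, sketched here with hypothetical names (`MyApp.FFMpegRunner` for the FLAME pool, `process_video/1` for the actual work): each file becomes a persistent Oban job, and `perform/1` hands the heavy lifting to FLAME.

```elixir
defmodule MyApp.ProcessVideo do
  # Oban provides retries, scheduling, and instrumentation;
  # FLAME runs the actual work on an elastic runner node.
  use Oban.Worker, queue: :media, max_attempts: 3

  @impl Oban.Worker
  def perform(%Oban.Job{args: %{"filename" => filename}}) do
    FLAME.call(MyApp.FFMpegRunner, fn -> process_video(filename) end)
  end

  defp process_video(_filename), do: :ok
end
```

Enqueuing from the parsed CSV is then just `filenames |> Enum.map(&MyApp.ProcessVideo.new(%{filename: &1})) |> Oban.insert_all()`, and a failed node’s jobs are simply retried elsewhere.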


If you just need to manually launch a task across several nodes, take a look at rpc :slight_smile:
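For the simplest case, something like this works — a hedged sketch using OTP’s `:erpc.multicall/4`, where `MyApp.Worker.process/1` is a hypothetical function assumed to be loaded on every connected node:

```elixir
# Fan one call out to all connected nodes; each entry in the result
# list is {:ok, value} (or an error tuple) per node.
nodes = Node.list()
results = :erpc.multicall(nodes, MyApp.Worker, :process, ["video.mp4"])
```

No pools, no queues — but also no retries or backpressure, so it fits one-off fan-outs rather than the full pipeline described above.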
