What is the best way to do stateful background jobs?

There’s a lot that can be said here, but I’ll try to keep it brief.

To start with, we’re gonna set aside short-lived work that can just be handled with Task or synchronously. There are legitimate scenarios where you want a persisted queue and limited concurrency.

There are at least 2 major challenges to overcome, and these challenges are why people still choose Redis. I think there are viable alternatives, but none are nicely encapsulated in a library at the moment.

  1. persistence
  2. consistency

With persistence, when a server goes down and comes back up, how are the job queue and job state rebuilt? GenServers lose their state when they die. An :ets table is gone as soon as the process that owns it dies. Redis (like a database) lets you offload this problem.
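To make the GenServer point concrete, here is a minimal sketch. `NaiveQueue` and `CrashDemo` are hypothetical names I made up for illustration, not part of any library. The queue lives only in process state, so when the supervisor restarts it after a crash, whatever was enqueued is gone.

```elixir
defmodule NaiveQueue do
  # Hypothetical in-memory job queue: the list of jobs is the GenServer state.
  use GenServer

  def start_link(_opts), do: GenServer.start_link(__MODULE__, [], name: __MODULE__)
  def enqueue(job), do: GenServer.call(__MODULE__, {:enqueue, job})
  def pending, do: GenServer.call(__MODULE__, :pending)

  @impl true
  def init(_), do: {:ok, []}

  @impl true
  def handle_call({:enqueue, job}, _from, jobs), do: {:reply, :ok, jobs ++ [job]}
  def handle_call(:pending, _from, jobs), do: {:reply, jobs, jobs}
end

defmodule CrashDemo do
  # Enqueue a job, kill the queue process, and compare the pending jobs
  # before the crash and after the supervisor has restarted it.
  def run do
    {:ok, sup} = Supervisor.start_link([NaiveQueue], strategy: :one_for_one)
    :ok = NaiveQueue.enqueue(:send_email)
    before_crash = NaiveQueue.pending()

    old_pid = Process.whereis(NaiveQueue)
    ref = Process.monitor(old_pid)
    Process.exit(old_pid, :kill)

    # Wait until the old process is confirmed dead.
    receive do
      {:DOWN, ^ref, :process, ^old_pid, _reason} -> :ok
    end

    wait_for_restart(old_pid)
    after_crash = NaiveQueue.pending()
    Supervisor.stop(sup)
    {before_crash, after_crash}
  end

  # Poll until the supervisor has registered a fresh process under the name.
  defp wait_for_restart(old_pid) do
    case Process.whereis(NaiveQueue) do
      pid when is_pid(pid) and pid != old_pid ->
        :ok

      _ ->
        Process.sleep(10)
        wait_for_restart(old_pid)
    end
  end
end
```

Calling `CrashDemo.run()` returns the pending jobs before and after the crash. The restarted process begins from `init/1` again with an empty list, which is exactly the gap an external store fills.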

Consistency is another big challenge. If you’re running N nodes, are you running N queues with N worker pools? Or are you trying to have a job queued on Node 1 possibly end up with a worker on Node 2? How do you handle the myriad pitfalls associated with distributed data? :mnesia gives you a lot of answers, but has problematic netsplit behaviour. There are apparently some libraries in the Erlang world for handling netsplit recovery in an automated way, but both :mnesia and these other tools lack modern, Elixir-oriented documentation.

Redis “solves” this by just not being distributed. It’s simple, and for many it works.

These are hard challenges, and I don’t think any of the existing job libraries offer a compelling alternative. Most simply go the Sidekiq route and lean on Redis, which is a reasonable choice to make.

I think alternatives are possible, and it’s an active area of interest for me, but for now you’re stuck rolling your own or just biting the bullet and using Redis.
