Separating Consumers from Producers in different BEAM instances

Overview

For security compliance reasons, I need to ensure a complete separation between some of the consumers and the producers that submit jobs to the queue. It’s sufficient to have the jobs go to a common PostgreSQL instance, but the Erlang VMs must be separate, to provide a strong security boundary.

In the past I’ve used RabbitMQ very successfully, but we don’t have a message broker yet in this project, and Oban’s workflows will be very useful for other aspects of the project.

Limitations

This obviously rules out Erlang distribution, and also PgBouncer, but that’s still enough to run Oban successfully using PostgreSQL LISTEN/NOTIFY. I asked this question on Slack and got enough of a feel to implement it; so far it seems to work well enough.

Questions

  • have I missed anything obvious from the proposed solution?
  • has anybody been down this path, and has some war stories to share?
  • are there any other patterns to deal with this situation?
  • what’s the best way to notify the producer that the work has completed? Note there’s no PubSub available between the nodes. I found Postgrex.Notifications and Postgrex PubSub, as well as Oban.Notifier

Current Solution

  • a front end Phoenix app, with migrations, and Oban job submission

  • a backend Elixir app, without migrations, but sharing Repo & Oban config

  • a common Oban.Worker library module, used in both apps

  • the primary has no queues defined, via queues: []

  • the secondary has the necessary queues defined, via queues: [default: 10]

  • for notifications, the built-in Oban.Notifier is the better choice, and it works over the same underlying mechanism
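The split above can be sketched in config. This is a minimal illustration, assuming an app named :my_app with a shared MyApp.Repo; the names are placeholders, not from the original posts:

```elixir
# config/runtime.exs on the producer (Phoenix) node:
# no queues, so this instance only inserts jobs.
config :my_app, Oban,
  repo: MyApp.Repo,
  queues: []

# config/runtime.exs on the consumer node:
# the same repo and Oban config, plus the actual queues.
config :my_app, Oban,
  repo: MyApp.Repo,
  queues: [default: 10]
```

Both nodes point at the same database, so the producer’s inserts are picked up by the consumer’s queues without any Erlang distribution between them.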


Nothing obvious to me. It’s fairly common to split up Oban between worker and non-worker (i.e. web) instances. In that case the worker code is usually shared between the applications, but that’s not a requirement.

Queue producers don’t need to know when work is completed, just that jobs are available. That’s one of the many places where PubSub notifications are used for coordination between nodes. The others would be starting, stopping, pausing, resuming, and scaling queues; not to mention sharing metrics for Oban Web.

The default notifier is Postgres based and there’s no need to bring in a separate Postgres notifier. That will work perfectly fine as long as the nodes running jobs share a database, and there isn’t a connection pooler.
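To answer the completion-notification question with the built-in notifier: Oban.Notifier supports custom channels on top of the same Postgres LISTEN/NOTIFY machinery. A hedged sketch, where the channel name :jobs_finished and the worker module are illustrative, not from the original posts:

```elixir
# On the consumer node, a worker broadcasts completion from perform/1.
defmodule MyApp.Workers.Importer do
  use Oban.Worker, queue: :default

  @impl Oban.Worker
  def perform(%Oban.Job{id: id, args: _args}) do
    # ... do the actual work ...

    # Payloads round-trip through JSON, so string keys are safest.
    Oban.Notifier.notify(:jobs_finished, %{"id" => id})
    :ok
  end
end

# On the producer node, any process subscribes to the channel:
:ok = Oban.Notifier.listen([:jobs_finished])

# That process then receives messages shaped like:
# {:notification, :jobs_finished, %{"id" => 123}}
```

Because both nodes share the database, the notification crosses the security boundary without any direct connection between the VMs.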

Your current solution is spot on. This is something people ask about frequently and should probably be in an official guide.


Thanks! So far it seems to work very well. Should we expect a threshold of job volumes, after which performance begins to degrade?

Maikel suggested we can actually keep everything in a single Elixir app / codebase / repo, and just modify runtime.exs to selectively start the appropriate supervisors in application.ex. I’d originally envisioned separate Apps in the same git repo, but this is simpler.
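A minimal sketch of that single-codebase approach, assuming a ROLE environment variable; the variable name and child layout are my own illustration, not Maikel’s exact suggestion:

```elixir
# config/runtime.exs — choose queues based on the node's role.
role = System.get_env("ROLE", "producer")

config :my_app, :role, String.to_atom(role)

config :my_app, Oban,
  repo: MyApp.Repo,
  queues: if(role == "consumer", do: [default: 10], else: [])

# lib/my_app/application.ex — start the web endpoint only on producer nodes.
defmodule MyApp.Application do
  use Application

  @impl true
  def start(_type, _args) do
    children =
      [
        MyApp.Repo,
        {Oban, Application.fetch_env!(:my_app, Oban)}
      ] ++ role_children(Application.get_env(:my_app, :role))

    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end

  defp role_children(:producer), do: [MyAppWeb.Endpoint]
  defp role_children(_role), do: []
end
```

One release, two roles: the same artifact is deployed to both environments, and only the runtime configuration differs.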

For splitting the application, there’s no difference at all. Performance is mostly limited by the power of the database and your usage patterns.

The Postgres notifier is similarly limited. It requires additional transactions to emit events, and it can’t handle large payloads for metrics. Disabling insert triggers cuts down on the number of notifications drastically though.
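Disabling insert triggers is a config flag in recent Oban versions (insert_trigger was added around v2.15, if I recall correctly; verify against your version’s docs). With it off, queues fall back to the staging poll to discover new jobs, trading a little latency for far fewer notifications:

```elixir
# config/runtime.exs — skip the NOTIFY emitted on every insert;
# queues then rely on Oban's periodic staging to find new jobs.
config :my_app, Oban,
  repo: MyApp.Repo,
  insert_trigger: false,
  queues: [default: 10]
```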
