Has anyone shared their successes or failures in setting up multi-node redundancy for Oban's database?

Looking for some of the funkier oban deployments people might be running in production or have simply tried and failed with.

A hobby side project has grown enough to warrant a serious look at Oban, and I'm debating whether to finally give up some amount of redundancy/uniformity and fall back to a single node, simplifying the deployment to a stock PostgreSQL setup.

For now I've been experimenting with an obviously unsupported deployment of Oban's Dolphin (MySQL) engine on MariaDB+Galera, to fit with the current database cluster before bolting on more parts. It seems to function well for the basics, but it requires a custom fork to add an Ecto type that can deal with the divergence between the JSON implementations of MySQL proper and MariaDB.
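For anyone hitting the same wall: the gist of such a type is that MariaDB hands JSON columns back as plain `LONGTEXT` strings rather than MySQL's binary JSON, so decoding/encoding has to happen at the Ecto type boundary. This is only a minimal sketch, not the fork's actual code; the module name is made up, and it assumes Jason is available for JSON (de)serialization.

```elixir
defmodule MyApp.MariaDBJson do
  # Hypothetical custom Ecto type for MariaDB JSON columns.
  # MariaDB stores JSON as LONGTEXT, so we serialize on dump and
  # decode on load, while the schema field stays a map/list in Elixir.
  @behaviour Ecto.Type

  # Underlying database type: a string column, not native JSON.
  def type, do: :string

  # Casting from application code: accept maps/lists as-is.
  def cast(data) when is_map(data) or is_list(data), do: {:ok, data}
  def cast(_), do: :error

  # Loading from the database: MariaDB returns a JSON-encoded string.
  def load(binary) when is_binary(binary), do: Jason.decode(binary)
  def load(_), do: :error

  # Dumping to the database: serialize to a string for the LONGTEXT column.
  def dump(data) when is_map(data) or is_list(data), do: Jason.encode(data)
  def dump(_), do: :error

  # Remaining Ecto.Type callbacks: keep default embedding/equality semantics.
  def embed_as(_format), do: :self
  def equal?(a, b), do: a == b
end
```

In a schema this would be used as `field :meta, MyApp.MariaDBJson` instead of `field :meta, :map`.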

Which leaves me wondering how others have dealt with this and what options have been evaluated already but left unpublished.

In the midst of giving up the vendored fork and setting up an unwieldy etcd+HAProxy+Patroni-managed failover, the option to run Oban on YugabyteDB piqued my interest.

It's only mentioned in the release notes for v2.18.0, but it seems like the easiest way to quell my concerns about the entire thing falling over sideways after the wrong spot node goes down, or about data being lost in the case of a SQLite file per node.

This is my personal opinion of course, but I am very skeptical of all of these “retrofit HA/consensus onto existing DB” approaches. Building a distributed database which is actually correct is extremely difficult and requires careful testing. Really, it requires that you design the system for testing. TBH even the current crop of NewSQL databases weirds me out, though part of that is just that I really don’t feel like the “SQL OLTP RDBMS” (ouch) architecture works very well when naively sharded.

Fly.io had for a long time a "HA Postgres" offering with failover powered by external consensus (it was Stolon), and it seemed like people constantly had their databases taken down by the failover mechanism itself, which is ironic but unfortunate. These things really are hard.

I would say just set up Postgres. You can do backups and point-in-time restore with pgBackRest. If you need stronger durability guarantees you can set up synchronous or asynchronous replication to another node (or nodes), perhaps even in separate DCs. And then just fail over manually.
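To make that concrete: the whole setup boils down to a handful of settings. This is a minimal sketch only; the stanza name, standby name, and paths are placeholders you'd replace with your own.

```ini
# postgresql.conf (primary)
# Ship WAL to the pgBackRest repository for point-in-time restore:
archive_mode = on
archive_command = 'pgbackrest --stanza=main archive-push %p'
# Wait for at least one synchronous standby before acking commits:
synchronous_commit = on
synchronous_standby_names = 'ANY 1 (standby1)'

# /etc/pgbackrest/pgbackrest.conf
[global]
repo1-path=/var/lib/pgbackrest
repo1-retention-full=2

[main]
pg1-path=/var/lib/postgresql/16/main
```

Dropping `synchronous_standby_names` (or using async replication) trades a small window of potential data loss for not blocking commits when the standby is down, which is often the right call for a hobby project.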

If you actually need HA, probably use a database that was actually designed for it. I have no idea what Oban supports there, though. FoundationDB is the best in the business IMO, and is open source and unlikely to be rugpulled (Apple would never bother). CockroachDB for example has already been rugpulled.

Also: Oban is just a job runner so it’s not like migrating to another DB would be a big deal.


Until fabianlindfors/pgfdb (Postgres made distributed using FoundationDB) becomes a bit more realistic, I'll have to stick with the knockoff that has been verified by the creators of the library.

Yugabyte turned out to be trivial to install and cluster, and it ran fine under synthetic job load with a node yanked.
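For anyone wanting to try the same: the application-side wiring is unremarkable, which is the appeal. YSQL speaks the Postgres wire protocol on port 5433, so the stock Postgres adapter works unchanged. A rough sketch, with hypothetical app/module/host names, and one assumption flagged inline:

```elixir
# config/runtime.exs — hypothetical names throughout
config :my_app, MyApp.Repo,
  hostname: "yb-node-1",
  port: 5433,                  # YSQL port; wire-compatible with Postgres
  database: "my_app_prod",
  username: "yugabyte",
  password: System.get_env("DATABASE_PASSWORD"),
  pool_size: 10

config :my_app, Oban,
  repo: MyApp.Repo,
  # Assumption: LISTEN/NOTIFY isn't available on Yugabyte, so swap the
  # default Postgres notifier for the Erlang-distribution-based one.
  notifier: Oban.Notifiers.PG,
  queues: [default: 10]
```

The `Oban.Notifiers.PG` notifier relies on Erlang distribution between your nodes rather than the database, so it sidesteps any divergence in notification support entirely.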


Oh yeah, I definitely wouldn’t go messing with some experimental Postgres FDB wrapper! I’m really not convinced running a database like Postgres on FDB is a good idea. Like I said, I don’t think SQL databases as they exist distribute well. Mainly because you really want to turn the indexes inside out (multitenancy). Some are trying to build multitenant SQLite instead (e.g. Turso), which is an interesting idea, but of course SQLite is not really designed for that either; managing schemas etc becomes annoying.

Anyway, like I said it’s probably best to stick with just Postgres, or one of the NewSQLs like Yugabyte if you aren’t afraid of the rugpull. See also: CockroachDB, TiDB, YDB, Planetscale, Spanner, Aurora DSQL, normal Aurora, Neon Postgres.

Side note: Oban on FDB would be a pretty good match if it were compatible. As a job runner, I assume it doesn't use all that much SQL functionality. @jstimps has been working on an Ecto FDB wrapper, so maybe some day!


I don’t know much about Oban, but after a brief look, I think Oban.Engine and Oban.Notifier are probably doable with current EctoFDB.

However, Oban.Migration appears to rely on Ecto.Migration, which is specific to ecto_sql. Early iterations of EctoFDB attempted to use ecto_sql, but certain SQLisms made it feel like forcing a square peg into a round hole, so that approach was abandoned. Today, EctoFDB implements its own migration system. Whether that is too far outside the bounds of Oban's design to be workable, I don't know. That being said, I would be happy to include any changes necessary to make it work.
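For context, the ecto_sql coupling in question is the versioned migration shim from Oban's docs, which delegates to Ecto.Migration under the hood, roughly like this (migration module name is an example, and the version number is whatever is current for your Oban release):

```elixir
defmodule MyApp.Repo.Migrations.AddObanJobsTable do
  use Ecto.Migration

  # Oban.Migration builds its oban_jobs table and indexes through
  # Ecto.Migration DDL calls, which is exactly the ecto_sql dependency
  # that doesn't map onto EctoFDB's own migration system.
  def up, do: Oban.Migration.up(version: 12)

  def down, do: Oban.Migration.down(version: 1)
end
```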


Until we have something better (or just different) than Ecto, I would not give up on this very easily.

I will be writing the ecto_sql compatibility layer for my upcoming SQLite library (with a ton of features and rock-solid so far; it still sits at 90% complete due to a very limited time/energy budget lately :weary_face:), and we can help each other out on this front, if you like. It would be fresh in both our minds, and I am seriously loving everything I have seen about FoundationDB, so I might be able to help in a few months.
