Oban jobs stuck at "scheduled" and then "available" if I move them there manually

If you’re using postgrex prior to v0.20, that may be contributing to the problem: there’s a reconnection bug that leaves the notifier stuck in a disconnected state.

This is almost certainly the reason processing got stuck. Here’s the tl;dr to unstick it with a SQL query:

update oban_jobs set meta = meta - 'uniq_key' where state in ('retryable', 'scheduled') and meta ? 'uniq_key';
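
Before running that update, it may be worth checking how many rows it would touch. The following is only a sketch that reuses the same filter as the update above; it assumes the default `oban_jobs` table name with no custom prefix/schema.

```sql
-- Preview: count the jobs the update above would modify, grouped by state.
-- Uses the same WHERE clause as the update, so it is read-only and safe to run.
select state, count(*) as jobs_with_uniq_key
from oban_jobs
where state in ('retryable', 'scheduled')
  and meta ? 'uniq_key'
group by state
order by state;
```

If the counts look plausible, the update can be run as written. Note that removing `uniq_key` means those particular jobs are no longer covered by the uniqueness check.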

This happens because of partial unique states. If you have something like unique: [:available, :executing], and it doesn’t apply to :retryable or :scheduled, then you can end up in a situation where jobs try to transition from scheduled -> available and hit a unique conflict. Postgres raises only one conflict exception at a time, and that’s what the engine tries to use to fix the unique issue. However, with enough conflicts, it gets stuck in a loop and the jobs don’t progress.
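
To see whether you’re in that situation, something like the query below can list waiting jobs whose `uniq_key` matches a job already in one of the unique states. This is a rough diagnostic sketch, not something taken from Oban itself: it assumes `uniq_key` is stored as a plain string in `meta`, and that your unique states are `available` and `executing` as in the example above. Adjust the state lists to match your own configuration.

```sql
-- Diagnostic sketch: find scheduled/retryable jobs that would collide with
-- an available/executing job sharing the same uniq_key.
-- Assumption: unique states are 'available' and 'executing' (edit as needed).
select
  waiting.id                 as waiting_id,
  waiting.state              as waiting_state,
  blocking.id                as blocking_id,
  blocking.state             as blocking_state,
  waiting.meta->>'uniq_key'  as uniq_key
from oban_jobs as waiting
join oban_jobs as blocking
  on blocking.meta->>'uniq_key' = waiting.meta->>'uniq_key'
 and blocking.id <> waiting.id
where waiting.state in ('scheduled', 'retryable')
  and waiting.meta ? 'uniq_key'
  and blocking.state in ('available', 'executing')
order by uniq_key, waiting.id;
```

A large result set here would be consistent with the loop described above, since each of those waiting jobs can trigger a conflict when it is staged.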

That’s why the next Oban release has unique “groups”, rather than encouraging people to use individual states: oban/guides/learning/unique_jobs.md at main · oban-bg/oban · GitHub

It shows up after Pro v1.5 because Pro now uses unique indexes, which actually enforce uniqueness all the time. OSS and older Pro versions used a combination of queries during insert. That wasn’t transactionally safe, and it made it easy to write broken state combinations.
