This has been going on for a while now. I think it’s best described with an example:
We have an Oban queue “postgresql” with 100K jobs available.
We deploy to production, the nodes start, and the queues start processing the jobs.
Then the queues start to disappear (no errors in the logs) and processing slows down. It keeps going, but the queues appear and disappear on the nodes seemingly at random, and at any given time usually only about 1/3 of them are running.
This only happens with a few queues, usually the ones that have a larger number of jobs waiting.
Anybody ever had this problem?