Production issue with Oban - partitioning performance/prioritization?

Hi!

Let’s say I have a queue that is configured like this:
my_queue: [local_limit: 50, global_limit: [allowed: 1, partition: [:worker, args: :my_id]]]
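For context, a queue entry like this normally sits inside the application's Oban configuration. The fragment below is only a sketch: `:my_app` and `MyApp.Repo` are placeholder names, and the queue options are copied verbatim from the line above. Global limits and partitioning are Oban Pro features, so the sketch assumes the Smart engine is configured.

```elixir
# config/config.exs (sketch; :my_app and MyApp.Repo are placeholder names)
import Config

config :my_app, Oban,
  repo: MyApp.Repo,
  # Assumption: the Pro Smart engine is in use, since global limits need it
  engine: Oban.Pro.Engines.Smart,
  queues: [
    my_queue: [
      # At most 50 jobs run concurrently on each node
      local_limit: 50,
      # At most 1 job runs cluster-wide per partition (worker + my_id arg)
      global_limit: [allowed: 1, partition: [:worker, args: :my_id]]
    ]
  ]
```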

Then I receive an imbalanced load between workers:

  • 20 jobs/s for worker1
  • 1 job/s for worker2

Our understanding was that the worker1 load would not impact worker2 processing, and that only worker1 jobs would queue up.

But in production we had an issue where worker2 jobs were not picked up because of the worker1 load.

Another detail:
worker1 has a higher priority than worker2.

thanks

Your understanding is correct, with a minor caveat. The partitioning query can only check a limited number of jobs in the queue at a time. That limit is in place to prevent the query from using unbounded memory, and it defaults to 5k jobs (it is configurable).

How many jobs were in the queue for each worker? Was it really 20 and 1? Also, which version of Pro are you using?

So if I have a partition with 2 workers with different priorities and more than 5k jobs, partitioning/priority will not be applied as expected because it only fetches 5k jobs, right?

More than 20k, and yes, we really were at 20 against 1, or even more.

Oban Pro 1.4.11

We will make some changes on our side so we process more jobs in parallel, but that 5k detail is very important to know in case of an outage, or when our queues fill up for some unexpected reason.

Thanks!

That’s correct. It’s an implementation detail you shouldn’t have to be aware of, and definitely something we want to improve.

The limit is purposefully conservative because the engine has no knowledge or control over how large a database is. Feel free to increase the limit to a higher value that’s appropriate for your queue depth and database size.
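To make the effect of that limit concrete, here is a toy model in plain Elixir. It is not Oban Pro's actual query, only a sketch under stated assumptions: the fetch step looks at the first `window` available jobs ordered by priority and then id, and picks at most `allowed` jobs per partition from that window. The module name `WindowSketch`, its functions, and the job maps are all made up for illustration, and the partition is simplified to the worker alone.

```elixir
# Toy model only: NOT Oban Pro's real query or API.
defmodule WindowSketch do
  # Build a backlog of `n1` Worker1 jobs and `n2` Worker2 jobs.
  # A lower number means a higher priority, as in Oban.
  def backlog(n1, n2) do
    w1 = for id <- 1..n1, do: %{id: id, worker: "Worker1", priority: 0}
    w2 = for id <- (n1 + 1)..(n1 + n2), do: %{id: id, worker: "Worker2", priority: 1}
    w1 ++ w2
  end

  # Return the workers that get at least one job picked from the window.
  def picked_workers(jobs, window, allowed \\ 1) do
    jobs
    |> Enum.sort_by(&{&1.priority, &1.id})
    # Only the first `window` jobs are ever examined
    |> Enum.take(window)
    # Simplified partition: one partition per worker
    |> Enum.group_by(& &1.worker)
    |> Enum.flat_map(fn {_worker, group} -> Enum.take(group, allowed) end)
    |> Enum.map(& &1.worker)
    |> Enum.sort()
  end
end

jobs = WindowSketch.backlog(20_000, 1_000)

# A 5_000-job window is filled entirely by higher-priority Worker1 jobs
IO.inspect(WindowSketch.picked_workers(jobs, 5_000))
# prints ["Worker1"]

# A window larger than the whole backlog lets Worker2 be seen as well
IO.inspect(WindowSketch.picked_workers(jobs, 25_000))
# prints ["Worker1", "Worker2"]
```

In this model the lower-priority worker is starved for as long as the higher-priority backlog is larger than the window, which matches the behavior described above and is why raising the limit to cover the expected queue depth helps.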

Thanks a lot for your responses and details!!
