Best approach to make expected errors not consume retries in Oban

sezaru · June 4, 2022, 3:41pm

Imagine the following scenario: I have a queue whose jobs are each specific to one user of the system. Each job has to access the DB and make some changes there.

These jobs are critical, meaning they should always execute successfully unless there is a truly catastrophic error, since a failed job leaves that user in an invalid state. This means that common errors (e.g. the DB being temporarily unavailable) should not count as a retry attempt, and the job should keep retrying until it succeeds.

The question is, what is the best approach to implement this with Oban?

As an example, let’s say the DB is temporarily unavailable and I consider this a “common error”. In this case, if my job raises a DBConnection.ConnectionError, I want the job to keep retrying even if the max number of retries has been reached.

al2o3cr · June 4, 2022, 5:20pm

You could catch the “common errors” and return a snooze tuple, {:snooze, seconds}, instead of an error:

https://hexdocs.pm/oban/Oban.Worker.html#module-snoozing-jobs
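
A minimal sketch of that idea, assuming a project with Oban and db_connection as dependencies. The worker name, queue name, 30-second delay, and the `MyApp.Accounts` stub (including its `"simulate_outage"` flag) are all hypothetical and only stand in for the real DB work:

```elixir
defmodule MyApp.Accounts do
  # Hypothetical stand-in for the real DB work. It raises the "common error"
  # on request so both code paths can be exercised without a database.
  def apply_changes!(%{"simulate_outage" => true}) do
    raise DBConnection.ConnectionError, "connection not available"
  end

  def apply_changes!(_args), do: :ok
end

defmodule MyApp.CriticalUserWorker do
  # Hypothetical worker; queue name and max_attempts are illustrative.
  use Oban.Worker, queue: :critical_users, max_attempts: 5

  # Seconds to wait before the job runs again after an expected error.
  @snooze_seconds 30

  @impl Oban.Worker
  def perform(%Oban.Job{args: args}) do
    MyApp.Accounts.apply_changes!(args)
    :ok
  rescue
    # Expected, transient failure: reschedule the job instead of failing it.
    DBConnection.ConnectionError -> {:snooze, @snooze_seconds}
  end
end
```

Only the listed exception is rescued, so anything else still propagates and is handled by Oban as a normal failed attempt.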

sezaru · June 4, 2022, 5:42pm

Yeah, I read about the snooze feature, but I thought I couldn’t use it because of this part of the documentation:

Snoozing does not change the number of retries remaining on the job, but it does increment the attempt number each time the job snoozes, which will affect the default backoff exponential retry algorithm. In the example below the backoff/1 callback compensates for snoozing:
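
For context, the example that passage introduces is roughly the sketch below. It assumes Oban is a dependency, the `"ready"` argument is a hypothetical placeholder condition, and the exact code in the Oban docs may differ between versions:

```elixir
defmodule MyApp.SnoozingWorker do
  @max_attempts 20

  use Oban.Worker, max_attempts: @max_attempts

  @impl Oban.Worker
  def backoff(%Oban.Job{} = job) do
    # The correction assumes every snoozed run raises attempt and max_attempts
    # by one each, so their difference (retries remaining) is unchanged.
    # Subtracting it from the configured limit recovers the attempt number
    # the job would have had without any snoozes.
    corrected_attempt = @max_attempts - (job.max_attempts - job.attempt)

    # Delegate to the default backoff with the corrected attempt.
    Oban.Worker.backoff(%{job | attempt: corrected_attempt})
  end

  @impl Oban.Worker
  def perform(%Oban.Job{args: args}) do
    # "ready" is a hypothetical flag standing in for a real readiness check.
    if args["ready"], do: :ok, else: {:snooze, 60}
  end
end
```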

Maybe I misread it, but I thought this meant the attempts would still be counted, and the job would stop retrying once the max was reached even if I return :snooze.

I will try it out and see if it works for my case, thanks!

lud · June 5, 2022, 7:44pm

Snooze will bump the number of attempts but will also bump the max attempts limit, so you should be good.
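
The arithmetic can be checked with a toy model. This is only a sketch of the bookkeeping as described above, not Oban's actual code; the `SnoozeModel` module and the plain map standing in for a job are made up for illustration:

```elixir
defmodule SnoozeModel do
  # Toy model of the two counters; not Oban's implementation.

  # Executing the job bumps the attempt counter.
  def start(job), do: %{job | attempt: job.attempt + 1}

  # Snoozing, as described above, also raises the limit by one.
  def snooze(job), do: %{job | max_attempts: job.max_attempts + 1}

  # Attempts still available before the job would be discarded.
  def remaining(job), do: job.max_attempts - job.attempt
end

job = %{attempt: 0, max_attempts: 3}

# Three executions that each end in a snooze.
snoozed =
  Enum.reduce(1..3, job, fn _, acc ->
    acc |> SnoozeModel.start() |> SnoozeModel.snooze()
  end)

IO.inspect(snoozed)
# prints %{attempt: 3, max_attempts: 6}

IO.inspect(SnoozeModel.remaining(snoozed))
# prints 3, the same as before any snooze
```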