Running Oban, I’d like to be able to perform some actions whenever the number of attempts is exhausted and the job transitions to the “discarded” state. So far I have come up with a workaround in the worker:
increment the desired max_attempts value by one
implement perform() twice: first with a guard for attempt < @max_attempts, then without any guard but with an unconditional {:error, :max_attempts_reached} return.
This way the number of actual attempts remains as before. All “real” attempts are handled by the first perform(), while the additional attempt is handled by the second perform(), which does what I want before returning the final, unconditional error tuple. In other words, this kind of works, but it surely has some drawbacks and smells a bit to me. I also find it unlikely that I’m the only one who needs to act on this important state transition, so someone has most probably found a better way to do it. Am I right? Any suggestions?
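For reference, the workaround described above could look roughly like the sketch below. The module name, `do_work/1`, `on_exhausted/1`, and the attempt counts are hypothetical placeholders, not code from an actual project.

```elixir
defmodule MyApp.ExampleWorker do
  # 3 "real" attempts plus 1 extra attempt reserved for the final-failure action.
  use Oban.Worker, queue: :default, max_attempts: 4

  require Logger

  # Number of attempts that should actually try to do the work (assumed value).
  @real_attempts 3

  @impl Oban.Worker
  def perform(%Oban.Job{attempt: attempt} = job) when attempt <= @real_attempts do
    do_work(job)
  end

  # Only reached on the extra attempt: run the final action, then fail for good.
  def perform(%Oban.Job{} = job) do
    on_exhausted(job)
    {:error, :max_attempts_reached}
  end

  # Hypothetical placeholder for the real work.
  defp do_work(%Oban.Job{}), do: :ok

  # Hypothetical placeholder for the "attempts exhausted" action.
  defp on_exhausted(%Oban.Job{id: id}) do
    Logger.error("job #{inspect(id)} exhausted its attempts")
  end
end
```

One drawback of this shape is that the extra attempt is scheduled like any other retry, so the final action only runs after the usual backoff delay following the last real failure.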
Respond to specific events like failures or cancellations
So there actually is something like what I need. Although it talks about “batches”, I guess a batch may also consist of a single job: implement batch_discarded/1 and voilà!
Thank you. From what I see, these are also available only in Pro, so I take it the OSS version doesn’t have anything that could be used instead of my half-baked quasi-solution.
Depending on the action you’re taking, you could also use the telemetry hooks to do something similar. I would only do this, though, if the action is along the lines of “log a special error message”.
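A minimal sketch of the telemetry approach, assuming a handler attached to Oban’s `[:oban, :job, :exception]` event. The module name and handler id are made up for illustration, and the check compares `attempt` with `max_attempts` on the job from the event metadata rather than relying on any particular state value.

```elixir
defmodule MyApp.ExhaustedJobLogger do
  require Logger

  # Hypothetical handler id; any unique term works.
  @handler_id "my-app-exhausted-job-logger"

  # Call once at application start, e.g. from Application.start/2.
  def attach do
    :telemetry.attach(
      @handler_id,
      [:oban, :job, :exception],
      &__MODULE__.handle_event/4,
      nil
    )
  end

  # Logs only when the failing attempt was the last one allowed.
  def handle_event([:oban, :job, :exception], _measurements, %{job: job}, _config) do
    if exhausted?(job) do
      Logger.error(
        "Oban job #{job.id} (#{job.worker}) failed on its final attempt " <>
          "(#{job.attempt}/#{job.max_attempts})"
      )
    end

    :ok
  end

  # True when no attempts are left for this job.
  def exhausted?(%{attempt: attempt, max_attempts: max_attempts}) do
    attempt >= max_attempts
  end
end
```

Telemetry handlers run inline with the code that emits the event, which is one reason to keep them to cheap side effects such as logging.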
Overall though I don’t think your solution is particularly bad.
@sorenone, this is tangential (happy to open another topic if you’d like).
We sometimes get jobs that are left in a moot state, where:
the state is discarded
the errors field is empty
max_attempts == attempt (2 == 2 in our case; we bumped max_attempts to 2 after having it at 1 for a while)
the worker hook (after_process/3) doesn’t seem to fire (we are still diagnosing what could trigger this; we suspect the container gets killed before the hook is reached)
I was wondering if you have seen this before in the wild?
There is also an option in Pro’s DynamicLifeline that will rope them back to the runnable state so they run again. If you want to be sure that these jobs run, you have to increase either the number of attempts or how long the system is allowed to keep running before it shuts down.
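As a rough illustration of the second knob, Oban’s `shutdown_grace_period` option controls how long queues wait for executing jobs to finish before a hard shutdown. The app name, repo, queue, and the 60-second value below are assumptions for the sketch, not recommendations.

```elixir
# config/config.exs
import Config

config :my_app, Oban,
  repo: MyApp.Repo,
  queues: [default: 10],
  # Give executing jobs up to 60 seconds to finish on shutdown
  # (the default is 15 seconds).
  shutdown_grace_period: :timer.seconds(60)
```

This only helps if whatever runs the container also waits at least that long before killing the process. The number of attempts is set separately, per worker, through the `max_attempts` option of `use Oban.Worker`.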