I am using Oban to create a group of tasks, all linked to the same parent. Once every task is complete, I want to record that on the parent so a different status can be rendered to the user.
In addition, there is no guarantee on the order or concurrency of the executed tasks: some might fail and require a retry, and the server might be restarted when a new deploy is done (which is why Oban is used in the first place).
I was thinking of a few approaches:
1. Execute a query at the end of every job to check whether the other jobs are complete, and if so, update the status. I do have my fears about this not working correctly in a concurrent setting; maybe someone knows more on this topic.
2. Have a background worker that periodically checks the number of completed jobs and updates the status. This of course implies some delay and extra queries, but for my project it is a viable solution.
3. Monitor the tasks with a process, and re-synchronize the process from the database when the server is restarted.
If you have a better idea, it would be great to hear a different perspective.
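On the concurrency worry in the first approach: two jobs that finish at the same moment can each see the other as still executing, so neither updates the parent. One way around that, as a rough sketch only, is to keep a counter on the parent and decrement it atomically as the last step of each job. Everything named here is an assumption for illustration: `MyApp.Repo` (an Ecto repo on PostgreSQL), a `parents` table with an integer `pending_tasks` column and a string `status` column, and the `MyApp.ParentProgress` module itself.

```elixir
defmodule MyApp.ParentProgress do
  import Ecto.Query

  # Hypothetical helper: call it as the last step of each child job.
  # Assumes "parents" has `pending_tasks` (integer) and `status` (string).
  def task_finished(parent_id) do
    # UPDATE ... SET pending_tasks = pending_tasks - 1 ... RETURNING pending_tasks
    decrement =
      from(p in "parents",
        where: p.id == ^parent_id,
        select: p.pending_tasks
      )

    MyApp.Repo.transaction(fn ->
      case MyApp.Repo.update_all(decrement, inc: [pending_tasks: -1]) do
        # The row lock serializes concurrent decrements, so only one
        # caller can observe the counter reaching zero.
        {1, [0]} ->
          finish = from(p in "parents", where: p.id == ^parent_id)
          MyApp.Repo.update_all(finish, set: [status: "complete"])
          :complete

        {1, [_remaining]} ->
          :pending

        {0, _} ->
          MyApp.Repo.rollback(:parent_not_found)
      end
    end)
  end
end
```

One caveat: if a job crashes after the decrement commits but before Oban marks it completed, the retry decrements a second time. Recording each finished task in its own row with a unique constraint would make this idempotent, at the cost of more code.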
Though I’m not 100% sure. If you add DB updating logic at the end of each job in the chunk, and have another job sniffing the database and waiting for the data structure to say “all related sub-jobs have completed successfully”, then it will likely achieve what you need.
Though now that I think of it, that could be achieved with Oban’s free version as well.
I never got to that, as the project got frozen for the time being. However, implementing the solution I outlined as number 2 should be more than good enough if you don’t care about this happening instantly.
This can be as easy as creating a GenServer that queries the database every N seconds or minutes and updates the status. It should take only a few lines of code and is hard to get wrong.
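A minimal sketch of such a poller, assuming the actual "find finished parents and update their status" query lives in a function you pass in. The module name `ParentStatusPoller` and the `:check_fun` and `:interval` options are made up for this example.

```elixir
defmodule ParentStatusPoller do
  @moduledoc "Runs a caller-supplied check function on a fixed interval."
  use GenServer

  @default_interval :timer.seconds(30)

  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  @impl true
  def init(opts) do
    state = %{
      # Zero-arity function that queries the database and updates statuses.
      check_fun: Keyword.fetch!(opts, :check_fun),
      interval: Keyword.get(opts, :interval, @default_interval)
    }

    schedule(state.interval)
    {:ok, state}
  end

  @impl true
  def handle_info(:poll, state) do
    # If the check raises, the process crashes and its supervisor restarts it.
    state.check_fun.()
    schedule(state.interval)
    {:noreply, state}
  end

  defp schedule(interval), do: Process.send_after(self(), :poll, interval)
end
```

It can be added to a supervision tree as `{ParentStatusPoller, check_fun: &MyApp.Parents.mark_finished/0, interval: :timer.minutes(1)}`, where `MyApp.Parents.mark_finished/0` is a placeholder for your own query. Since the state lives in the database, a restart after a deploy just resumes polling. With several nodes each one runs its own poller, so the update should be safe to run more than once.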
That looks like a good idea. However, I have a concern: I tried using Batch in the past and dropped it because we handle our own retries. We wanted to retry jobs only for specific errors, not all of them, so we insert each retry as a new job. I believe I may run into the same issue here with Workflow.
Errors are recorded in the errors array on the job anyhow. So if you return {:error, error}, or raise, or crash, that’s listed in errors, but it’s not duplicated as a recorded value.
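To illustrate the return values, here is a small worker sketch. The worker name, the `@retryable` list, and the stubbed `do_work/1` are all hypothetical. It relies on the documented `perform/1` results: `{:error, reason}` records the error and lets Oban retry up to `max_attempts`, while `{:cancel, reason}` records the error and stops retrying. Whether this kind of selective retry fits with Batch or Workflow is not something the sketch addresses.

```elixir
defmodule MyApp.ChildTaskWorker do
  use Oban.Worker, queue: :default, max_attempts: 5

  # Hypothetical list of errors that are worth another attempt.
  @retryable [:timeout, :rate_limited]

  @impl Oban.Worker
  def perform(%Oban.Job{args: args}) do
    case do_work(args) do
      {:ok, _result} ->
        :ok

      # Recorded in the job's errors; Oban retries with backoff.
      {:error, reason} when reason in @retryable ->
        {:error, reason}

      # Recorded in the job's errors; the job is cancelled, not retried.
      {:error, reason} ->
        {:cancel, reason}
    end
  end

  # Stub standing in for the real work, so the example is self-contained.
  defp do_work(%{"simulate" => "timeout"}), do: {:error, :timeout}
  defp do_work(%{"simulate" => "bad_input"}), do: {:error, :invalid_payload}
  defp do_work(_args), do: {:ok, :done}
end
```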
Hi, yes, I’m aware. Unfortunately, for our use case we need the error value recorded, because we save it in a different table as part of our data, and retrieving it by querying the oban_jobs table is not feasible. The error message in the errors array is also converted into a string, so we would need to re-parse it in that case. However, we do find the errors array useful for viewing the raw errors.