I’m implementing a controller that schedules a background job and returns.
The job itself is implemented as a GenServer Foo.
I wanted this to be reliable: Foo is restarted if it fails, different Foo jobs are distributed, and I have a record of what went wrong and what went right.
Coming from Ruby, I wanted guarantees similar to Sidekiq’s, so I decided to use Oban workers.
I created an Oban worker that calls start_link on the GenServer Foo and then waits in a receive for a message from Foo saying that it has completed its job.
I thought that if Foo died, the worker would die as well and Oban would restart it. Wrong: an Oban worker will only be restarted if it raises an error, not if it gets killed.
So instead, I call Process.monitor on Foo’s pid and check whether the message I receive is the completion message from Foo or a :DOWN message. If it’s the latter, I raise an exception so that Oban retries the job later.
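To make the monitor-and-receive logic concrete, here is a minimal sketch with Oban stripped out. The module name `AwaitSketch`, the `{:foo_done, pid, result}` message shape, and the timeout are all made up for illustration, and the job process is started without a link so that only the monitor reports its death:

```elixir
defmodule AwaitSketch do
  # Hypothetical stand-in for the body of the worker's perform/1.
  # `start_fun` receives the caller's pid (the "call me back" pid) and must
  # return {:ok, pid} for an unlinked job process.
  def run(start_fun, timeout \\ 5_000) do
    {:ok, pid} = start_fun.(self())
    ref = Process.monitor(pid)

    receive do
      # Completion message sent by the job process (assumed shape).
      {:foo_done, ^pid, result} ->
        # Drop the monitor and any :DOWN message already in the mailbox.
        Process.demonitor(ref, [:flush])
        {:ok, result}

      # The job process died before reporting completion.
      {:DOWN, ^ref, :process, ^pid, reason} ->
        raise "Foo exited before completing: #{inspect(reason)}"
    after
      timeout ->
        Process.demonitor(ref, [:flush])
        raise "timed out waiting for Foo"
    end
  end
end
```

Calling `AwaitSketch.run/1` with a function that spawns a process which sends `{:foo_done, self(), :processed}` back to the caller returns `{:ok, :processed}`. If the spawned process exits without sending that message, the call raises a RuntimeError, which is the point where Oban would see a failure.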
This works, but I feel like I’m suddenly implementing some kind of supervisor mechanism myself, and I should probably use mechanisms that are already present in Elixir. What would you suggest here? Also, having to pass a “call me back” pid to Foo so that it can report successful completion seems like unnecessary coupling. Maybe this is not a use case for Oban?
For those interested in details: the job is to download a huge file and process it. The controller endpoint just accepts the URL of the huge file and schedules this job.