I have a bit of a mystery on my hands. I have an Oban job that backfills data in batches. It is supposed to run for hours, but it gets killed with ** (EXIT from #PID<0.39361611.0>) killed after around 5 minutes. Sometimes it’s 4.5 minutes, sometimes 6.5, but that’s the range.
I do have the timeout set to :infinity on this job. I also have other jobs that sometimes take 10, 20 or 40 minutes and do not get killed.
The job goes through quite a lot of data recursively, but it is not leaking memory: the functions are properly tail-call optimized.
If I just start the job with spawn fn -> MyWorker.perform(:ignore) end, it works as expected. Memory does not leak, it’s stable, and the script runs for hours with no issues.
But if I start it from Oban, it gets killed after 5-6 minutes.
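For context, here is a minimal sketch of the worker's general shape. It is not the real code: the queue name, the batch size, and the fetch_batch/2 and process_row/1 helpers are hypothetical stand-ins, and it assumes the standard Oban.Worker timeout/1 callback is how the timeout is set.

```elixir
defmodule MyWorker do
  # Assumption: queue name and max_attempts are placeholders, not the real config.
  use Oban.Worker, queue: :backfill, max_attempts: 1

  # Hypothetical batch size.
  @batch_size 1_000

  # Optional Oban.Worker callback; :infinity means no per-job execution timeout.
  @impl Oban.Worker
  def timeout(_job), do: :infinity

  # The argument is ignored, which is why perform(:ignore) also works outside Oban.
  @impl Oban.Worker
  def perform(_job), do: backfill(0)

  # Tail-recursive loop: the recursive call is the last expression in its clause,
  # so the stack does not grow with the number of batches.
  defp backfill(cursor) do
    case fetch_batch(cursor, @batch_size) do
      [] ->
        :ok

      rows ->
        Enum.each(rows, &process_row/1)
        backfill(cursor + length(rows))
    end
  end

  # Hypothetical stand-in for the real query: yields 5 batches, then an empty list.
  defp fetch_batch(cursor, size) when cursor < 5_000 do
    Enum.to_list(cursor..(cursor + size - 1))
  end

  defp fetch_batch(_cursor, _size), do: []

  # Hypothetical stand-in for the real per-row backfill work.
  defp process_row(_row), do: :ok
end
```

With a module like this, the two ways of starting it would be spawn(fn -> MyWorker.perform(:ignore) end) for the manual run, and MyWorker.new(%{}) |> Oban.insert() for the run through Oban.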
Does anyone have ideas what this could be?