What threshold do you use to move a job to Oban versus keeping it in the main app?

I’m still on a steep Elixir learning curve. I only recently learned about Oban. I’m trying to figure out when I should use Oban versus just doing the functionality in the main app.

All of the functionality I’m considering will be using Ecto and inserting into the database. Processing emails is a no-brainer. I plan to use Oban for that.

Here’s where I’m on the fence. I allow the user to create recurring Events that have reminders. Let’s say the user creates a recurring event that happens every week for a year. So I need to save 52 Events. Now let’s say the user configures 3 reminders for each event. I must generate those reminders after I have an event_id. So that’s 52 Events and 156 Reminders.

Would you move the generation of the Events to Oban (or other Job processor)?
Would you move the generation of the Reminders to Oban?
Or would you do them both in the main app because Ecto could do it quickly enough and these actions (creating events) would be infrequent (unlike sending mail)?
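For a sense of scale, the workload described above can be sketched in plain Elixir. This is only an illustration: the module name `RecurrenceSketch`, the start date, and the reminder offsets of 1, 3, and 7 days are all made up, and no database is involved.

```elixir
defmodule RecurrenceSketch do
  # Hypothetical helper: `count` weekly dates starting at `first`.
  def weekly_dates(%Date{} = first, count) do
    for week <- 0..(count - 1), do: Date.add(first, week * 7)
  end

  # Hypothetical helper: one {event_date, reminder_date} pair
  # per event date and per offset (in days before the event).
  def reminder_dates(event_dates, offsets_in_days) do
    for date <- event_dates, offset <- offsets_in_days do
      {date, Date.add(date, -offset)}
    end
  end
end

events = RecurrenceSketch.weekly_dates(~D[2023-01-02], 52)
reminders = RecurrenceSketch.reminder_dates(events, [1, 3, 7])

IO.inspect(length(events), label: "events")       # 52
IO.inspect(length(reminders), label: "reminders") # 156
```

A couple of hundred rows is a small batch, which is why the choice here is more about retries and visibility than about raw insert speed.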

Main criterion: use Oban if you want your background jobs to be persisted and retried when they fail, or if you have way too many of them to just naively spawn and hope for the best.

Another criterion: metrics and observability. I’d want my background tasks to report metrics, and that’s easier to achieve when they’re in their own modules, properly annotated.

I would move both Events and Reminders to Oban just because it already has :scheduled_at and :schedule_in functionality. See Oban.Job — Oban v2.13.4

Also, for the reasons dimitarvp outlined.
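A minimal sketch of what the scheduling options mentioned above could look like for a reminder. `Oban.Worker`, `new/2`, `Oban.insert/1`, and the `:schedule_in` / `:scheduled_at` options are part of Oban; the module names `MyApp.Workers.ReminderWorker` and `MyApp.Reminders`, the `:reminders` queue, and the `"reminder_id"` argument are assumptions for illustration only.

```elixir
defmodule MyApp.Workers.ReminderWorker do
  # Assumes a :reminders queue is configured for Oban in this app.
  use Oban.Worker, queue: :reminders, max_attempts: 5

  @impl Oban.Worker
  def perform(%Oban.Job{args: %{"reminder_id" => reminder_id}}) do
    # Hypothetical context function. Returning :ok marks the job as done;
    # returning {:error, reason} or raising lets Oban retry it.
    MyApp.Reminders.deliver(reminder_id)
  end
end

# Run at a specific time (a DateTime in UTC):
%{reminder_id: 123}
|> MyApp.Workers.ReminderWorker.new(scheduled_at: ~U[2023-06-01 09:00:00Z])
|> Oban.insert()

# Or run a number of seconds from now:
%{reminder_id: 123}
|> MyApp.Workers.ReminderWorker.new(schedule_in: 3_600)
|> Oban.insert()
```

Note that the args are stored as JSON, so the atom key `reminder_id` arrives in `perform/1` as the string `"reminder_id"`.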

I’m afraid I’m too much of a beginner to thoroughly understand your reply. Hoping you don’t mind a couple more questions.

What did you mean by “naively spawn”? Do you mean that something like a Task may not be able to manage the load?

Regarding metrics and observability … do you mean Telemetry (which is on my long list of things to learn)? Or would I need to write my own code to capture metrics? I’ve read about how to capture failed attempts, but I had not thought about metrics.

I’ve also been reading about this concept of “just let it die”. I’m trying to figure out when I need to report things to the user and when the lack of results speaks for itself. So if I used Oban for both the Events and Reminders and the Events couldn’t be generated … is it proper to observe that behavior and report it to the user? Or just let the user try again because they never saw the Event created?

I know some of these questions are subjective decisions but I’d really like to know what experts in Elixir would do. I don’t understand performance issues enough to have a basis for these decisions. Any guidance is most appreciated!

Yes. You are basically spawning background tasks in the manner of fire-and-forget. If something fails you’re very likely to never know (still depends on your setup of course, but since you’re a beginner you likely haven’t included integration with e.g. AppSignal, Honeycomb and the like).
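To make the fire-and-forget point concrete, here is a small sketch using only the standard library. The first task fails and the caller never hears about it (the crash only shows up in the logs); the second is started with `Task.Supervisor.async_nolink/2` so the caller can inspect the outcome. The error message and the 5-second timeout are arbitrary.

```elixir
{:ok, sup} = Task.Supervisor.start_link()

# Fire-and-forget: the caller gets {:ok, pid} back and nothing else.
# If the function raises, the only trace is a log entry.
{:ok, _pid} = Task.Supervisor.start_child(sup, fn -> raise "insert failed" end)

# Observed: the caller is not linked to the task, but can still
# wait for the result and react to a crash.
task = Task.Supervisor.async_nolink(sup, fn -> raise "insert failed" end)

outcome =
  case Task.yield(task, 5_000) || Task.shutdown(task) do
    {:ok, value} -> {:ok, value}
    {:exit, reason} -> {:error, reason}
    nil -> {:error, :timeout}
  end

IO.inspect(outcome, label: "outcome")
```

Neither variant retries or survives a restart; that part would still have to be written by hand.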

Yes. It’s a can of worms and a deep well to dive into so friendly advice: pick the shortest and quickest guide you can find, make sure you get some metrics and forget about it for a while. It can swallow a ton of time to configure 100% like you want it.
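As a small starting point, assuming Oban is already in the project: Oban emits Telemetry events for jobs, and a handler attached with `:telemetry.attach_many/4` can log them. The event names below are the ones Oban documents for job execution; the module name `MyApp.ObanLogger`, the handler id, and the log format are made up for this sketch.

```elixir
defmodule MyApp.ObanLogger do
  require Logger

  # Call once at application start, e.g. from Application.start/2.
  def attach do
    :telemetry.attach_many(
      "my-app-oban-logger",
      [[:oban, :job, :stop], [:oban, :job, :exception]],
      &__MODULE__.handle_event/4,
      nil
    )
  end

  # `duration` is reported in native time units, so convert before logging.
  def handle_event([:oban, :job, event], measurements, %{job: job}, _config) do
    ms = System.convert_time_unit(measurements.duration, :native, :millisecond)

    Logger.info(
      "oban job worker=#{job.worker} event=#{event} " <>
        "attempt=#{job.attempt} duration=#{ms}ms"
    )
  end
end
```

The same handler is where a call to an error tracker or metrics library would go later on.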

Example: it’s a horrible user experience to click “Send me a password link” and the UI to show you absolutely nothing (in case something failed on the backend). You can start from imagining such situations and putting yourself in the shoes of the user of the system.

“Let it die” is thrown around too liberally when advertising Erlang / Elixir. Like 99% of all commercial projects absolutely need to know if something died, why, how many times it has been retried, and whether it eventually succeeded. This is harder to achieve by just using Task (but I’ll repeat again: it’s doable and I’ve personally done it – it can be a much better fit than a full-blown background job manager like Oban; it truly depends on your project needs).

Invest in having your code fail loudly so you can receive notifications in your monitoring dashboard – AppSignal, Honeycomb, NewRelic, Rollbar etc. This gives you a central place to look at the manifestation of obvious bugs which you then can fix whenever.

TL;DR: make sure what’s happening inside your app is visible.

IMHO, Telemetry should be higher than Oban on the list of things to learn. If you don’t need job persistence across reboots and you don’t have so many jobs that they can cause a logjam, a simple GenServer can serve as a rudimentary job queue. However, you would need some metrics even at that stage.
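A rough sketch of such a rudimentary queue, with counters standing in for real metrics. The module name `SimpleQueue` and its functions are invented for this example. Jobs run one at a time inside the GenServer process, and everything still waiting in the mailbox is lost if the process or the node goes down.

```elixir
defmodule SimpleQueue do
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  # Enqueue a zero-arity function; returns immediately.
  def enqueue(server, fun) when is_function(fun, 0) do
    GenServer.cast(server, {:enqueue, fun})
  end

  # Counts of jobs that finished and jobs that raised.
  def stats(server), do: GenServer.call(server, :stats)

  @impl true
  def init(:ok), do: {:ok, %{ok: 0, failed: 0}}

  @impl true
  def handle_cast({:enqueue, fun}, state) do
    # Only exceptions are rescued here; a throw or exit would
    # still take the queue process down.
    key =
      try do
        fun.()
        :ok
      rescue
        _error -> :failed
      end

    {:noreply, Map.update!(state, key, &(&1 + 1))}
  end

  @impl true
  def handle_call(:stats, _from, state), do: {:reply, state, state}
end

{:ok, queue} = SimpleQueue.start_link()
SimpleQueue.enqueue(queue, fn -> :timer.sleep(10) end)
SimpleQueue.enqueue(queue, fn -> raise "boom" end)

# The call is handled after both casts, so both jobs have run by now.
IO.inspect(SimpleQueue.stats(queue), label: "stats") # %{ok: 1, failed: 1}
```

There are no retries, no scheduling, and no persistence here, which is the gap a library like Oban fills.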

Thank you for the advice! I’m realizing from both of your comments that metrics are a first-class citizen, like documentation and testing. Back to the books!

Is there a reason why I couldn’t just use Oban for everything? I realize that it may be overkill for some things, but does that matter? I feel like it is safer to use Oban (developed by someone 1000x more experienced than me) than for me to try to write a GenServer that can recover in case the GenServer goes down. I really like the idea that I can send things off to Oban and it will store the args in the database in case things crash.