Doing something every few hours, the simplest approach?

I would like to perform an action periodically, and I am looking for the simplest reasonable way to do this.

As an example, let’s say I want to delete old entries from a db table, every few hours. The action is idempotent, relatively inexpensive, precision “doesn’t matter,” and concurrent executions of multiple runs (e.g. from multiple replicas of the service) are not a problem (db transactions will handle them safely).

A commonly recommended approach for doing this seems to be a variation of using a GenServer with Process.send_after (or, I suppose, spinning up an Oban job ;).

This makes a lot of sense, but I coded up just running an infinite supervised “do”+“sleep” recursion, and it seems to work, and it seems ridiculously concise and simple, and I can’t explain to myself why that wouldn’t be enough.

Is there any reason why I shouldn’t do this? ; )

If you have any thoughts / guidance on this, I would much appreciate them!

Thank you!

PS An example of what this might look like in code:

application.ex (starts the task, restart: :permanent):

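defmodule MyApp.Application do
    use Application

    def start(_type, _args) do
        # Assumed value for "every few hours"; adjust as needed.
        sleep_ms = :timer.hours(3)

        children = [
           # ... other children ...
           Supervisor.child_spec(
               {Task, fn -> MyApp.Sweeper.delete_old_data(sleep_ms) end},
               restart: :permanent
           )
           # ... other children ...
        ]
        Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
    end
end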

where delete_old_data() just keeps doing the thing and sleeping, forever:

defmodule MyApp.Sweeper do
    def delete_old_data(wait_ms) do
        # ... delete old things ...
        Process.sleep(wait_ms)
        delete_old_data(wait_ms)
    end
end
2 Likes

Maybe use Oban? See “Reliable Scheduled Jobs” in the Oban v2.14.2 documentation.

1 Like

You are not supervised if you do so… what happens if “delete old things” fails?

UPDATE: Yes, you are… sorry, I didn’t read your code properly :slight_smile:

1 Like

What happens when this needs to shut down? I assume it will snooze right through the first round of “polite” notifications from its supervisor…

Yes, this is exactly what I am curious about – can my solution be more lightweight than introducing a dependency (especially one as rich as Oban, with its own database dependency, schemas, etc.)?

I don’t need persistence for my “job” (which is part of what Oban provides) I just need to call a function every once in a while, for as long as my application is running.

Can a simple BEAM process (with receive after) and a supervisor be sufficient?
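For illustration, a minimal sketch of what the receive/after variant could look like. The module name `MyApp.ReceiveSweeper`, the `:work` and `:wait_ms` options, and the `:run_now` message are assumptions made up for this example; `work` stands in for the real deletion. Compared with `Process.sleep/1`, the `receive` block lets the loop wake up early when asked to.

```elixir
defmodule MyApp.ReceiveSweeper do
  # Hypothetical module; `work` stands in for the real "delete old entries" call.
  def child_spec(opts) do
    work = Keyword.fetch!(opts, :work)
    wait_ms = Keyword.fetch!(opts, :wait_ms)

    %{
      id: __MODULE__,
      start: {Task, :start_link, [fn -> loop(work, wait_ms) end]},
      restart: :permanent
    }
  end

  def loop(work, wait_ms) do
    work.()

    receive do
      # Hypothetical message that triggers the next run immediately.
      :run_now -> :ok
    after
      # No message within wait_ms: fall through to the next run.
      wait_ms -> :ok
    end

    loop(work, wait_ms)
  end
end
```

It would be listed in the supervisor's children as something like `{MyApp.ReceiveSweeper, work: &MyApp.Sweeper.delete_old_data/0, wait_ms: :timer.hours(3)}`, where the zero-arity function is again an assumption.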

1 Like

What happens when this needs to shut down? I assume it will snooze right through the first round of “polite” notifications from its supervisor…

This is a good point, but does it matter?

It seems like the process will be killed within 5 seconds (I am just using the default :shutdown value, per the Supervisor documentation for Elixir v1.14.3), which seems fine – I am obviously only using delete_old_data() for its side effects, and I do want the process to disappear (whether it is sleeping or asking the db to do the pruning) when the application shuts down.

Is there any reason I should care?
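One way to check the shutdown behaviour is a small script like the sketch below. As far as I understand it, the supervisor sends the child an exit signal with reason `:shutdown`, and a process that does not trap exits (such as a plain Task) terminates on that signal even in the middle of `Process.sleep/1`; the 5-second timeout should only come into play for children that trap exits. The sleeping task here is a stand-in for the sweeper between two runs.

```elixir
# A task that sleeps "forever", like the sweeper between two runs.
sleeper =
  Supervisor.child_spec(
    {Task, fn -> Process.sleep(:infinity) end},
    restart: :permanent
  )

{:ok, sup} = Supervisor.start_link([sleeper], strategy: :one_for_one)

# Stopping the supervisor shuts down its children, using the default
# shutdown timeout because the child spec does not override it.
{micros, :ok} = :timer.tc(fn -> Supervisor.stop(sup) end)

IO.puts("supervisor stopped in #{div(micros, 1000)} ms")
```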

1 Like

I would use a GenServer at least so you have it in a file somewhere and not in application.ex. Not sure what you’d gain from going the task or process method? Are you trying to reinvent a wheel or save some LOC? It certainly makes it more work to extend and hides the code away.

If you had Oban as a dep already, that would probably be the right choice.

2 Likes

Not sure what you’d gain

Well, he did say he was aiming at simplest, and I’d actually tend to agree that a module not importing a separately defined behavior is a bit simpler…at least in principle? I’m also interested in anything more substantial he might lose though by starting here.

For myself I usually start with Quantum unless I know I’m going to need persistence. I do always know I am going to need scheduling.

2 Likes

It certainly works the way you did it, but I think it is more idiomatic to keep the details out of the application file. The very least you could do is to put the child_spec/1 inside the MyApp.Sweeper module…

defmodule MyApp.Application do
    use Application

    def start(_type, _args) do
        children = [
           # ... other children ...
           MyApp.Sweeper
           # ... other children ...
        ]
        Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
    end
end

defmodule MyApp.Sweeper do
  # Assumed value for "every few hours"; adjust as needed.
  @sleep_ms :timer.hours(3)

  def child_spec(_) do
    Supervisor.child_spec(
      {Task, fn -> MyApp.Sweeper.delete_old_data(@sleep_ms) end},
      restart: :permanent
    )
  end
end

but really I think a standard GenServer + send_after / :timer.send_interval is the way to go for this… easier to read and understand :slight_smile:

defmodule MyApp.Sweeper do
  use GenServer

  @impl true
  def init([]) do
    # :timer.send_interval/3 takes the interval first, then the destination pid.
    {:ok, _tref} = :timer.send_interval(:timer.hours(5), self(), :delete_old_data)
    {:ok, []}
  end

  @impl true
  def handle_info(:delete_old_data, state) do
    # delete old data here, or spawn a Task doing it to keep the Sweeper responsive
    {:noreply, state}
  end

  def start_link([]) do
    GenServer.start_link(__MODULE__, [])
  end
end
4 Likes

Quantum seems a great fit.

2 Likes

came here to say this. Quantum is both reliable and readable. custom solutions may be reliable (thanks to supervisors), but they are very annoying to read a few months down the line.

1 Like

If you want the shortest possible out-of-the-box solution with no external dependency, there’s :timer.apply_interval.

Such a solution should be no worse than a custom GenServer + send_after.
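A minimal sketch of the `:timer.apply_interval/4` route, using a short interval so it can be tried in a script. The module `MyApp.IntervalDemo` and its function are hypothetical stand-ins for the real clean-up code; real usage would pass something like `:timer.hours(3)` and call it from a long-lived process.

```elixir
defmodule MyApp.IntervalDemo do
  # Hypothetical stand-in for the real clean-up function.
  def delete_old_data(notify_pid) do
    send(notify_pid, :swept)
  end
end

# 50 ms keeps the demo quick; an application would use hours here.
{:ok, tref} =
  :timer.apply_interval(50, MyApp.IntervalDemo, :delete_old_data, [self()])

receive do
  :swept -> IO.puts("first run done")
end

# Cancel the timer once it is no longer needed.
{:ok, :cancel} = :timer.cancel(tref)
```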

My preferred approach though is a GenServer which starts the task as a child process. It also has to be one GenServer per periodic job, so I can inject each job in the proper place in a supervision tree. This is a big reason why I don’t like and avoid Quantum & similar libs.
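To make the idea concrete, here is a rough sketch of a per-job GenServer that runs each tick in a separate linked process. It is only an illustration of the principle, not the periodic abstraction mentioned in this post; the module name `MyApp.PeriodicJob` and the `:run` and `:every` options are assumptions.

```elixir
defmodule MyApp.PeriodicJob do
  # Hypothetical module: one instance per periodic job, placed wherever
  # the job belongs in the supervision tree.
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  @impl true
  def init(opts) do
    state = %{
      run: Keyword.fetch!(opts, :run),
      every: Keyword.fetch!(opts, :every)
    }

    schedule(state)
    {:ok, state}
  end

  @impl true
  def handle_info(:tick, state) do
    # The job runs in its own linked process, so the GenServer itself
    # stays responsive while the work is in progress.
    {:ok, _pid} = Task.start_link(state.run)
    schedule(state)
    {:noreply, state}
  end

  defp schedule(state) do
    Process.send_after(self(), :tick, state.every)
  end
end
```

Because the task is linked and the GenServer does not trap exits, a crashing job also takes the GenServer down and the supervisor restarts it. Several jobs under one supervisor would need distinct ids, for example via `Supervisor.child_spec({MyApp.PeriodicJob, opts}, id: :sweeper)`.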

Instead I wrote my own periodic abstraction which follows the principles outlined above. I blogged a bit about it here.

6 Likes

I also took this approach. As a result I can pause, speed up, and slow down the interval by making calls to a job’s GenServer. When the GenServer goes down (by calling ‘halt’ or due to an error) the state is saved and used in recovery.
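A sketch of how pausing and changing the interval through calls might look. This is not the implementation described in this post: the module name and API are assumptions, and saving state for recovery is left out. Each scheduled tick carries a fresh reference, so ticks scheduled before a pause or an interval change are ignored.

```elixir
defmodule MyApp.AdjustableJob do
  # Hypothetical module and API, for illustration only.
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)
  def pause(pid), do: GenServer.call(pid, :pause)
  def resume(pid), do: GenServer.call(pid, :resume)
  def set_interval(pid, ms), do: GenServer.call(pid, {:set_interval, ms})

  @impl true
  def init(opts) do
    state = %{
      run: Keyword.fetch!(opts, :run),
      every: Keyword.fetch!(opts, :every),
      ref: nil
    }

    {:ok, schedule(state)}
  end

  @impl true
  def handle_call(:pause, _from, state), do: {:reply, :ok, %{state | ref: nil}}
  def handle_call(:resume, _from, state), do: {:reply, :ok, schedule(state)}

  def handle_call({:set_interval, ms}, _from, %{ref: nil} = state) do
    # Paused: remember the new interval, do not schedule anything.
    {:reply, :ok, %{state | every: ms}}
  end

  def handle_call({:set_interval, ms}, _from, state) do
    {:reply, :ok, schedule(%{state | every: ms})}
  end

  @impl true
  def handle_info({:tick, ref}, %{ref: ref} = state) do
    # Only the most recently scheduled tick matches the stored reference.
    state.run.()
    {:noreply, schedule(state)}
  end

  def handle_info({:tick, _stale_ref}, state), do: {:noreply, state}

  defp schedule(state) do
    ref = make_ref()
    Process.send_after(self(), {:tick, ref}, state.every)
    %{state | ref: ref}
  end
end
```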

An improvement I envision is using a state machine. Can’t recall why, but it had something to do with the interval and allowed transitions :slight_smile:

1 Like

Seconding @sasajuric and @BartOtten here, just roll your own GenServer per task – it’s just 10-20 lines of boilerplate at most.

Or use Periodic. It’s a very small and functional thing (last I used it at least, which was like 3 years ago). Or you can rip out the code you need from it because again, it’s very small and works fine.

I would make a few GenServers though. It’s a one-time investment that can pay huge dividends. If you want to get slightly fancy maybe you can store “when was the last time task X ran” in a small file; though you mentioned that it’s not critical if a task gets executed a bit more rarely every now and then (when the app is rebooted) so if that’s indeed the case then it’s probably best to not bother with keeping state outside of memory.

3 Likes

I didn’t use any of these, but the cron syntax may be nice to have.

Jumping on here. What do we mean by “simple”? :grinning:

I’m using Quantum (shared by @Sebb above) for a process that I need to fire off every 24 hours. I don’t care if it has a response or not, and I gave the process a ripcord to alert me if it fails catastrophically. I initially began building a GenServer for my task before I realized I was over-engineering for this particular case. Implementing Quantum took me slightly longer than it takes to type mix deps.get and hit return. Simple.
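For reference, a Quantum setup for the sweeper from the original question might look roughly like the fragment below. It assumes the `:quantum` dependency has been added, that the OTP app is called `:my_app`, and that a zero-arity `MyApp.Sweeper.delete_old_data/0` exists; all three are assumptions for this sketch, as is the choice of "every three hours".

```elixir
# lib/my_app/scheduler.ex
defmodule MyApp.Scheduler do
  use Quantum, otp_app: :my_app
end

# config/config.exs
import Config

config :my_app, MyApp.Scheduler,
  jobs: [
    # Cron syntax: at minute 0 of every third hour.
    {"0 */3 * * *", {MyApp.Sweeper, :delete_old_data, []}}
  ]
```

`MyApp.Scheduler` then goes into the application's supervision tree like any other child.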

If I needed to maintain state or recover a user session in a meaningful way, I would definitely go with a “simple” GenServer implementation. But this entails more code to maintain and test. On the other hand, a GenServer may be more flexible in the long run if my organization adds more requirements to the task.

For me, the most important consideration for this kind of “how long is a piece of string” question is always: what’s the right tool for the job?

How much time do you have to implement it? Is the job this task performs actually important? Can you commit to maintaining it? Do you need crash protection? Will the application fail if this piece isn’t running reliably? Or is this job a nice-to-have bit that’s not going to inconvenience anyone if it doesn’t run every day?

Simple in one regard probably means tradeoffs in another, right?

1 Like

Wouldn’t the simplest case be to create a mix/release task and just execute it via Linux/Unix crontab? If your task outputs something to the console in case of errors and a mail server at localhost is set up, then you will also start to receive e-mails in case of unexpected errors. This all works pretty much out of the box. In addition, it’s also really easy to start these tasks manually from the command line if there’s a need.

Of course this kind of approach is only usable if you are running Linux/Unix, you have access to the server, and it doesn’t matter if the previous task has not finished before a new one is launched. Otherwise a locking mechanism needs to be implemented, but this is also pretty trivial: create a “lock” file when starting the task and delete it at the end.
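The lock-file idea could be sketched in Elixir along these lines, for use inside a mix or release task. The helper module `MyApp.LockedRun` and the lock path are hypothetical. One caveat of this simple scheme: if the VM is killed mid-run, the stale lock file has to be removed by hand.

```elixir
defmodule MyApp.LockedRun do
  # Hypothetical helper: runs `fun` only if the lock file can be created.
  def run(lock_path, fun) do
    # :exclusive makes File.open/2 fail with :eexist if the file already exists.
    case File.open(lock_path, [:write, :exclusive]) do
      {:ok, file} ->
        try do
          {:ok, fun.()}
        after
          File.close(file)
          File.rm(lock_path)
        end

      {:error, :eexist} ->
        {:error, :already_running}

      {:error, reason} ->
        {:error, reason}
    end
  end
end

# Example usage with an assumed lock location in the system temp directory.
lock = Path.join(System.tmp_dir!(), "my_app_sweeper.lock")
MyApp.LockedRun.run(lock, fn -> IO.puts("deleting old data") end)
```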

2 Likes