How to make a Supervisor **never** die?

Can you provide a link to your talk? I would really like to hear your opinion on this!

It isn’t always practical to introduce a delay, but where it is, I’ve used the following trick to let a failing worker “keep trying indefinitely” without hitting max restart intensity. In my worker, I use Process.send_after/3 or :timer.sleep/1 to introduce a delay before executing the code that might fail. If the delay is longer than the max_seconds option you passed to Supervisor.start_link/2, then even if the worker fails repeatedly, it won’t fail often enough to exceed max restart intensity (assuming max_restarts is at least 1). It’s not elegant, but it is simple. Obviously this only suits certain cases; often a delay won’t be acceptable.
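For anyone who wants to see the delay trick spelled out, here is a minimal sketch, not a definitive implementation. The module names RetryingWorker and RetryingDemo, the :delay_ms and :work options, and the start/3 helper are all made up for illustration; only GenServer, Supervisor.start_link/2 and Process.send_after/3 come from the standard library. Note that max_seconds is in seconds while the delay is in milliseconds.

```elixir
defmodule RetryingWorker do
  # Hypothetical worker: waits :delay_ms before running the risky :work function.
  use GenServer

  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts)
  end

  @impl true
  def init(opts) do
    # Schedule the risky work instead of running it right away, so init/1
    # returns quickly and every attempt is preceded by the full delay.
    delay_ms = Keyword.fetch!(opts, :delay_ms)
    Process.send_after(self(), :run, delay_ms)
    {:ok, opts}
  end

  @impl true
  def handle_info(:run, opts) do
    # If this raises, the process crashes and the supervisor restarts it;
    # the fresh process then waits :delay_ms again before retrying.
    work = Keyword.fetch!(opts, :work)
    work.()
    {:noreply, opts}
  end
end

defmodule RetryingDemo do
  # Hypothetical helper that starts the worker under a one_for_one supervisor.
  # delay_ms should be greater than max_seconds * 1000.
  def start(work, delay_ms, max_seconds) do
    children = [{RetryingWorker, delay_ms: delay_ms, work: work}]

    Supervisor.start_link(children,
      strategy: :one_for_one,
      max_restarts: 3,
      max_seconds: max_seconds
    )
  end
end
```

With the default max_seconds of 5, something like `RetryingDemo.start(fn -> raise "boom" end, 6_000, 5)` should keep crashing and restarting roughly every six seconds without ever taking the supervisor down. Once the work function succeeds, the worker in this sketch simply stays idle.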

There is also supervisor3 from Klarna (based on RabbitMQ’s version) if you are willing to use Erlang.

supervisor3 supports delayed infinite restarts.

From the README: "Child specifications can contain, as the restart type, a tuple {permanent, Delay}"

For example the Kafka client brod uses it here.

This saved me a ton of time. Thank you very much for this hack, @brucepomeroy.