Hey there,
Is there a way to configure a Supervisor such that when a child terminates, it will wait a random amount of time before restarting it?
I am trying to make a realistic simulation in which GenServers randomly fail and come back. At the moment, restarts always seem to be instant, but in the real world this is not always the case. One thought was to add a random sleep in the start_link function of my GenServers. However, start_link is synchronous, so this would make my whole Supervisor sleep as well, which is also undesirable.
I’m looking forward to your insights!
If you need delays between restarts, I'd suggest managing that separately from the supervision tree and its restarts.
Supervisor restarts are meant to keep your system in a working state, not in a partially down state. A parent supervisor does attempt to restart the child, but that's the only mechanism it has to keep the system up. If restarts don't help, the failure will propagate up the supervision tree.
This is not a tool to handle temporary outages, where waiting would also be an acceptable way of restoring a working state. You'd want to encode the handling of such failures in your own business logic or use an alternative supervisor implementation.
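To make the "manage it separately" idea concrete, here is a minimal sketch of a process that monitors a single worker and starts a replacement after a random delay. The module name DelayedRestarter and the :start_fun and :max_delay_ms options are hypothetical, made up for this illustration. It also assumes the worker is started unlinked (for example with GenServer.start/3 or Agent.start/1) so that a worker crash does not take the manager down with it.

```elixir
defmodule DelayedRestarter do
  @moduledoc """
  Hypothetical helper: keeps one worker alive, but waits a random
  amount of time before starting a replacement after it goes down.
  """
  use GenServer

  # opts (all names are assumptions of this sketch):
  #   :start_fun    - zero-arity function returning {:ok, pid}; must start
  #                   the worker UNLINKED, e.g. fn -> Agent.start(...) end
  #   :max_delay_ms - upper bound for the random restart delay
  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  # Returns the current worker pid, or nil while waiting to restart.
  def worker(server), do: GenServer.call(server, :worker)

  @impl true
  def init(opts) do
    state = %{
      start_fun: Keyword.fetch!(opts, :start_fun),
      max_delay_ms: Keyword.get(opts, :max_delay_ms, 5_000),
      pid: nil
    }

    {:ok, start_worker(state)}
  end

  @impl true
  def handle_call(:worker, _from, state), do: {:reply, state.pid, state}

  @impl true
  def handle_info({:DOWN, _ref, :process, pid, _reason}, %{pid: pid} = state) do
    # The worker died: schedule a restart instead of restarting right away.
    # :rand.uniform/1 returns an integer between 1 and max_delay_ms.
    Process.send_after(self(), :restart, :rand.uniform(state.max_delay_ms))
    {:noreply, %{state | pid: nil}}
  end

  def handle_info(:restart, state), do: {:noreply, start_worker(state)}

  defp start_worker(state) do
    {:ok, pid} = state.start_fun.()
    # Monitor rather than link, so we only receive a :DOWN message.
    Process.monitor(pid)
    %{state | pid: pid}
  end
end
```

The manager itself can sit under a regular Supervisor. Because the delay is scheduled with Process.send_after/3, the manager stays responsive while the worker is "down", which is roughly the partially-down state the simulation is after.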
Built-in supervisor? No, but there are other implementations that allow for that:
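You can add a handle_continue clause to your GenServer and sleep there.
Something like:
# init/1 takes the argument passed to GenServer.start_link
def init(_args) do
  ...
  # Return right away and defer the delay to handle_continue/2
  {:ok, state, {:continue, :random_sleep}}
end
def handle_continue(:random_sleep, state) do
  # :rand.uniform/1 takes only an upper bound and returns an integer in 1..112_345 (ms)
  Process.sleep(:rand.uniform(112_345))
  {:noreply, state}
end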
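This won't block the supervisor, because init returns before the sleep starts. The process does count as started during the sleep, though, and it won't handle any messages until the sleep is over.
And yeah, in the real world restarts are pretty much instant too.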
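For reference, here is a self-contained version of the handle_continue approach that can be pasted into iex. The module name FlakyWorker, the :max_delay_ms option and the :ready? call are hypothetical additions for the sake of the example. Treat it as a sketch rather than a recommended pattern.

```elixir
defmodule FlakyWorker do
  use GenServer

  # Hypothetical default upper bound on the simulated downtime, in ms.
  @max_delay_ms 5_000

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, opts)
  end

  @impl true
  def init(opts) do
    max_delay = Keyword.get(opts, :max_delay_ms, @max_delay_ms)
    # Return immediately so the supervisor is not blocked;
    # the delay itself happens in handle_continue/2.
    {:ok, %{ready?: false, max_delay: max_delay}, {:continue, :random_sleep}}
  end

  @impl true
  def handle_continue(:random_sleep, state) do
    # :rand.uniform/1 returns an integer between 1 and max_delay.
    Process.sleep(:rand.uniform(state.max_delay))
    {:noreply, %{state | ready?: true}}
  end

  @impl true
  def handle_call(:ready?, _from, state), do: {:reply, state.ready?, state}
end
```

One thing to keep in mind: calls sent during the sleep are queued, and GenServer.call has a default timeout of 5 seconds. With long delays, callers will exit with a timeout unless they pass a longer one or handle the failure, which may or may not be what you want from the simulation.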