DynamicSupervisor: Send message to itself after startup

Hey,

I have a module-based DynamicSupervisor:

defmodule MyApp.ExampleSupervisor do
  use DynamicSupervisor

  def start_link(name) do
    IO.inspect name
    DynamicSupervisor.start_link(__MODULE__, name, name: String.to_atom("sup_example_#{name}"))
  end

  def init(name) do
    IO.inspect self()
    DynamicSupervisor.init(strategy: :one_for_one, extra_arguments: [name])
  end
end

I would like it to start children on its own, so my idea was to call Process.send_after/3 in the init callback and handle the message in a handle_info callback. But that doesn’t work: the supervisor process runs the DynamicSupervisor module’s own callbacks, not my module’s, so my handle_info is never called.


Maybe you can use send(self(), :start_children) in init (instead of send_after to avoid race conditions), and call DynamicSupervisor.start_child/2 in handle_info?

  1. I would need to use send_after so the init call gets executed first.
  2. The whole problem is that my handle_info won’t get called, because the process itself uses the raw DynamicSupervisor module instead of my own module.

Your message won’t be handled until init finishes anyway. A message arriving as you are processing something will always have to wait until you loop around and process the next message. All processes are single-threaded, so you will never suddenly process a message while something else is going on. send() is completely fine to use.

On top of that, your process doesn’t start processing messages until after init finishes.
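To illustrate the ordering, here is a minimal sketch using a plain GenServer. InitOrderDemo and its events/1 helper are made-up names for this example. It deliberately does not use DynamicSupervisor, since (as noted above) a module-based DynamicSupervisor would not dispatch the message to your own handle_info.

```elixir
defmodule InitOrderDemo do
  use GenServer

  def start_link(_arg), do: GenServer.start_link(__MODULE__, [])

  # Returns the recorded events, oldest first.
  def events(pid), do: GenServer.call(pid, :events)

  @impl true
  def init([]) do
    # The message lands in our own mailbox right away, but nothing reads
    # the mailbox until init/1 has returned.
    send(self(), :start_children)
    {:ok, [:init_done]}
  end

  @impl true
  def handle_info(:start_children, events) do
    {:noreply, [:start_children_handled | events]}
  end

  @impl true
  def handle_call(:events, _from, events) do
    {:reply, Enum.reverse(events), events}
  end
end

{:ok, pid} = InitOrderDemo.start_link([])
IO.inspect(InitOrderDemo.events(pid))
# prints [:init_done, :start_children_handled]
```

The message sent from init is already in the mailbox before start_link returns, so it is handled ahead of the later events/1 call. No send_after delay is needed.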


Ah, completely forgot about it.

Maybe I am approaching my problem the wrong way. My use case is recovering from supervisor restarts: when a supervisor starts, it should look somewhere else (a GenServer, ETS, whatever) for information about children that need to be started, and then start them on its own.

There are a few considerations here:

  1. A supervisor crashing would generally be the result of multiple crashes of its children. Restoring those children afterwards is not generally advisable unless you have good reason to replay potentially crappy state. If you save too much state for these processes and that state is what actually killed the supervisor in the first place, you’re only creating a cascade of crashes that will ultimately kill the entire supervision tree, as the supervisors of the supervisors start crashing fast enough.

  2. It can be a good idea not to put logic like this in your supervisors, but instead to have a managing process (I prefer to have X.Supervisor and X.Manager for whatever X is) that deals with the logic surrounding spawning, killing, and otherwise managing whatever the thing is. In this case that would entail the supervisor sending an asynchronous message to a manager when it starts, and that manager then starting the children.

Generally I’m wary of potentially replaying poisoned state, so much so that I’ve always felt that it wasn’t worth it.

Here is a sketch of what you could do, though:

Sandbox.SomeChild:

defmodule Sandbox.SomeChild do
  use GenServer, restart: :transient

  def start_link(name), do: GenServer.start_link(__MODULE__, [name])

  def name(pid), do: GenServer.call(pid, :name)

  def init([name]), do: {:ok, name}

  def handle_call(:name, _from, name), do: {:reply, name, name}
end

Sandbox.SomeChild.Supervisor:

defmodule Sandbox.SomeChild.Supervisor do
  require Logger
  use DynamicSupervisor

  def start_link([]), do: DynamicSupervisor.start_link(__MODULE__, [], name: __MODULE__)

  def start_child(supervisor_pid, name) do
    spec = {Sandbox.SomeChild, name}
    DynamicSupervisor.start_child(supervisor_pid, spec)
  end

  def init([]) do
    Sandbox.SomeChild.Manager.start_children(self())
    DynamicSupervisor.init(strategy: :one_for_one)
  end
end

Sandbox.SomeChild.Manager:

defmodule Sandbox.SomeChild.Manager do
  use GenServer

  def start_link([]), do: GenServer.start_link(__MODULE__, [], name: __MODULE__)

  def names(pid \\ __MODULE__), do: GenServer.call(pid, :names)

  def add_name(name, pid \\ __MODULE__), do: GenServer.cast(pid, {:add_name, name})

  def start_children(supervisor_pid, pid \\ __MODULE__) do
    GenServer.cast(pid, {:start_children, supervisor_pid})
  end

  def init([]), do: {:ok, []}

  def handle_call(:names, _from, names) do
    {:reply, names, names}
  end

  def handle_cast({:add_name, name}, names) do
    {:noreply, [name | names]}
  end

  def handle_cast({:start_children, supervisor_pid}, names) do
    Enum.each(names, &Sandbox.SomeChild.Supervisor.start_child(supervisor_pid, &1))
    {:noreply, names}
  end
end

A first session, with one stored name:

iex(1)> Sandbox.SomeChild.Manager.start_link([])
{:ok, #PID<0.163.0>}
iex(2)> Sandbox.SomeChild.Manager.add_name("hej")
:ok
iex(3)> Sandbox.SomeChild.Supervisor.start_link([])
{:ok, #PID<0.166.0>}
iex(4)> [{_, pid, _, _}] = DynamicSupervisor.which_children(Sandbox.SomeChild.Supervisor)
[{:undefined, #PID<0.168.0>, :worker, [Sandbox.SomeChild]}]
iex(5)> Sandbox.SomeChild.name(pid)
"hej"
iex(6)> Sandbox.SomeChild.Manager.names()
["hej"]

A second, fresh session, with two stored names:

iex(1)> Sandbox.SomeChild.Manager.start_link([])
{:ok, #PID<0.154.0>}
iex(2)> Sandbox.SomeChild.Manager.add_name("hej")
:ok
iex(3)> Sandbox.SomeChild.Manager.add_name("hopp")
:ok
iex(4)> Sandbox.SomeChild.Supervisor.start_link([])
{:ok, #PID<0.158.0>}
iex(5)> [{_, child1, _, _}, {_, child2, _, _}] = DynamicSupervisor.which_children(Sandbox.SomeChild.Supervisor)
[
  {:undefined, #PID<0.160.0>, :worker, [Sandbox.SomeChild]},
  {:undefined, #PID<0.161.0>, :worker, [Sandbox.SomeChild]}
]
iex(6)> Sandbox.SomeChild.Manager.names()
["hopp", "hej"]
iex(7)> [child1, child2] |> Enum.map(&Sandbox.SomeChild.name/1)
["hopp", "hej"]

Note that if there is anything fatally wrong with the state stored in the manager, you have essentially set up a guarantee that everything will crash almost instantly, as long as that poisoned state is used early enough in the started children’s lifetime. In the worst case, the poisoned state doesn’t kill the children fast enough: you just have bombs lying in wait, but they don’t kill anything fast enough to bring the system down, so you can’t rely on a bad system being shut down for safety.


Thanks for the very detailed answer! You may be right that I should trust the OTP mechanics. I’m going to play with your suggestion later and report back on what worked for me :)


Late to the party, but having just faced this problem, what worked for me was to place both the DynamicSupervisor and an initialization GenServer in a supervision tree with a rest_for_one strategy, with the DynamicSupervisor listed first. This way, whenever the DynamicSupervisor starts or is restarted following a crash, the initialization routine runs and populates the DynamicSupervisor with the initial set of children.
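A rough sketch of that layout, in case it helps anyone. All the RestForOneDemo.* module names are made up for this example, and the list of names is passed in directly rather than loaded from wherever you actually keep it.

```elixir
defmodule RestForOneDemo.Worker do
  use GenServer, restart: :transient

  def start_link(name), do: GenServer.start_link(__MODULE__, name)

  @impl true
  def init(name), do: {:ok, name}
end

defmodule RestForOneDemo.Initializer do
  use GenServer

  def start_link(names), do: GenServer.start_link(__MODULE__, names, name: __MODULE__)

  @impl true
  def init(names) do
    # The DynamicSupervisor is listed before this process in the tree, so it
    # is already running and registered by the time init/1 executes.
    Enum.each(names, fn name ->
      spec = {RestForOneDemo.Worker, name}
      {:ok, _pid} = DynamicSupervisor.start_child(RestForOneDemo.Children, spec)
    end)

    {:ok, names}
  end
end

defmodule RestForOneDemo.Tree do
  use Supervisor

  def start_link(names), do: Supervisor.start_link(__MODULE__, names, name: __MODULE__)

  @impl true
  def init(names) do
    children = [
      # Order matters: with :rest_for_one, a crash of the DynamicSupervisor
      # also restarts everything listed after it, i.e. the Initializer.
      {DynamicSupervisor, name: RestForOneDemo.Children, strategy: :one_for_one},
      {RestForOneDemo.Initializer, names}
    ]

    Supervisor.init(children, strategy: :rest_for_one)
  end
end

{:ok, _sup} = RestForOneDemo.Tree.start_link(["hej", "hopp"])
IO.inspect(DynamicSupervisor.count_children(RestForOneDemo.Children))
```

The same warning from earlier in the thread applies: if one of the stored names reliably crashes its worker, the Initializer replays it on every restart. Here a failed start_child also crashes the Initializer itself, which is a design choice you may want to soften.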
