Supervisor to DynamicSupervisor migration

Hi buddies,

in my project I was using Supervisor to start and terminate+delete children dynamically. Elixir complained I should be using a DynamicSupervisor:

warning: Supervisor.terminate_child/2 with a PID is deprecated, please use DynamicSupervisor instead
  (elixir 1.16.1) lib/supervisor.ex:1030: Supervisor.terminate_child/2

So I switched to that. Unfortunately the doc for DynamicSupervisor points out that the id of the child_spec, while required, is ignored:

Note that while the :id field is still required in the spec, the value is ignored and therefore does not need to be unique.

The effect of this is the following: with Supervisor.which_children I was able to see the child spec id of the child along with its pid:

iex(53)> Supervisor.which_children(MySup)
[
  {"foo_child-0012041214", #PID<0.509.0>, :worker, [ChildModule]}
]

Instead, with DynamicSupervisor, which_children api returns :undefined as the first element of the tuple:

iex(31)> DynamicSupervisor.which_children(MySup)
[{:undefined, #PID<0.270.0>, :worker, [ChildModule]}]

The problem is that I need to command the termination of children via external APIs, which don’t of course have notion of pids :slight_smile: .

With Supervisor I could directly read the child spec id. Do I now have to explicitly instrument the Registry boilerplate? is there some idiomatic shortcut to obtain a similar result?

Thanks

Bit rusty on this as I have been on phoenix, by dynamic supervisors have a changing pool of workers, so having names for them might be a bit tricky without some sort of registry that can be used to identify the child by composite ID (usually the child module plus name an identifying index).

Should you require more clarity I can quickly refer back to appropriate material.

I am not sure what’s the problem here? I have written a worker + a DynamicSupervisor implementation that successfully commands children to stop, by user-given name.

The trick was to use a Registry for name registration. Oh, and you can’t rely on Process.whereis and DynamicSupervisor.which_children for exact lookups, you have to use GenServer.whereis.

Here’s what I have in files lying around, and I just tested it. Obviously change YYY to your app namespace.

The worker:

# one_worker.ex

defmodule YYY.OneWorker do
  use GenServer, restart: :transient

  # Only restart if it exits abnormally; otherwise we'll be stopping these manually.
  def start_link(request_id) do
    GenServer.start_link(__MODULE__, request_id, name: via_tuple(request_id))
  end

  def child_spec(request_id) do
    %{
      id: __MODULE__,
      start: {__MODULE__, :start_link, [request_id]},
      restart: :transient
    }
  end

  def stop(request_id, stop_reason \\ :normal) do
    # Given the :transient option in the child spec, the GenServer will restart
    # if any reason other than `:normal` is given.
    request_id |> via_tuple() |> GenServer.stop(stop_reason)
  end

  def ensure_started(request_id) do
    case YYY.OneDynamicSupervisor.start_child(request_id) do
      {:ok, _pid} -> :ok
      {:error, {:already_started, _pid}} -> :ok
      other -> raise other
    end
  end

  def ping(request_id, text) do
    ensure_started(request_id)
    request_id |> via_tuple() |> GenServer.call({:ping, text})
  end

  @impl GenServer
  def init(initial_state) do
    IO.puts("initial_state=#{initial_state}")
    {:ok, initial_state}
  end

  @impl GenServer
  def handle_call({:ping, text}, _from, state) do
    response = "#{inspect(state)}: #{text}"
    IO.puts(response)
    {:reply, response, state}
  end

  defp via_tuple(request_id) do
    {:via, Registry, {:one_registry, request_id}}
  end
end

The dynamic supervisor:

# one_dynamic_supervisor.ex

defmodule YYY.OneDynamicSupervisor do
  use DynamicSupervisor

  def start_link(init_arg) do
    DynamicSupervisor.start_link(__MODULE__, init_arg, name: __MODULE__)
  end

  def start_child(request_id) do
    # Shorthand to retrieve the child specification from the `child_spec/1` method of the given module.
    child_spec = {YYY.OneWorker, request_id}

    DynamicSupervisor.start_child(__MODULE__, child_spec)
  end

  @impl DynamicSupervisor
  def init(_init_arg) do
    DynamicSupervisor.init(strategy: :one_for_one)
  end
end

Also make sure to add those children to your Application.start return value (i.e. the children of your app):

  • YYY.OneDynamicSupervisor
  • {Registry, [keys: :unique, name: :one_registry]}

You can see that I have included a function to stop a child (YYY.OneWorker.stop/1). It uses a :via tuple to locate and stop it. I have not included a function to locate a child’s PID but it’s as simple as this:

def whereis(request_id) do
  {:via, Registry, {:one_registry, request_id}}
  |> GenServer.whereis()
end

That should solve your problem.


Caveats and remarks:

  • The YYY. namespace should be changed to your app’s;
  • The OneWorker and OneDynamicSupervisor names are placeholders, IMO change them before doing a GIT commit in your repo;
  • The :one_registry name of the registry should be changed;
  • Remove the IO.puts calls, I’ve put them just for demonstration;
  • request_id is simply your user-supplied ID / name, feel free to change it to anything else (and it can be anything else besides an integer as well).
4 Likes

Thank you, @dimitarvp .

I think my implementation was originally similar to yours. The problem I had with that (and possibly the reason why I posted the question), is that invoking YYY.OneWorker.stop, would in facts stop the worker, but than this would be restarted right after…

So I came to the conclusion that I should have passed through the DynamicSupervisor terminate_child API to stop the genserver. This seemed to work, provided that I had a way to find the “pid” to give as argument to the terminate_child.
Your approach is to instead directly stop the child process and use the option

restart: :transient

I hadn’t found this option in the DynamicSupervisor doc so I completely missed it. Using this option solves the restart problem.

At this point I can also avoid the usage of Registry at all and just use the {:global, term()} to later on stop the process.

Do you know why the :transient option has to be provided in either use Genserver declaration and in the child spec?

Transient is a restart strategy that restarts the process only when the process crashes abnormally:

:transient - the child process is restarted only if it terminates abnormally, i.e., with an exit reason other than :normal, :shutdown, or {:shutdown, term}.

3 Likes

Because I am paranoid. Technically it should work if it’s only in either place. But don’t quote me on that, I say just try it.

1 Like

The default for the GenServer is defined in use. The supervisor might override it.

1 Like