Why are module-based Supervisors recommended (outside of the tree root)?

chocolatedonut · November 1, 2023, 2:23pm

Can you please provide an example of how and when a module-based supervisor is preferred (over starting a Supervisor via start_link/2)?

It isn’t clear to me what “automatically” vs. “manually” mean in the docs quoted below.

The difference between the two approaches is that a module-based supervisor gives you more direct control over how the supervisor is initialized. Instead of calling Supervisor.start_link/2 with a list of child specifications that are automatically initialized, we manually initialize the children by calling Supervisor.init/2 inside its c:init/1 callback.

I’ll use the Blackjack example, from @sasajuric’s To spawn or not to spawn post, where this call of Supervisor.start_link/2 doesn’t (seem to) follow the above recommendation.

For example, before this Supervisor.start_link/2 call I can still fetch children’s initial arguments from an external service – just as I can do in Supervisor.c:init/1. What is “automatic” vs. “manual” then referring to (as I presume my example isn’t it)?

A bit further down, the docs also say:

A general guideline is to use the supervisor without a callback module only at the top of your supervision tree, generally in the Application.start/2 callback. We recommend using module-based supervisors for any other supervisor in your application, so they can run as a child of another supervisor in the tree.

As shown in the Blackjack example, it seems we can still “run [a Supervisor] as a child of another supervisor in the tree”. What OTP benefits are we missing out on then?

Or has OTP changed since then? Sasa’s article was written at Elixir v1.4.2, and the docs on module-based Supervisors at v1.4.2 don’t have the above-quoted recommendations.

josevalim · November 1, 2023, 3:31pm

I think those are unclear words from the docs. Perhaps “explicitly” vs “implicitly” is better. Both have to call Supervisor.init, but Supervisor.start_link does it for you implicitly, the other requires you to do so explicitly.

sasajuric · November 1, 2023, 3:36pm

I didn’t write those docs, but here’s my guess:

When you invoke Supervisor.start_link(children, opts), a process is started, and that process will start the children.
When you invoke Supervisor.start_link(module, arg, opts), a process is started, and that process will invoke your init callback. That callback needs to return the list of children (using Supervisor.init/2).
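To make the two invocations concrete, here is a minimal sketch of both side by side. The `Demo.Counter` and `Demo.ExplicitSup` modules are hypothetical names made up for this illustration; only the `Supervisor` and `Agent` calls come from the standard library.

```elixir
# Hypothetical worker, present only so the supervisors have something to start.
defmodule Demo.Counter do
  use Agent

  def start_link(initial), do: Agent.start_link(fn -> initial end)
end

# Module-based ("explicit"): the supervisor process invokes init/1,
# and init/1 calls Supervisor.init/2 itself.
defmodule Demo.ExplicitSup do
  use Supervisor

  def start_link(arg), do: Supervisor.start_link(__MODULE__, arg)

  @impl true
  def init(initial) do
    children = [{Demo.Counter, initial}]
    Supervisor.init(children, strategy: :one_for_one)
  end
end

# Module-less ("implicit"): the child list is handed over up front and
# Supervisor.start_link/2 takes care of the init step.
{:ok, implicit} = Supervisor.start_link([{Demo.Counter, 0}], strategy: :one_for_one)

# Module-based: the same child, but produced inside init/1.
{:ok, explicit} = Demo.ExplicitSup.start_link(0)
```

In both cases the result is a supervisor with one running `Demo.Counter`; the difference is only where the child list is produced.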

Hope this sheds some light, though I agree that manual vs auto is somewhat confusing.

When it comes to differences between these two approaches, I regard the former as being simpler, while the latter (with the callback module) is more flexible.

The reason is that when you have the callback module, the decision making is deferred to the latest moment in time. You generate the child specs just before the processes are about to be started. This can be useful if the restarted supervisor might need to start a different set of children.

You can’t achieve such flexibility with the basic approach. Because, even if the supervisor process is restarted, the spec has already been given, so it will always start the same set of children. You’d need to restart the parent of that supervisor to make that work.
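A rough sketch of that difference, under a few assumptions: the `:demo` application environment key and the `Demo.PoolSup` and `:frozen_sup` names are invented for this example, and the restart is simulated with `Supervisor.terminate_child/2` followed by `Supervisor.restart_child/2`, which reuses the child spec stored in the parent just as an automatic restart would.

```elixir
defmodule Demo.PoolSup do
  use Supervisor

  def start_link(arg), do: Supervisor.start_link(__MODULE__, arg, name: __MODULE__)

  @impl true
  def init(_arg) do
    # Evaluated every time the supervisor process (re)starts.
    size = Application.get_env(:demo, :pool_size, 1)

    children =
      for i <- 1..size do
        Supervisor.child_spec({Agent, fn -> i end}, id: {:agent, i})
      end

    Supervisor.init(children, strategy: :one_for_one)
  end
end

Application.put_env(:demo, :pool_size, 1)

# Module-less variant: the child list is computed once, here, and then
# frozen inside the start MFA of the hand-written child spec.
frozen_children =
  for i <- 1..Application.get_env(:demo, :pool_size, 1) do
    Supervisor.child_spec({Agent, fn -> i end}, id: {:agent, i})
  end

frozen_spec = %{
  id: :frozen_sup,
  start:
    {Supervisor, :start_link,
     [frozen_children, [strategy: :one_for_one, name: :frozen_sup]]},
  type: :supervisor
}

{:ok, parent} = Supervisor.start_link([Demo.PoolSup, frozen_spec], strategy: :one_for_one)

# The configuration changes while the system is running...
Application.put_env(:demo, :pool_size, 3)

# ...and both child supervisors are restarted by the parent.
for id <- [Demo.PoolSup, :frozen_sup] do
  :ok = Supervisor.terminate_child(parent, id)
  {:ok, _pid} = Supervisor.restart_child(parent, id)
end

pool_after = Supervisor.count_children(Demo.PoolSup).active
frozen_after = Supervisor.count_children(:frozen_sup).active
```

Here `Demo.PoolSup` picks up the new pool size because its `init/1` runs again, while `:frozen_sup` comes back with the child list that was baked into its spec.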

For the same reason, I also think (but I’m not sure) that a callback-based supervisor will work better with live code upgrades.

I typically don’t care about these benefits, so I use the simpler approach for all my supervisors. The same approach is taken by Elixir in Action. TBH, I can’t recall the last time I wrote a callback-based supervisor.

josevalim · November 1, 2023, 3:41pm

I agree with you. Unfortunately, outside of the root, the only way to rely on start_link/2 is by providing custom child specs (with the confusing start MFA). So I have found that telling people to use a module-based approach (with its default child spec) is better than hand-rolling a child spec. Thoughts?
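For comparison, a small sketch of the two options for a non-root supervisor. `Demo.TablesSup` and `:inline_sup` are hypothetical names; the empty child lists just keep the example short.

```elixir
# Option 1: module-based. `use Supervisor` injects a default child_spec/1,
# so the module name alone can be listed as a child.
defmodule Demo.TablesSup do
  use Supervisor

  def start_link(arg), do: Supervisor.start_link(__MODULE__, arg)

  @impl true
  def init(_arg), do: Supervisor.init([], strategy: :one_for_one)
end

default_spec = Demo.TablesSup.child_spec([])

# Option 2: module-less. The child spec, including the start MFA that
# points at Supervisor.start_link/2, has to be written by hand.
inline_spec = %{
  id: :inline_sup,
  start: {Supervisor, :start_link, [[], [strategy: :one_for_one]]},
  type: :supervisor
}

# Both can run as children of another supervisor.
{:ok, parent} = Supervisor.start_link([Demo.TablesSup, inline_spec], strategy: :one_for_one)
```

The injected `default_spec` carries the same kind of information as the hand-written map (an id, a start MFA, and `type: :supervisor`); the module-based form just generates it for you.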

sasajuric · November 1, 2023, 3:58pm

FWIW, the EiA examples use a dedicated module for each supervisor, it’s just not based on the callback approach, so you do something like:

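defmodule MySupervisor do
  # `children` and `opts` are placeholders for the child list and supervisor options.
  def start_link, do: Supervisor.start_link(children, opts)

  # Takes one (ignored) argument so a parent can list MySupervisor as a child.
  def child_spec(_arg),
    do: %{id: __MODULE__, start: {__MODULE__, :start_link, []}, type: :supervisor}
end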

But I agree this is meh, so in practice I tend to write an ad-hoc helper so I can specify something like supervisor(children, opts) as a child. Perhaps it would be nice if we could get something like that out of the box?
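One way such an ad-hoc helper could look is sketched below. This is a guess, not the helper from the post and not part of Elixir: `Demo.SupHelper.supervisor/2` and its `:id` option are assumptions made for the illustration.

```elixir
defmodule Demo.SupHelper do
  # Hypothetical helper: builds a child spec for a module-less supervisor.
  # The :id option is consumed here; everything else in `opts` is passed
  # through to Supervisor.start_link/2.
  def supervisor(children, opts) do
    {id, opts} = Keyword.pop(opts, :id, Supervisor)

    %{
      id: id,
      start: {Supervisor, :start_link, [children, opts]},
      type: :supervisor
    }
  end
end

children = [
  Demo.SupHelper.supervisor(
    [{Agent, fn -> 0 end}],
    id: :inner,
    name: :inner_sup,
    strategy: :one_for_one
  )
]

{:ok, root} = Supervisor.start_link(children, strategy: :one_for_one)
```

The nested supervisor is started as a child of `root` without a dedicated module or a hand-written spec at the call site.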

But even with that, sometimes a dedicated supervisor module is useful (if you want to support other functions, such as adding a child). So it would also be nice if I could do something like use Supervisor, callback?: false which would inject the proper spec, without the behaviour part.

josevalim · November 1, 2023, 5:45pm

But then, if it is a choice between writing a custom child_spec function vs a custom init function, should we aim for the second, as it at least has other use cases?

sasajuric · November 1, 2023, 6:33pm

I chose the former approach for EiA (and in general), because I find it easier to explain. Start the supervisor process with these children. And this other function specifies how a process powered by this module can be started.

But as I said, most often these supervisors don’t have logic, so it would be nice if we could start them without needing a separate module or a child spec. From what I can tell, there’s no Supervisor.child_spec/1, so it seems that something like {Supervisor, id: id, children: ..., strategy: ...} could be supported. WDYT?

I find passing an arg to init which then returns the list of children (usually ignoring the arg) more convoluted. I understand it’s more flexible, but as I said, I can’t recall the last time I needed that.

christhekeele · November 1, 2023, 7:16pm

José and Saša do a good job of highlighting the OTP runtime differences above, being able to do things like dynamic child generation upon init.

I would also point out the general abstraction benefits of using module-backed OTP things, and not just supervisors. start_link/1 is the general (default child_spec) interface for putting anything into a supervision tree. A module implementing that alone can let you change the backing OTP abstraction at will as your project evolves.

This is especially powerful if you keep all your interface functions in the implementing module, as well. For example, if you avoid scattering Supervisor.child_spec, Supervisor.start_link, and Supervisor.count_children throughout your codebase, but call out to them from the module’s interface alongside its callbacks, you can swap out Supervisor for any other OTP abstraction while only modifying a single file.
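As a minimal sketch of that idea, the hypothetical `Demo.Rooms` module below keeps its callbacks and its public interface in one place. The module name and the `start_room/1` and `room_count/0` functions are invented for the example; callers never mention `DynamicSupervisor` directly.

```elixir
defmodule Demo.Rooms do
  use DynamicSupervisor

  # Entry point used by the default child_spec/1 injected by `use DynamicSupervisor`.
  def start_link(arg), do: DynamicSupervisor.start_link(__MODULE__, arg, name: __MODULE__)

  # Interface functions: the only place that knows which OTP abstraction backs this module.
  def start_room(initial_state) do
    DynamicSupervisor.start_child(__MODULE__, {Agent, fn -> initial_state end})
  end

  def room_count, do: DynamicSupervisor.count_children(__MODULE__).active

  @impl true
  def init(_arg), do: DynamicSupervisor.init(strategy: :one_for_one)
end

# Callers depend only on Demo.Rooms.
{:ok, _root} = Supervisor.start_link([Demo.Rooms], strategy: :one_for_one)
{:ok, _room} = Demo.Rooms.start_room(%{players: []})
count = Demo.Rooms.room_count()
```

If the backing abstraction later changes, only the bodies inside `Demo.Rooms` need to change, as long as `start_link/1`, `start_room/1`, and `room_count/0` keep their contracts.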

Some recent examples where I have done this: converting an Agent to a GenServer, converting a Task.Supervisor to a DynamicSupervisor, and changing a Registry to a Phoenix.Tracker.