Parent - custom parenting of processes

Hi folks,

I’ve written a small experimental library motivated by a couple of scenarios I’ve encountered in production. The library is tentatively called parent, and it aims to help with scenarios where a GenServer is used to directly parent children.

At this moment this is an exploration, so it’s not published on hex. I’ve written in more detail about the library, motivation, and example scenarios, and I’m interested in any sort of feedback. See the GH repo for more details.

Looking forward to hearing your thoughts!

I’d definitely appreciate a library like this to simplify code that manages child processes.

When should I put some Supervisor-like capabilities in a GenServer, vs. creating a custom Supervisor, e.g. ConsumerSupervisor?

That’s a very good question!

I think that putting supervisor-like capabilities in a GenServer is essentially not much different from creating a custom supervisor. They both yield a module which is a GenServer (or GenStage, gen_statem, …) and acts as a parent of its children.

However, the former is a hardcoded solution, so I think it’s more appropriate for unique scenarios, whereas the latter would be a better option when you need it in multiple places.

So taking the examples from the project readme, if you wanted to provide a generic abstraction for periodic execution, you might make it a supervisor, so you could start it like Periodic.start_link(child_spec, opts), where Periodic is a supervisor, child_spec is the spec of a job, and opts is a data-driven interface to control periodic execution.

It’s worth noting that the main thing that makes a supervisor special is the type: :supervisor in the child spec. As far as I know, this field is only used by the release handler when doing code reloading. When the release handler wants to determine the process hierarchy, it starts with the top process, and recursively goes deeper for any process which is marked as a supervisor (with type: :supervisor). For every supervisor, the release handler will ask it for the list of its children.
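
To make the type: :supervisor point concrete, here is a minimal sketch of a GenServer that declares itself as a supervisor in its child spec. CustomParent is a hypothetical module name used only for illustration, and the sketch deliberately stops at the child spec: it does not answer any supervisor-specific requests.

```elixir
defmodule CustomParent do
  # Hypothetical GenServer that parents its own children.
  use GenServer

  # Override the default child spec generated by `use GenServer`, so that
  # anything walking the supervision tree sees this process as a supervisor.
  def child_spec(arg) do
    %{
      id: __MODULE__,
      start: {__MODULE__, :start_link, [arg]},
      type: :supervisor,
      # Give this process unlimited time to stop its own children.
      shutdown: :infinity
    }
  end

  def start_link(arg), do: GenServer.start_link(__MODULE__, arg)

  @impl GenServer
  def init(arg), do: {:ok, arg}
end
```

When such a module is started under a regular supervisor, Supervisor.which_children/1 on that supervisor reports the child with the :supervisor type.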

Therefore, to make the custom supervisor work with code reloading, the module needs to handle Supervisor (or more precisely Erlang’s :supervisor) specific messages, such as :which_children, and return the result in the same shape. This is somewhat hacky, so I advise caution with going there. I think that in many cases a custom supervisor can also be implemented with two standard supervisors and a GenServer (or any other desired behaviour).
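
The "two standard supervisors and a GenServer" arrangement could look roughly like the sketch below. All module names (JobSystem, JobSystem.Jobs, JobSystem.Manager) and the start_job/1 function are made up for this example. The GenServer holds the logic, while the children live under a plain DynamicSupervisor.

```elixir
defmodule JobSystem do
  # Hypothetical top-level supervisor: the first standard supervisor.
  use Supervisor

  def start_link(opts), do: Supervisor.start_link(__MODULE__, opts, name: __MODULE__)

  @impl Supervisor
  def init(_opts) do
    children = [
      # The second standard supervisor, which parents the jobs.
      {DynamicSupervisor, name: JobSystem.Jobs, strategy: :one_for_one},
      # The GenServer which decides when jobs are started.
      JobSystem.Manager
    ]

    # With :rest_for_one, a restart of the job supervisor also restarts
    # the manager, so the manager never holds stale job pids.
    Supervisor.init(children, strategy: :rest_for_one)
  end
end

defmodule JobSystem.Manager do
  use GenServer

  def start_link(_arg), do: GenServer.start_link(__MODULE__, nil, name: __MODULE__)

  def start_job(fun) when is_function(fun, 0),
    do: GenServer.call(__MODULE__, {:start_job, fun})

  @impl GenServer
  def init(nil), do: {:ok, %{}}

  @impl GenServer
  def handle_call({:start_job, fun}, _from, jobs) do
    spec = %{id: :job, start: {Task, :start_link, [fun]}, restart: :temporary}
    {:ok, pid} = DynamicSupervisor.start_child(JobSystem.Jobs, spec)
    # Monitor the job so the manager can react to its termination.
    ref = Process.monitor(pid)
    {:reply, {:ok, pid}, Map.put(jobs, ref, pid)}
  end

  @impl GenServer
  def handle_info({:DOWN, ref, :process, _pid, _reason}, jobs) do
    {:noreply, Map.delete(jobs, ref)}
  end
end
```

Since every parent here is a regular supervisor, the tree works with standard tooling, at the cost of the manager only monitoring the jobs rather than being their direct parent.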

I love it! I’ve also encountered situations where I’d write code like this by hand, and now I realise it was less robust than I thought (e.g. I didn’t realise that killing a process does not take down any children it start_linked automatically), so it’s good to have a library out for that. As you said, this is one of those “know it when you need it” types of libraries, rather than something which should be a default for supervising things.

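It depends on how you kill it. A normal exit signal will not kill linked processes, and if a linked process traps exits, the outcome depends on how it handles the exit message. Sending a kill signal always kills the target process. Its linked processes then receive an exit signal with the reason :killed, which takes them down unless they trap exits.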
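
If you want to check these cases yourself, something like the following sketch should do. ExitDemo and child_alive_after?/2 are names invented for this example. It starts a parent with one linked child, takes the parent down either normally or with a kill signal, and reports whether the child is still alive shortly afterwards.

```elixir
defmodule ExitDemo do
  # `how` is :normal or :kill, `trap_exits?` controls whether the child
  # traps exits. Returns true if the child outlives the parent.
  def child_alive_after?(how, trap_exits?) do
    caller = self()

    parent =
      spawn(fn ->
        spawn_link(fn ->
          Process.flag(:trap_exit, trap_exits?)
          send(caller, {:child, self()})
          Process.sleep(:infinity)
        end)

        # The parent returns (a normal exit) once it is told to stop.
        receive do
          :stop_normally -> :ok
        end
      end)

    child =
      receive do
        {:child, pid} -> pid
      end

    parent_ref = Process.monitor(parent)
    child_ref = Process.monitor(child)

    case how do
      :normal -> send(parent, :stop_normally)
      :kill -> Process.exit(parent, :kill)
    end

    receive do
      {:DOWN, ^parent_ref, :process, _pid, _reason} -> :ok
    end

    # Give the exit signal a moment to reach the child.
    alive? =
      receive do
        {:DOWN, ^child_ref, :process, _pid, _reason} -> false
      after
        200 -> true
      end

    # Clean up a surviving child so the demo leaves nothing behind.
    if alive?, do: Process.exit(child, :kill)
    alive?
  end
end
```

The only combination where the child goes down with the parent is ExitDemo.child_alive_after?(:kill, false). With a normal exit, or with a child that traps exits, the child stays alive.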

When you take down the parent with a non-normal reason, a linked child will usually be stopped too. However, there are some exceptions, as pointed out by @cmkarlsson. In addition, there is a slight ordering problem. If the parent is not explicitly taking down its children, a child might linger on for a bit longer before it’s taken down.

So it’s not completely guaranteed that when the parent stops all of its descendants are already down. This can lead to some strange race conditions, which are admittedly not very likely to happen, but are still possible.

IMO, a good approach to building an OTP supervision tree would be as follows:

  1. Every parent is a supervisor.
  2. A child which is a supervisor has the :shutdown option set to :infinity (this is the default for supervisors).
  3. A supervisor process (i.e. a parent) is only taken down through its own parent.

Such an approach guarantees that a parent process terminates only after all of its descendants are down. I believe that this is a clean approach which completely eliminates some possible race conditions.

When you’re manually parenting children, you can ensure the same in the terminate/2 callback, but it will require some work, and you need to remember to do it. I’ve just browsed through some of our code, and noticed that explicit children termination is not done, probably because I forgot to implement it when I wrote the original code :frowning_face:
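
For illustration, explicit child termination in a hand-written parent might look like the sketch below. ManualParent is a hypothetical module, the children are plain linked processes started from zero-arity functions, and the 5 second shutdown timeout is an arbitrary choice.

```elixir
defmodule ManualParent do
  use GenServer

  def start_link(child_funs), do: GenServer.start_link(__MODULE__, child_funs)
  def children(server), do: GenServer.call(server, :children)

  @impl GenServer
  def init(child_funs) do
    # Without trapping exits, terminate/2 is not invoked when the
    # parent's own parent shuts this process down.
    Process.flag(:trap_exit, true)
    {:ok, Enum.map(child_funs, &spawn_link/1)}
  end

  @impl GenServer
  def handle_call(:children, _from, pids), do: {:reply, pids, pids}

  @impl GenServer
  def handle_info({:EXIT, pid, _reason}, pids), do: {:noreply, List.delete(pids, pid)}

  @impl GenServer
  def terminate(_reason, pids) do
    # Stop children in reverse start order and wait for each one,
    # so this process exits only after all of its children are down.
    pids |> Enum.reverse() |> Enum.each(&stop_child/1)
  end

  defp stop_child(pid) do
    ref = Process.monitor(pid)
    Process.exit(pid, :shutdown)

    receive do
      {:DOWN, ^ref, :process, ^pid, _reason} -> :ok
    after
      5_000 ->
        # The child ignored the polite request, so kill it and wait again.
        Process.exit(pid, :kill)

        receive do
          {:DOWN, ^ref, :process, ^pid, _reason} -> :ok
        end
    end
  end
end
```

Waiting for the :DOWN message of every child is the part that is easy to forget, and it is what removes the ordering problem described above.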

I added some docs, and pushed the library to hex:

The library also includes a lightweight scheduler for periodic jobs, which provides finer-grained control with respect to OTP supervision trees and requires no app env based configuration.

We’ve recently started using the library in our project. It’s still early days, but so far it looks good.

Released version 0.7.0 with various improvements to the periodic job scheduler.

I have just spotted it, but is there any reason not to use Director?

This is the first time I’ve heard of this library, thanks for mentioning it! Obviously, the main reason why I didn’t use it is because I didn’t know about its existence at the time I wrote Parent :slight_smile:

Let me first briefly summarize Parent’s intention. It’s basically a GenServer-like behaviour where callback code can do regular GenServer stuff (handle calls, casts, infos), as well as start/stop children dynamically and react to their termination. The behaviour itself also takes over the supervisor roles, ensuring proper child termination, and presenting itself to the outer world as a supervisor (so any logic traversing the supervision tree would also traverse Parent’s children).

In other words, Parent is basically a fusion between Supervisor and GenServer. In theory you could reimplement Supervisor on top of Parent, though I’m not suggesting doing that.

Director seems to share some similar goals, but looking at the callback spec, the GenServer part is missing, so it seems that Director can only be controlled externally (from outer processes). If that’s indeed the case, it wouldn’t be a fit for any of the scenarios for which I wrote Parent (all of which are mentioned in the rationale doc). For example, I couldn’t write Periodic the way it is written now, because it is based on internal handling of send_after messages.
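
To show what "internal handling of send_after messages" means in general terms, here is a stripped-down sketch written with a plain GenServer. It is not how Periodic is actually implemented. TickingParent, along with its :every and :run options, is invented for this example, and it omits the child termination that a real parent would need.

```elixir
defmodule TickingParent do
  use GenServer

  # opts: `every` is the interval in milliseconds, `run` is a zero-arity
  # function executed in a separate linked process on each tick.
  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  @impl GenServer
  def init(opts) do
    # Trap exits so that a finished or crashed job arrives as a message.
    Process.flag(:trap_exit, true)
    state = %{every: Keyword.fetch!(opts, :every), run: Keyword.fetch!(opts, :run), job: nil}
    schedule_tick(state)
    {:ok, state}
  end

  @impl GenServer
  def handle_info(:tick, %{job: nil} = state) do
    # No job is running: start one as a direct child of this process.
    {:ok, pid} = Task.start_link(state.run)
    schedule_tick(state)
    {:noreply, %{state | job: pid}}
  end

  def handle_info(:tick, state) do
    # The previous job is still running, so this tick is skipped.
    schedule_tick(state)
    {:noreply, state}
  end

  def handle_info({:EXIT, pid, _reason}, %{job: pid} = state),
    do: {:noreply, %{state | job: nil}}

  def handle_info({:EXIT, _pid, _reason}, state), do: {:noreply, state}

  defp schedule_tick(state), do: Process.send_after(self(), :tick, state.every)
end
```

The point is that the timer messages and the child exits are handled by the same process, which is only possible when the parent can run its own callback code.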

Beyond that, at first glance Director seems packed with a bunch of other features, such as managing children of other processes, and a custom ETS or Mnesia based registry. Parent is deliberately designed with a small feature set to keep it easy to reason about. By saying that Parent is a GenServer-like behaviour which has some supervisor roles, we’ve basically explained the gist of the lib in terms of regular OTP parlance. I wouldn’t expect a seasoned OTP developer to have to dive into the code to understand what the behaviour does. Such a design keeps Parent simple, and at the same time very flexible, since you can implement arbitrary behaviour on top of Parent.

It’s worth repeating that this is the first time I’ve heard of Director, and I’m not familiar with how it works, so take my comments with a grain of salt :slight_smile:

Wrote a blog post which presents Periodic in more detail. Hope you’ll enjoy it!
