Spawning temporary supervisors

I have a problem supervising user-defined jobs.
Jobs consist of steps, and between steps I save the last known finished state.
After a node restart, I read that state from the DB and resume.

I started with a DynamicSupervisor and a Task for each user-specified job.
The problem is that an ill-defined job can fail immediately.
I want to restart it three times and then give up.

However, I don’t want to kill the entire DynamicSupervisor with it because other tasks are perfectly fine.

So I decided to change the hierarchy:

`- JobSupervisor (one_for_one, restart: :temporary) (don't restart it if it crashes; a crash means the worker failed several times in quick succession, so we want to give up)
   `- ActualWorker (restart: :transient) (restart only if it fails)
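
For illustration, a minimal sketch of this hierarchy using only the standard library could look like the following. The module name matches the diagram, but the worker being a Task wrapping a zero-arity function and the `max_restarts: 3, max_seconds: 5` limits are assumptions, not details from the post.

```elixir
defmodule JobSupervisor do
  # restart: :temporary goes into the generated child_spec/1, so the parent
  # DynamicSupervisor never restarts this supervisor once it gives up.
  use Supervisor, restart: :temporary

  def start_link(job_fun) when is_function(job_fun, 0) do
    Supervisor.start_link(__MODULE__, job_fun)
  end

  @impl true
  def init(job_fun) do
    children = [
      # Task children are :temporary by default; :transient restarts the
      # worker only after an abnormal exit.
      Supervisor.child_spec({Task, job_fun}, id: :worker, restart: :transient)
    ]

    # Assumed limits: more than 3 restarts within 5 seconds shuts this
    # supervisor down, which is the "give up" case.
    Supervisor.init(children, strategy: :one_for_one, max_restarts: 3, max_seconds: 5)
  end
end
```

A job would then be started with something like `DynamicSupervisor.start_child(dynamic_sup, {JobSupervisor, fn -> run_job() end})`, where `dynamic_sup` and `run_job/0` are placeholders.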

That solves one problem: the DynamicSupervisor no longer crashes, and restarts work fine. But when the job finishes normally, I am left with a dangling JobSupervisor that has nothing to supervise but hasn't crashed.

Is there an elegant solution for spawning a supervisor that finishes with its last finished child?

Maybe check out alternative supervisor implementations like @sasajuric's Parent, supervisor2, or director. All of them should give you finer-grained control over the supervisor process.

The upcoming 0.11 version of Parent could indeed be used for this.

An untested sketch would look something like this:

defmodule Job do
  use Parent.GenServer, restart: :temporary

  def start_link(arg), do: Parent.GenServer.start_link(__MODULE__, arg)

  @impl GenServer
  def init(arg) do
    # mfa_or_zero_arity_fun and initial_state are placeholders to fill in
    {:ok, _pid} =
      Parent.start_child(%{
        id: :job,
        restart: :transient,
        ephemeral?: true,
        start: mfa_or_zero_arity_fun
      })

    {:ok, initial_state}
  end

  # Invoked once the job child has stopped and won't be restarted
  @impl Parent.GenServer
  def handle_stopped_children(%{job: _}, state), do: {:stop, :normal, state}
end

See docs for more details, and let me know if you have some questions.

If Parent weren’t available, I’d develop the same thing manually. Basically, I’d turn JobSupervisor into a GenServer named Job, trap exits, start a Task process as a child, handle :EXIT messages, and count the restarts manually.
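
A rough sketch of that manual approach, using only the standard library, might look like this. The state shape, the zero-arity `job_fun` argument, and the total limit of three restarts are assumptions made for illustration. Unlike a supervisor, this version counts restarts over the whole lifetime of the job rather than within a time window.

```elixir
defmodule Job do
  use GenServer, restart: :temporary

  # Assumed limit, taken from the "restart it three times" requirement.
  @max_restarts 3

  def start_link(job_fun) when is_function(job_fun, 0) do
    GenServer.start_link(__MODULE__, job_fun)
  end

  @impl true
  def init(job_fun) do
    # Trap exits so that the linked task's exit arrives as an :EXIT message.
    Process.flag(:trap_exit, true)
    {:ok, pid} = Task.start_link(job_fun)
    {:ok, %{job_fun: job_fun, task: pid, restarts: 0}}
  end

  @impl true
  # The task finished normally: the job is done, so stop as well.
  def handle_info({:EXIT, pid, :normal}, %{task: pid} = state) do
    {:stop, :normal, state}
  end

  # The task crashed and there are restarts left: start it again.
  def handle_info({:EXIT, pid, _reason}, %{task: pid, restarts: n} = state)
      when n < @max_restarts do
    {:ok, new_pid} = Task.start_link(state.job_fun)
    {:noreply, %{state | task: new_pid, restarts: n + 1}}
  end

  # The task crashed again after the last allowed restart: give up.
  def handle_info({:EXIT, pid, _reason}, %{task: pid} = state) do
    {:stop, :shutdown, state}
  end

  # Ignore unrelated messages.
  def handle_info(_msg, state), do: {:noreply, state}
end
```

Because the child spec is `:temporary`, a DynamicSupervisor running this process does not restart it in either case, so nothing is left dangling when the job completes.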

I created Parent after getting fed up with doing this manually again and again :)

Thanks, I’ll check it out!

For now, I am passing the PID of JobSupervisor to the job and calling Supervisor.stop(supervisor_pid) at the end.
Parent seems more elegant though, so I might refactor to it later.
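
For reference, that workaround could be sketched roughly as follows. The module name, the wrapper function, and the restart limits are assumptions; the post only states that the job receives the supervisor PID and calls `Supervisor.stop/1` when it is done.

```elixir
defmodule SelfStoppingJobSupervisor do
  use Supervisor, restart: :temporary

  def start_link(job_fun) when is_function(job_fun, 0) do
    Supervisor.start_link(__MODULE__, job_fun)
  end

  @impl true
  def init(job_fun) do
    # init/1 runs inside the supervisor process, so self() is its PID.
    supervisor_pid = self()

    wrapped = fn ->
      job_fun.()
      # Ask the supervisor to stop with reason :normal. The supervisor shuts
      # down its children while stopping, including this task, so code
      # placed after this call should not be expected to run.
      Supervisor.stop(supervisor_pid)
    end

    children = [
      Supervisor.child_spec({Task, wrapped}, id: :worker, restart: :transient)
    ]

    # Assumed limits, as in the hierarchy described earlier in the thread.
    Supervisor.init(children, strategy: :one_for_one, max_restarts: 3, max_seconds: 5)
  end
end
```

If `job_fun` raises, the stop call is never reached and the usual restart logic applies.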

