I have a simple supervision tree with one supervisor and three workers. One
worker manages the GUI (a wrapper around the cecho Erlang library), one
worker is a controller that gets key presses, runs code, and tells the GUI
to update, and the remaining worker does the work.
How do I stop this thing? I want the controller to be able to say “OK, we’re
done” and have each of the workers and the supervisor shut down normally and
call some cleanup code so that my application can exit.
All of the excellent books that I have talk about supervisors keeping
workers running and about workers dying and being restarted, but I can’t
seem to figure out how to have everybody gracefully quit and clean up.
I have tried pretty much every combination of the following that I can think of:
Calling MySupervisor.stop
exit(:normal) and exit(:terminate), from the parent of the supervisor or from the supervisor
Casting a :quit message whose handler returns {:stop, :normal} or {:stop, :terminate} in each of the children (with :normal the child gets restarted; with :terminate it complains but doesn’t stop the parent supervisor)
It works! Thanks, aeden! It took a few more changes to get it working, but
that did it. The remainder of this message describes the “few more changes”:
I had previously tried calling Supervisor.stop with different args to no
avail. When I gave the supervisor a name and called Supervisor.stop(name), I then saw “Application jex exited: normal” but
the top-level app that started the supervisor didn’t quit. I realized that
was because I started the app using --no-halt. I used that because
otherwise my app would quit right after creating the supervisor.
So I tried this:
After starting the supervisor, the top-level app waits for a :quit
message in a receive loop
The quitting code sends that message to the top-level app after calling Supervisor.stop(name) as you suggested
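For anyone following along, here is a minimal sketch of the two steps above. It is not the original code: the module QuitDemo, the registered name :quit_demo_main, and the empty child list are all made up for illustration.

```elixir
defmodule QuitDemo do
  # Hypothetical module standing in for the top-level app described above.

  # Start a named supervisor (no children here, to keep the sketch small),
  # register the calling process, and block until a :quit message arrives.
  def run(sup_name \\ QuitDemo.Supervisor) do
    {:ok, _pid} = Supervisor.start_link([], strategy: :one_for_one, name: sup_name)
    Process.register(self(), :quit_demo_main)
    wait_for_quit()
  end

  # The quitting code: stop the supervisor first, then wake the top-level process.
  def quit(sup_name \\ QuitDemo.Supervisor) do
    :ok = Supervisor.stop(sup_name)
    send(:quit_demo_main, :quit)
    :ok
  end

  # Ignore anything that is not :quit and keep waiting.
  defp wait_for_quit do
    receive do
      :quit -> :ok
      _other -> wait_for_quit()
    end
  end
end
```

Because Supervisor.stop/1 stops the supervisor with reason :normal, the process that started it via start_link is not taken down by the link and can go on to receive :quit.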
+1 for :init.stop! This is a standard approach to politely take down all applications and the entire system, and it doesn’t require any improvisation such as a custom :quit message. You can then also safely use --no-halt or OTP releases and still be able to stop the entire system.
A follow-up: :init.stop works much better. I still explicitly send
messages to my workers so they can clean up properly, but then a simple
call to :init.stop works as advertised.
It is my understanding that manually shutting down workers should be unnecessary. If you do the setup required to have the terminate callback run (Process.flag(:trap_exit, true)), then all you need to do is call :init.stop.
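A rough sketch of that setup, assuming a GenServer worker. CleanupWorker and its :report_to option are hypothetical names used only to make the effect visible; the real cleanup would replace the send/2 call.

```elixir
defmodule CleanupWorker do
  use GenServer

  # Hypothetical worker; the module name and the :report_to option are
  # illustrative and not taken from the original application.
  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts)
  end

  @impl true
  def init(opts) do
    # Without this flag, the :shutdown exit signal sent by the supervisor
    # kills the process right away and terminate/2 is never called.
    Process.flag(:trap_exit, true)
    {:ok, %{report_to: Keyword.get(opts, :report_to)}}
  end

  @impl true
  def terminate(reason, state) do
    # Real cleanup (restoring the terminal, closing ports or files) goes here.
    if state.report_to, do: send(state.report_to, {:cleaned_up, reason})
    :ok
  end
end
```

When the supervisor that owns this worker shuts down, the worker's terminate/2 is called with reason :shutdown. If that supervisor is part of an application's supervision tree, :init.stop should trigger the same path, since it takes down all running applications before halting.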
Yes, :init.stop did the trick. The controller (listening for keyboard input) responds to the quit key by calling a function in my supervisor that lets everybody clean up and then calls :init.stop:
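# Called by the controller when the quit key is pressed.
def quit do
  # Let each worker clean up before the VM goes down.
  GUI.cleanup()
  Metronome.stop()
  MIDI.cleanup()
  # Ask the runtime to stop all applications and halt.
  :init.stop()
end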