Signal or hook at the end of starting a release

Hi everyone,

A while ago one of our production systems got stuck partway through startup: some processes had started and others had not. Our monitoring didn’t fully pick this up, so we’re now looking for a way to know that a release has fully started.

We’re using mix release to build the packages and deploy them on servers where they’re managed by systemd.

Initially I looked at the BEAM and found :init.get_status/0. I was thinking of calling it over RPC at intervals, but I’d rather have something nicer.
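For reference, the polling idea could be sketched roughly as below. BootStatus is a hypothetical module name, and the retry count and interval are arbitrary. The sketch relies only on the first element of the tuple returned by :init.get_status/0, which is :starting, :started or :stopping.

```elixir
defmodule BootStatus do
  # Hypothetical helper, not part of any library.

  # True once init reports the internal status :started.
  # The status can be passed in explicitly, which makes the check easy to test.
  def started?(status \\ :init.get_status()) do
    match?({:started, _provided}, status)
  end

  # Polls until the system has started or the retries run out.
  def await_started(retries \\ 50, interval_ms \\ 100)

  def await_started(0, _interval_ms), do: {:error, :timeout}

  def await_started(retries, interval_ms) when retries > 0 do
    if started?() do
      :ok
    else
      Process.sleep(interval_ms)
      await_started(retries - 1, interval_ms)
    end
  end
end
```

If the module is part of the release, something like bin/my_app rpc "IO.inspect(BootStatus.started?())" could be run from a systemd timer or a monitoring script (my_app being a placeholder release name). The rpc call itself only works once the node is far enough along to accept connections.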

Someone in the Elixir Slack channel shared erlang-systemd with me. This looks really promising, but we’re using an umbrella application and build multiple releases, each with a different set of sub-applications. That makes it hard to decide which application’s supervisor should send the ready signal. One solution would be to create a new application that does nothing but send this ready signal, and to make sure it is the last application started in each release. But that seems like a lot of overhead to me.
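To make the dedicated-application idea concrete, here is a minimal sketch. The ReadySignal name and the :notify config key are made up for illustration. The notify function is read from the application environment so the sketch runs without any dependency; with erlang-systemd it would presumably be something like fn -> :systemd.notify(:ready) end, but check that library's documentation for the exact call.

```elixir
defmodule ReadySignal.Application do
  # Hypothetical application whose only job is to announce readiness.
  use Application

  @impl true
  def start(_type, _args) do
    # Assumption: a zero-arity function is configured under :notify.
    # The default does nothing, so the app is harmless when unconfigured.
    notify = Application.get_env(:ready_signal, :notify, fn -> :ok end)

    # This runs when the application starts, i.e. after every application
    # it depends on has finished starting.
    notify.()

    # An application callback has to return a supervisor, even an empty one.
    Supervisor.start_link([], strategy: :one_for_one, name: ReadySignal.Supervisor)
  end
end
```

The signal is only meaningful if this application really starts last, so it needs to depend on the other applications in each release.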

Ideally, what I think I’m looking for is an option on the boot script or the BEAM to call some kind of (systemd?) hook, or a signal that is broadcast, once startup is completely done.

I’m interested in hearing more options or opinions on this!

This is pretty close to the K8s approach of liveness vs readiness. When our app boots we have the Phoenix Endpoint child very early in the tree, but then we have two distinct health checks, /alive and /ready, which map to this code:

  def alive(conn, _params) do
    json(conn, %{alive: true})
  end

  def ready(conn, _params) do
    if Application.get_env(:my_app, :ready) do
      json(conn, %{ready: true})
    else
      send_resp(conn, 503, "")
    end
  end

The default value of :ready is false in our config. Then at the bottom of our supervision tree we have a simple GenServer which calls Application.put_env(:my_app, :ready, true), and which can also send a notification somewhere if you want it to.

So once that child has started, GET /ready returns 200 and you know your app is fully booted. If you want that pushed somewhere, your GenServer can do so. Handily, if you trap exits this also gives you a nice flow on node shutdown: because children are stopped in reverse start order, that same GenServer is the first child to shut down, so you can make /ready return 503 and again notify out if you want.
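A minimal sketch of such a GenServer, assuming the :my_app / :ready key used in the controller above. The module name MyApp.ReadyFlag is made up for this example, and any outbound notification is left out.

```elixir
defmodule MyApp.ReadyFlag do
  # Hypothetical module; meant to be the last child in the supervision tree.
  use GenServer

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  @impl true
  def init(_opts) do
    # Trap exits so terminate/2 runs when the supervisor shuts this child down.
    Process.flag(:trap_exit, true)

    # Every earlier sibling has started by now, so mark the app as ready.
    Application.put_env(:my_app, :ready, true)
    {:ok, %{}}
  end

  @impl true
  def terminate(_reason, _state) do
    # Last child started means first child stopped: flip /ready back to 503.
    Application.put_env(:my_app, :ready, false)
    :ok
  end
end
```

It would go at the end of the children list, for example children = [MyAppWeb.Endpoint, ..., MyApp.ReadyFlag].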


I wouldn’t say that’s much overhead. Declaring all the other apps as optional dependencies of that extra app would make it start after whichever of those dependencies are included in a given release.