Have Phoenix display a maintenance page if database is down?

I’m running an application with a fairly simple Phoenix web UI, that enables registering requests which are then handled asynchronously by synchronizing status updates with another system.

However, when the PostgreSQL database is taken down temporarily for maintenance, the Ecto repo will of course be disconnected, causing multiple errors on incoming requests (as well as for the background status updates running in a GenServer). Eventually the application crashes entirely, after the supervisor restart limits are reached.

I’d like to avoid crashing, preferrably “pause” the background updates, and display some kind of “ongoing maintenance, please try later” page to the end users whenever the database is inaccessible.

Has anyone worked out a good approach for something like this? :slight_smile:

3 Likes

Such think is called circuit breaker and there are few implementations in BEAM world:

2 Likes

Right, and for eg. REST API usage, where I’ve created my own client code, I’d have a pretty clear idea about where to put a circuit breaker… But I’m not quite sure what to do about Phoenix :joy:

Perhaps the Endpoint would be a decent place, though.

1 Like

The beginning of the pipeline would be a good place for that.

1 Like

A super trivial example with ecto and pg would be:

defmodule DatabaseStatus do
  def check_status do
    case Ecto.Adapters.SQL.query(Repo, "SELECT 1") do
      {:ok, _} ->
        :ok

      _ ->
        :error
    end
  end
end

You can then use this in a plug that you add high in your pipeline:

defmodule DbPlug do
  def init(options), do: options

  def call(conn, _) do
    case Database.check_status() do
      :ok ->
        # render normal stuff
        conn

      :error ->
        # halt the conn and render the maintenance page or return a json payload or whatever.
    end
  end
end

This isn’t super efficient since you’re going to be making a lot of extra db calls. A more efficient way to do this would be to run a second process that functions as a watch dog. It can periodically call Database.check_status/0 and use aggregates to trigger the status back and forth so it doesn’t flap. For instance you might decide that 5 failing calls in 30 seconds means that the db is down. That status could be written into an ets table which could be read in parallel. This pattern would be much more efficient than pinging the db in each request. Having a secondary process also allows you to overwrite the status via an api call. For instance if you know you’re going to go into maintenance mode you could trigger the api as part of that process. But whatever optimizations you want to do the general idea still stands and should help get you started.

6 Likes

I have this exact issue at work when the oracle database goes down for its maintenance periods. I setup a failure handler in the fallback_controller to handle the general case that the oracle server is down, and for pages that don’t touch the oracle database they still work fine, and the ones that do but can handle it being down by doing something else just handle it their own way. I don’t touch the oracle database pre-controller in plugs at all so I don’t have worry about it there.

1 Like