How to handle GenServer invalid start state

I’m creating a GenServer that reads some initial state from the database in init/1. If the database entry is invalid or missing, the GenServer is completely invalid. Initially I handled this by returning {:error, "Invalid initial state"} from init/1, but that causes my application to not start up at all (which causes trouble running the seeds file, among other things).

My ideal behavior is that the GenServer would not start up successfully, but it also wouldn’t bring the entire system down (right now it is bringing the supervisor down as well). Is that reasonable behavior? Or do I need to allow the GenServer to come up, knowing that it is in an invalid state, and always return an error to all of its callers?

You could attempt to stop the GenServer normally with something like the following (from init/1 the tuple is {:stop, reason}, since there is no state yet):

{:stop, reason, new_state}

You might want to delegate worker restarts to another GenServer. This would keep the system from going down.

defmodule GameEngine.Games.Worker do
  @moduledoc false

  use GenServer, restart: :temporary
  ...
end

And in a linked process that traps exits:

  # Workers trap EXIT
  @impl GenServer
  def handle_info(
        {:EXIT, pid, reason},
        %{worker_sup: worker_sup} = state
      ) do
    log("#{@name} caught EXIT #{inspect(reason)}")

    case :ets.lookup(__MODULE__, pid) do
      [{pid, name, sender, receiver}] ->
        true = :ets.delete(__MODULE__, pid)

        # Do not restart worker if normal or timeout!
        if reason in [:normal, {:shutdown, :timeout}] do
          notify(%{type: :game_stopped, payload: name})
        else
          log("Restarting : #{name}")

          with {:ok, worker} <-
                 start_worker(worker_sup, %{uuid: name, sender: sender, receiver: receiver}) do
            # handle_info cannot reply to anyone, so just record the new worker
            true = :ets.insert(__MODULE__, {worker, name, sender, receiver})
          else
            {:error, reason} ->
              log("Could not restart : #{name} #{inspect(reason)}")
          end
        end

      [] ->
        true
    end

    {:noreply, state}
  end

  defp start_worker(sup, %{uuid: name} = args) do
    spec = %{
      id: name,
      start: {Worker, :start_link, [args]},
      restart: :temporary,
      type: :worker
    }

    case DynamicSupervisor.start_child(sup, spec) do
      {:ok, worker} ->
        Process.link(worker)
        notify(%{type: :game_created, payload: Worker.get_state(worker)})
        {:ok, worker}

      {:error, reason} ->
        {:error, reason}
    end
  end

In this case, you can adapt the response to the stop reason. This way it is a GenServer, not the supervisor, that is in charge of restarts. You can have an error, the server might die, but it won’t take your system down.

Or, in your case, you could return an error from init/1 and handle the resulting {:error, reason} in whatever process starts the worker.

Have a look at “It’s About the Guarantees”. Generally I find the second approach (return an error to callers) more useful because the process is still up, so it can periodically try to load its state and recover on its own once you fix the DB entry. But I don’t know if that makes sense for your use case.
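
To make the "stay up and retry" idea concrete, here is a minimal sketch. The module name RetryingServer, the :loader and :retry_ms options, and the :reload message are all made up for illustration; :loader is a zero-arity function standing in for your DB read and is assumed to return {:ok, data} or {:error, reason}.

```elixir
defmodule RetryingServer do
  use GenServer

  # `opts` is a keyword list; :loader is required, :retry_ms is optional.
  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  # Returns the result of the most recent load attempt.
  def get(server), do: GenServer.call(server, :get)

  @impl true
  def init(opts) do
    state = %{
      loader: Keyword.fetch!(opts, :loader),
      retry_ms: Keyword.get(opts, :retry_ms, 5_000),
      result: {:error, :not_loaded}
    }

    # Always start successfully; a failed load only schedules a retry.
    {:ok, load(state)}
  end

  @impl true
  def handle_call(:get, _from, state), do: {:reply, state.result, state}

  @impl true
  def handle_info(:reload, state), do: {:noreply, load(state)}

  # Runs the loader once and schedules another attempt if it failed.
  defp load(state) do
    case state.loader.() do
      {:ok, _data} = ok ->
        %{state | result: ok}

      {:error, _reason} = error ->
        Process.send_after(self(), :reload, state.retry_ms)
        %{state | result: error}
    end
  end
end
```

Callers get the load error back until a retry succeeds, and the supervisor never sees a failed start. Note that this sketch runs the loader inside init/1, so a slow DB read would still delay startup.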

If you don’t want the process to start at all you can make your init function return :ignore rather than an error tuple. It’s documented here. Then the supervisor won’t try to restart it.
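
A small sketch of the :ignore route, in case it helps. IgnoringWorker is a made-up module, and the argument passed to it stands in for the result of the DB read ({:ok, data} when the entry is usable, anything else when it is not).

```elixir
defmodule IgnoringWorker do
  use GenServer

  def start_link(initial), do: GenServer.start_link(__MODULE__, initial)

  @impl true
  def init({:ok, data}), do: {:ok, data}
  # Invalid or missing entry: decline to start without reporting a failure.
  def init(_invalid), do: :ignore
end

children = [
  Supervisor.child_spec({IgnoringWorker, {:error, :missing}}, id: :bad),
  Supervisor.child_spec({IgnoringWorker, {:ok, %{score: 0}}}, id: :good)
]

# The supervisor starts fine even though the :bad child declined to start.
{:ok, sup} = Supervisor.start_link(children, strategy: :one_for_one)
counts = Supervisor.count_children(sup)
```

The supervisor keeps the ignored child's spec around, so you could later ask it to try again with Supervisor.restart_child(sup, :bad) once the DB entry is fixed.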

Yeah, I guess it makes sense that the GenServer should remain up and periodically try to fetch its state from the DB. I’m just not sure how to cleanly represent whether the GenServer’s state is currently valid. I guess I could add a :valid? key to the state. It would also have to be checked on every operation, which is a little annoying.

Instead of checking it on every operation, you could put handle_call, handle_cast and handle_info clauses that pattern match on %{valid: false} = state above all the others (you can match the same way in terminate as well). This gives you one clean place to say what should happen when a call, cast or message arrives while the GenServer is in an invalid state, which you would probably want to write anyway. You just need to make sure that :valid is always set in init and kept up to date throughout the lifecycle of the GenServer.
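
Roughly what that could look like, as a sketch. GuardedServer, the :score and {:add, n} requests, and the {:loaded, score} message are invented for the example; only the clause ordering is the point.

```elixir
defmodule GuardedServer do
  use GenServer

  def start_link(state), do: GenServer.start_link(__MODULE__, state)

  @impl true
  def init(state), do: {:ok, state}

  # The invalid-state clause comes first, so it wins over every
  # specific clause below it.
  @impl true
  def handle_call(_request, _from, %{valid: false} = state) do
    {:reply, {:error, :invalid_state}, state}
  end

  def handle_call(:score, _from, state), do: {:reply, {:ok, state.score}, state}

  # Casts are silently dropped while the state is invalid.
  @impl true
  def handle_cast(_request, %{valid: false} = state), do: {:noreply, state}
  def handle_cast({:add, n}, state), do: {:noreply, %{state | score: state.score + n}}

  # The hypothetical {:loaded, score} message is the one thing let through
  # while invalid, because it is what makes the state valid again.
  @impl true
  def handle_info({:loaded, score}, _state), do: {:noreply, %{valid: true, score: score}}
  def handle_info(_msg, state), do: {:noreply, state}
end
```

The normal clauses never have to think about validity, because they can only be reached once the guard clauses above them have failed to match.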

The state doesn’t have to be a map. It can be just the atom :invalid, for instance, or a tuple. That makes pattern matching very easy.
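
For example, something along these lines. AtomStateServer and the {:valid, data} tuple are just names picked for the sketch, and the argument to start_link stands in for the DB read result.

```elixir
defmodule AtomStateServer do
  use GenServer

  def start_link(initial), do: GenServer.start_link(__MODULE__, initial)

  # The whole state is either the atom :invalid or a {:valid, data} tuple.
  @impl true
  def init({:ok, data}), do: {:ok, {:valid, data}}
  def init(_missing_or_bad), do: {:ok, :invalid}

  @impl true
  def handle_call(_request, _from, :invalid), do: {:reply, {:error, :invalid_state}, :invalid}
  def handle_call(:get, _from, {:valid, data} = state), do: {:reply, {:ok, data}, state}
end
```

There is no flag to keep in sync here: the shape of the state itself says whether it is usable.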
