Race condition when monitoring a process and storing pid in GenServer state?

thehunmonkgroup · March 20, 2018, 3:52am

I’ve got a GenServer that is monitoring several other processes. The :DOWN messages will arrive in a handle_info() callback, so it makes sense to me to store the pids of the monitored processes in the GenServer’s state, then match against them in the handle_info() callback:

def handle_info({:DOWN, _ref, :process, pid, :normal}, state) do
  pid1 = state.pid1
  pid2 = state.pid2
  case pid do
    ^pid1 ->
      # Do something
      :ok
    ^pid2 ->
      # Do something else
      :ok
  end
  {:noreply, state}
end

The monitoring will be activated in another handle_info() call:

def handle_info(:do_something, state) do
  with {:ok, pid} <- something(state) do
    Process.monitor(pid)
    {:noreply, %{state | pid1: pid}}
  else
    # Keep the server alive if something/1 does not return {:ok, pid}
    _error -> {:noreply, state}
  end
end

My question: is there a danger of a race condition where the pid is not yet available in the state when the :DOWN message is received? I suspect not, because of the message queue ordering, but I’m not 100% confident, since I don’t understand how the guts of GenServer work in this case.

kokolegorille · March 20, 2018, 4:36am

I prefer to store the ref when monitoring, and the pid when linking…

ref = Process.monitor(pid)

Then I catch DOWN like this

{:DOWN, ref, _, _, _}

and EXIT like this

{:EXIT, pid, _reason}

I use monitor for the consumer, and link for the worker. The worker is created on the consumer’s request, and the GenServer links the worker and monitors the consumer.

I store ref => pid as key/value inside an ets table, but state is fine too…

I detect when the consumer goes down, and I detect when the worker exits, so I can handle each case separately. I also update the ets table state there. The GenServer acts as a middleman between consumer and worker.
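The ref-keyed bookkeeping described above can be sketched roughly as follows. This is only a minimal illustration that keeps the ref => entry map in the GenServer state rather than in an ets table; the module name MonitorTracker, the functions watch/3 and watched/1, and the labels are all hypothetical and not taken from the post.

```elixir
defmodule MonitorTracker do
  # Hypothetical sketch: track monitored processes by monitor ref.
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, %{}, opts)

  # Ask the server to monitor `pid` under a caller-chosen label.
  def watch(server, label, pid), do: GenServer.call(server, {:watch, label, pid})

  # List the {label, pid} entries that are still being tracked.
  def watched(server), do: GenServer.call(server, :watched)

  @impl true
  def init(monitors), do: {:ok, monitors}

  @impl true
  def handle_call({:watch, label, pid}, _from, monitors) do
    ref = Process.monitor(pid)
    # The ref is the key, so each :DOWN message maps to exactly one entry.
    {:reply, ref, Map.put(monitors, ref, {label, pid})}
  end

  def handle_call(:watched, _from, monitors) do
    {:reply, Map.values(monitors), monitors}
  end

  @impl true
  def handle_info({:DOWN, ref, :process, _pid, _reason}, monitors) do
    # Map.pop/2 returns {nil, monitors} for a ref we never stored.
    {_entry, monitors} = Map.pop(monitors, ref)
    {:noreply, monitors}
  end
end
```

Matching on the ref rather than the pid means the handler needs no case over stored pids, and any exit reason is handled by the same clause.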

I don’t think there is any danger of a race condition on the BEAM, because each process handles the messages in its mailbox sequentially.

Qqwy · March 20, 2018, 5:11am

There is no race condition here. According to the Erlang documentation (scroll down to 12.8), a monitor will immediately send a :DOWN message when the process you try to monitor has already exited before you attempt to monitor it.
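A small sketch of that behaviour, assuming a throwaway module named DeadMonitorDemo (hypothetical): it waits until a short-lived process is definitely gone, monitors it again, and returns the reason carried by the resulting :DOWN message.

```elixir
defmodule DeadMonitorDemo do
  # Hypothetical demo module: monitor a process that has already exited.
  def run do
    pid = spawn(fn -> :ok end)

    # Make sure the process is really gone before the second monitor.
    first = Process.monitor(pid)

    receive do
      {:DOWN, ^first, :process, ^pid, _reason} -> :ok
    end

    # Monitoring a dead process still produces a :DOWN message right away.
    ref = Process.monitor(pid)

    receive do
      {:DOWN, ^ref, :process, ^pid, reason} -> reason
    after
      1_000 -> :timeout
    end
  end
end

IO.inspect(DeadMonitorDemo.run())
# prints :noproc
```

Note that the reason in this case is :noproc rather than :normal, so a handle_info clause that only matches :normal would not match it.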

rvirding · March 20, 2018, 12:17pm

And because of the way that processes in Erlang/Elixir work* you are guaranteed that the handle_info/2 callback which starts the process and monitors it will complete before handle_info/2 is called again to process the :DOWN message.

Keying the monitor on the returned ref as suggested by @kokolegorille is quite smart, as you can call Process.monitor multiple times on the same process and you will get a :DOWN message for each call. The only thing that differentiates the messages is the ref element, which is different for each monitor.
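To illustrate the point about repeated monitors, here is a minimal sketch. The module name MultiMonitorDemo is hypothetical; it monitors one process twice and collects both :DOWN messages.

```elixir
defmodule MultiMonitorDemo do
  # Hypothetical demo module: two monitors on the same pid.
  def run do
    pid = spawn(fn -> receive do :stop -> :ok end end)

    # Each call creates an independent monitor with its own ref.
    ref_a = Process.monitor(pid)
    ref_b = Process.monitor(pid)

    send(pid, :stop)

    # Expect one :DOWN message per monitor, same pid, different refs.
    downs =
      for _ <- 1..2 do
        receive do
          {:DOWN, ref, :process, ^pid, reason} -> {ref, reason}
        after
          1_000 -> :timeout
        end
      end

    {ref_a, ref_b, downs}
  end
end
```

Both messages carry the same pid and reason, so only the ref tells them apart.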

[*] Processes are simple and only have a single thread of execution, with no internal parallelism, interrupts, or things like that. Really KISS all the way down.
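The ordering guarantee can be seen in a small experiment. This is a sketch under assumptions: the module name OrderingDemo, the report_to argument, and the {:known_ref?, boolean} report message are all made up for illustration. The monitored process exits almost at once, and the callback deliberately sleeps before updating the state, yet the :DOWN handler still finds the ref in the state.

```elixir
defmodule OrderingDemo do
  # Hypothetical demo module: the callback finishes before :DOWN is handled.
  use GenServer

  def start_link(report_to), do: GenServer.start_link(__MODULE__, report_to)

  @impl true
  def init(report_to), do: {:ok, %{report_to: report_to, refs: MapSet.new()}}

  @impl true
  def handle_info(:do_something, state) do
    pid = spawn(fn -> :ok end)   # exits almost immediately
    ref = Process.monitor(pid)
    Process.sleep(50)            # the :DOWN is very likely in the mailbox by now
    # The state is only updated here, after the monitored process is gone.
    {:noreply, %{state | refs: MapSet.put(state.refs, ref)}}
  end

  def handle_info({:DOWN, ref, :process, _pid, _reason}, state) do
    # Runs only after the clause above has returned its new state.
    send(state.report_to, {:known_ref?, MapSet.member?(state.refs, ref)})
    {:noreply, %{state | refs: MapSet.delete(state.refs, ref)}}
  end
end
```

Sending :do_something to a server started with OrderingDemo.start_link(self()) should result in a {:known_ref?, true} message, because the :DOWN message waits in the mailbox until the first callback has returned.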

thehunmonkgroup · March 20, 2018, 4:15pm

Thanks everyone, really appreciate the suggestions and clarifications, and I switched to using ref as @kokolegorille suggested.