Should we catch stray messages in a GenServer?

Catching stray messages which may fill up your message inbox is considered a good practice in erlang. Do we need to catch them even in a GenServer?

defmodule Ticker do
  use GenServer

  def handle_info(:tick, state) do
    schedule_tick(state)# this makes sure that we catch up if a work is delayed
    work(state)
    {:noreply, state}
  end

  # to catch stray messages?
  require Logger
  def handle_info(oops, state) do
    Logger.warn("UNEXPECTED_EVENT_IN_TICKER: #{inspect oops}")
    {:noreply, state}
  end

end
3 Likes

I just looked at the code https://github.com/elixir-lang/elixir/blob/master/lib/elixir/lib/gen_server.ex#L565 and there is a default implementation which should catch all messages, However as soon as we define a handle_info the default function should be overridden. So, I guess this is needed to catch stray messages.

1 Like

Yes you should define it. And your hunch is good, for exactly the reason you mention.

Elixir’s intro docs on GenServer make it clear in this section:

3 Likes

@minhajuddin yes you’re right. If you override handle_info you will need to have a catch all implementation as well, otherwise as you stated the messages will not be removed from the inbox.

1 Like

I just looked through the Elixir documentation on GenServer.handle_info, and it is not mentioned there yet. I think this would be a great clarification. I’ll open a Pull Request :slight_smile: .

I made a PR and @josevalim was very quick to reply:

The GenServer always remove the message. If you don’t implement the callback, it means the GenServer will fail for unknown messages, which may actually be desired. So if you don’t want to define a catch-all clause, that’s ok.

So it seems that we arrived at an incorrect conclusion before.

2 Likes

Thats interesting, I didn’t realise that. Has this always been the case do you know? I based my answer from what I read in Elixir in Action and wonder if it’s just where the book as fallen out of date.

edit: actually scrap that. I just flicked back through the book to double check. It appears my memory is awful :icon_sad: it clearly states the gen server would crash if you don’t supply the match all override

1 Like

Thanks for verifying it :slight_smile:

I do think the conclusion was correct (for the most part). You probably don’t want your GenServers to crash at random.

The fact that it will crash sooner than later is other story. But the result is the same, crashing and unstable GenServer. You most likely want to do catch all.

1 Like

I would not consider a crash when an unexpected message is received a ‘random’ crash (in contrast to an ‘inbox is full’ crash): It is a fail-fast indication that you are passing messages around that cannot be understood by the receiver. As José astutely pointed out, this is useful in certain situations.

Then again, most messages we use with GenServers are of course cast/calls, which do not use handle_info at all.

1 Like

If you can’t explain, for your particular application, why you want to crash (and restart into a known state) when an unexpected message is received, then IMHO, you should have that catch all handle_info function that at least logs what it got. When I was first learning OTP, I got confused at messages I did and did not receive until it all clicked, and it’s only made worse if the GenServer is restarting in the middle of it all.

I think you should have the catch all handle_info until you can explain why you do not want it.

1 Like

Absolutely agree! I can’t recall a single time I didn’t write the catch-all handle_info. In contrast, I don’t think I ever wrote catch-all for handle_cast and handle_call.

5 Likes

I disagree with the consensus here. I don’t think there should be a catch all clause by default. I would like the handle_info/2 to crash in the default implementation just like handle_call/3 and handle_cast/2. Rather an explanation is required why you need the catch all - but if there is a catch all you must log. The reason we don’t crash by default in handle_info/2 is because we used to silently ignore it and didn’t log (there wasn’t a Logger).

Messages that come to handle_info/2 are likely from actions triggered by the process itself. For example monitor or port messages. In these situations you may want to disregard unhandled messages for old ports/monitors. However if the process is receiving monitor messages and the handle_info/2 clause catches all of :DOWN I would not write the clause because not expecting other messages. It would be overly defensive programming and there is a bug so want to crash.

2 Likes

The one thing I still wanted to point out is that since Eliixr 1.4, the default catch-all clause will be doing the logging. And I think @fishcakez you were the one who put that stuff in ;).

My point is, the behavior where you have a GenServer, you use it in the project. And then you decide to catch a message since you expect one. And this change in behavior causes you to discover a bugs that you did have already but messages were ignored. I just think it’s inconsistent.

Any ideas why the default implementation is not crashing GenServer?

1 Like

It would be too big a change to go from silently ignoring to crashing. Here is link to discussion: Redirecting to Google Groups

2 Likes

So it is now a year and a half later…

I wonder: Would this kind of change be the sort of default that Elixir 2.0 would do differently?

Given the handle_info callback was made optional in OTP 20 and logs when some unexpected message arrives, I think the path Elixir took was right.

2 Likes