Find out how many unhandled messages a process received

Background

I found out about Process.info, which gives you information about a running process, such as its number of reductions. However, I need to know the number of unhandled messages a process has received during its lifetime.

Question

Is there a way to find this out, without me having to hard code it manually into each process?

You can find the message queue length via the function you mentioned. All of them are currently unhandled.

I don’t understand what you are trying to communicate.

According to my knowledge, when a process receives a message and doesn’t know what to do with it (because there is a missing handle_info clause, for example), the process simply consumes the message, spits out an error and moves on.

The key here is the phrase “consumes the message”, meaning it removes it from the message queue and moves on.

Is my understanding of message handling incorrect?

When the default implementation of handle_info spits out that log message, it has received and therefore handled the message, from the BEAM’s viewpoint at least.

So if the message is handled (consumed without error), shouldn’t the message_queue_len always be 0?

How does checking message_queue_len help me find out which processes are dealing with unhandled messages?

As far as I know there’s no built-in metric for this. You’ll need to write your own handle_info clause and instrument this yourself.
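As a rough illustration of that kind of instrumentation, here is a minimal sketch of a GenServer that counts messages falling through to a catch-all handle_info clause. The module name CountingServer, the :tick message and the unhandled_count/1 function are hypothetical names made up for this example.

```elixir
defmodule CountingServer do
  use GenServer

  # Hypothetical example module: counts messages no other clause recognised.
  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  # Returns how many messages have reached the catch-all clause so far.
  def unhandled_count(server), do: GenServer.call(server, :unhandled_count)

  @impl true
  def init(:ok), do: {:ok, %{unhandled: 0}}

  @impl true
  def handle_call(:unhandled_count, _from, state) do
    {:reply, state.unhandled, state}
  end

  # A message this server knows about (assumed for the example).
  @impl true
  def handle_info(:tick, state), do: {:noreply, state}

  # Catch-all: anything else counts as "unhandled" from our point of view.
  def handle_info(_other, state) do
    {:noreply, %{state | unhandled: state.unhandled + 1}}
  end
end
```

After sending such a server :tick, :surprise and {:another, 1}, calling CountingServer.unhandled_count(pid) should report the two unrecognised messages. The same catch-all clause could be shared between modules through a `use` macro, but that is left out here.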

No one deals with unhandled messages, that’s what makes them unhandled.

You can ask the BEAM for a list of pids and then check their message queue lengths. If a length is greater than 0, there are messages which have not yet been read by that process. This can have different reasons: either messages come in faster than the process is able to handle them, or there is no receive clause (so far) that matches them.
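A small sketch of such a scan, using only Process.list/0 and Process.info/2. The module name QueueScan and the threshold argument are hypothetical, chosen for this example.

```elixir
defmodule QueueScan do
  # Hypothetical helper: lists {pid, queue_length} for every live process
  # whose mailbox currently holds more than `threshold` messages.
  def backlog(threshold \\ 0) do
    found =
      for pid <- Process.list(),
          # Process.info/2 returns nil for a process that has already died;
          # the pattern in the generator silently skips those.
          {:message_queue_len, len} <- [Process.info(pid, :message_queue_len)],
          len > threshold do
        {pid, len}
      end

    # Longest queues first.
    Enum.sort_by(found, fn {_pid, len} -> len end, :desc)
  end
end
```

Keep in mind that this is only a snapshot: it shows messages waiting right now, not how many messages a process has received in total.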

Also if it’s only about that message, then you need to find a way to count them yourself. Should be trivial for any process you manage yourself.

I think you may be confusing what happens in a process and in a behaviour.

A process consumes messages using receive. Unconsumed messages are left in the process’s message queue until they are consumed with a receive; they never go away by themselves. message_queue_len returns the number of messages currently in the message queue. The system does not keep count of the total number of messages that have arrived at a process.
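To see this with a bare process, here is a minimal sketch in which the spawned process only has a receive clause for :stop, so anything else stays in its queue. MailboxDemo and the message names are hypothetical.

```elixir
defmodule MailboxDemo do
  # Hypothetical demo: returns the queue length of a process that has been
  # sent two messages its only receive clause does not match.
  def run do
    pid =
      spawn(fn ->
        receive do
          :stop -> :ok
        end
      end)

    send(pid, :ignored_1)
    send(pid, :ignored_2)

    # Neither message matches :stop, so both remain in the mailbox.
    {:message_queue_len, len} = Process.info(pid, :message_queue_len)

    # Let the process finish so it does not linger.
    send(pid, :stop)
    len
  end
end
```

Calling MailboxDemo.run() should show both unmatched messages still counted in message_queue_len.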

Note that there is no default handle_info or equivalent for a process.

The generic code in behaviours generally has a top-level receive loop which processes every message it finds in the message queue. From the format of the message it knows which callback function to call to process it. The handle_info callback is the one which is called when an unrecognised message arrives, one that hasn’t been sent with a call or a cast and is not an OTP system message. This is handled by the behaviour, so the message queue will never be allowed to build up.

The behaviour does not keep count of the total number of messages which have arrived at the process, but it can keep count of the number of messages which it has processed in the top-level loop. You can query this using :sys.statistics/2, once statistics collection has been enabled for that process. Note that this is implemented by the generic behaviour code and is not something that is implemented by the system for all processes.
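A minimal sketch of querying those statistics. QuietServer and StatsDemo are hypothetical modules written for this example; the :sys.statistics/2 calls themselves are the standard OTP ones, and collection has to be switched on with `true` before `:get` returns any numbers.

```elixir
defmodule QuietServer do
  use GenServer

  @impl true
  def init(:ok), do: {:ok, nil}

  # Accept anything silently so the example does not log errors.
  @impl true
  def handle_info(_msg, state), do: {:noreply, state}
end

defmodule StatsDemo do
  # Hypothetical demo: returns the behaviour's messages_in counter.
  def run do
    {:ok, pid} = GenServer.start_link(QuietServer, :ok)

    # Turn on statistics collection; without this, :get yields
    # {:ok, :no_statistics}.
    :ok = :sys.statistics(pid, true)

    send(pid, :one)
    send(pid, :two)

    # stats is a list of {key, value} tuples, including :messages_in,
    # :messages_out and :reductions.
    {:ok, stats} = :sys.statistics(pid, :get)
    Keyword.fetch!(stats, :messages_in)
  end
end
```

The counter only covers messages seen after statistics were enabled, and it counts every message the loop processed, not just the ones that ended up in a catch-all clause.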

A final note. The basic concurrency model in Erlang and Elixir is very very simple and is based on four main ideas: isolated processes, asynchronous sending of messages, selective receive, and timeouts in the receive. Everything else, for example how behaviours behave, is explicitly built on top of this.

Sorry this became a bit longer than planned. I hope I correctly understood what you were asking about.
