GenServer asynchronous calls with error management

2Sjch8AT · January 16, 2018, 9:24am

I have a GenServer that uses an array of other GenServers to fulfil its jobs, each of them connected to a different channel for sending data. The first GenServer acts as a dispatcher. When a call is invoked on the dispatcher, one of the GenServers in the array is chosen and a call is performed on it.

I would like the GenServer (dispatcher) to be asynchronous: I don’t want to block while the data is sent to one of the GenServers in the array. But I would also like to check for errors, so using GenServer.cast is not an option.

What options do I have? Does it make sense to use spawn and GenServer.call for each incoming request coming to the dispatcher?

amikhailov · January 16, 2018, 9:51am

Does it make sense to use spawn and GenServer.call for each incoming request coming to the dispatcher?

Erlang rpc implements asynchronous calls exactly like this: see otp/lib/kernel/src/rpc.erl at 613b5f890a5bc13aaf64cc31b535262a40eba721 in erlang/otp on GitHub.

So yes, I think it’s the way to go.
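
For illustration, here is a minimal sketch of the "spawn and GenServer.call" idea. The module names EchoServer and SpawnToCall, the {:send, data} request shape, and the {ref, result} message format are all made up for this example and are not taken from rpc.erl. The throwaway process converts an exit from the call (timeout, dead server) into an error tuple, so the caller still sees failures.

```elixir
defmodule EchoServer do
  # Hypothetical stand-in for one of the worker GenServers.
  use GenServer

  def init(state), do: {:ok, state}

  def handle_call({:send, data}, _from, state) do
    {:reply, {:ok, data}, state}
  end
end

defmodule SpawnToCall do
  # Runs GenServer.call/2 in a throwaway process so the caller is not blocked.
  # Returns a reference that tags the eventual result message.
  def async_call(server, request) do
    caller = self()
    ref = make_ref()

    spawn(fn ->
      # An exit raised by the call is turned into an error tuple.
      result =
        try do
          GenServer.call(server, request)
        catch
          :exit, reason -> {:error, reason}
        end

      send(caller, {ref, result})
    end)

    ref
  end
end

{:ok, pid} = GenServer.start_link(EchoServer, nil)
ref = SpawnToCall.async_call(pid, {:send, "hello"})

# The caller is free to do other work here, then collects the result.
result =
  receive do
    {^ref, reply} -> reply
  after
    1_000 -> {:error, :timeout}
  end
```

Note that this sketch does not monitor the spawned process, so a crash inside it would only show up as the timeout in the final receive.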

peerreynders · January 16, 2018, 5:26pm

That doesn’t make the dispatcher asynchronous - and this tactic is largely a defensive move by a client process forced to make a synchronous call when it doesn’t want to be blocked.

Even though it doesn’t want to be blocked, the client process still has to:

- eventually process the result from the terminating proxy-process
- monitor that proxy-process in case it terminates before sending the result

Given that you are implementing the dispatcher you might as well cut out the “proxy-process”:

1. Have the client process cast/2 the request to the dispatcher (include a reference from Kernel.make_ref/0 in the request that the dispatcher can return in the response information to make it easier to correlate a response to a request).
2. Have the client process Process.monitor/2 the dispatcher.
3. If the client receives a :DOWN message (via handle_info/2) with the dispatcher-monitor-reference and dispatcher PID, inspect the reason:
   - :noproc - there is no process with pid
   - :noconnection - can’t reach specified node
   - :normal - no worries, sent result is still in the queue (provided a result was sent)
   - other non-normal exit reasons
4. When the dispatcher has completed the request have it cast/2 the response information (including the correlation reference) back to the client process.
5. When the client receives the response information from the dispatcher, immediately Process.demonitor/2 the dispatch-monitor-reference (specifying [:flush] for opts to purge the :DOWN message that could still be in the message queue) and then process the response information.
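
The steps above could be sketched roughly as follows. Everything specific here is an assumption made for illustration: the module names Dispatcher and Client, the {:request, client, ref, data} and {:response, ref, result} message shapes, the upcasing "work", and the :submit/:results calls that exist only so the example can be driven from a script.

```elixir
defmodule Dispatcher do
  # Hypothetical dispatcher: it "processes" a request by upcasing a string.
  use GenServer

  def init(state), do: {:ok, state}

  def handle_cast({:request, client, ref, data}, state) do
    # The correlation reference travels back with the response.
    GenServer.cast(client, {:response, ref, {:ok, String.upcase(data)}})
    {:noreply, state}
  end
end

defmodule Client do
  use GenServer

  # State: dispatcher pid, pending requests (ref => monitor ref), collected results.
  def init(dispatcher), do: {:ok, %{dispatcher: dispatcher, pending: %{}, results: []}}

  def handle_call({:submit, data}, _from, state) do
    ref = make_ref()
    mon = Process.monitor(state.dispatcher)
    GenServer.cast(state.dispatcher, {:request, self(), ref, data})
    {:reply, ref, put_in(state.pending[ref], mon)}
  end

  def handle_call(:results, _from, state), do: {:reply, state.results, state}

  def handle_cast({:response, ref, result}, state) do
    {mon, pending} = Map.pop(state.pending, ref)
    # Drop the monitor and purge any :DOWN message already in the queue.
    if mon, do: Process.demonitor(mon, [:flush])
    {:noreply, %{state | pending: pending, results: [{ref, result} | state.results]}}
  end

  def handle_info({:DOWN, mon, :process, _pid, reason}, state) do
    # The dispatcher went down before answering the request tied to this monitor.
    failed = for {ref, ^mon} <- state.pending, do: {ref, {:error, reason}}
    pending = Map.drop(state.pending, Enum.map(failed, &elem(&1, 0)))
    {:noreply, %{state | pending: pending, results: failed ++ state.results}}
  end
end

{:ok, dispatcher} = GenServer.start_link(Dispatcher, nil)
{:ok, client} = GenServer.start_link(Client, dispatcher)
ref = GenServer.call(client, {:submit, "hello"})

# Give the two casts time to round-trip before inspecting the client.
Process.sleep(100)
results = GenServer.call(client, :results)
```

This sketch takes one monitor per request to keep the bookkeeping simple; a single long-lived monitor on the dispatcher would also work.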

See also: The need for monitoring

2Sjch8AT · January 22, 2018, 9:23am

That doesn’t make the dispatcher asynchronous

Could you explain to me why?

Thanks for your detailed answer.

peerreynders · January 23, 2018, 3:19pm

The intent of call/3 is to implement a synchronous protocol on top of a native asynchronous one.

call boils down to the client sending a request and then blocking until a reply for that exact request arrives - ignoring any other messages that may arrive in the meantime.
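
As a rough illustration of that blocking behaviour, here is a simplified sketch written against a plain process rather than a real GenServer. The SimpleCall module and its {:call, {pid, ref}, request} message shape are invented for this example and are not the actual GenServer internals. The receive only matches messages tagged with this request's reference, so anything else stays in the mailbox.

```elixir
defmodule SimpleCall do
  # Hypothetical server loop: replies to each request with {:ok, request}.
  def server_loop do
    receive do
      {:call, {from, ref}, request} ->
        send(from, {ref, {:ok, request}})
        server_loop()
    end
  end

  # Sends the request, then blocks until the reply for this exact
  # request arrives, the server dies, or the timeout expires.
  def call(server, request, timeout \\ 5_000) do
    ref = Process.monitor(server)
    send(server, {:call, {self(), ref}, request})

    receive do
      {^ref, reply} ->
        Process.demonitor(ref, [:flush])
        reply

      {:DOWN, ^ref, :process, _pid, reason} ->
        exit(reason)
    after
      timeout ->
        Process.demonitor(ref, [:flush])
        exit(:timeout)
    end
  end
end

server = spawn(&SimpleCall.server_loop/0)

# An unrelated message that is already waiting in the caller's mailbox.
send(self(), :unrelated)
reply = SimpleCall.call(server, :ping)

# The unrelated message was skipped by the call and is still queued.
leftover =
  receive do
    msg -> msg
  after
    0 -> :nothing
  end
```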

Now typically the server processes the request immediately in handle_call/3 and responds with a :reply.

So the intent behind call/handle_call is to implement a synchronous protocol.

Now the server has some wiggle room, as it can finish handle_call with :noreply to delay emitting a reply until later with reply/2, but that doesn’t change the synchronous client experience (the server might do this to stop the same client from sending additional requests).
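
A minimal sketch of that delayed reply, assuming a hypothetical DelayedReply module, a made-up {:send, data} request, and an arbitrary 50 ms delay standing in for the real work:

```elixir
defmodule DelayedReply do
  use GenServer

  def init(_), do: {:ok, nil}

  def handle_call({:send, data}, from, state) do
    # Defer the work; `from` identifies the caller that is still waiting.
    Process.send_after(self(), {:finish, from, data}, 50)
    {:noreply, state}
  end

  def handle_info({:finish, from, data}, state) do
    # The reply is emitted later, outside handle_call/3.
    GenServer.reply(from, {:ok, data})
    {:noreply, state}
  end
end

{:ok, pid} = GenServer.start_link(DelayedReply, nil)

# The server is free to handle other messages in the meantime,
# but this caller still blocks until the reply arrives.
reply = GenServer.call(pid, {:send, "payload"})
```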

The “spawn to call” tactic is used as a countermeasure by a client process which doesn’t wish to have the “synchronous experience” that call enforces.

Using “spawn to call” within the “service” (rather than on the client-side) may give the client process the illusion of asynchronous communication (while still leaving the burden of setting up the monitor and dealing with monitor messages) - but call still implements a synchronous protocol; for what purpose?

(My guess is that the dispatcher core has been designed and written in terms of handle_call, which means that the dispatcher implementation is in fact synchronous. Such an implementation is not necessarily, but quite possibly, blocked while processing a single request rather than servicing other requests asynchronously.)

If the dispatcher is asynchronous and the clients need to deal with it asynchronously then simply have the dispatcher serve/implement cast/handle_cast instead (while the dispatcher casts its reply back to the client) to fully reveal the dispatcher’s asynchronous nature.