GenServer asynchronous calls with error management

I have a GenServer that uses an array of other GenServers to fulfill its jobs, each of them connected to a different channel for sending data. The first GenServer acts as a dispatcher: when a call is invoked on it, one of the GenServers in the array is chosen and a call is performed on it.

I would like the dispatcher GenServer to be asynchronous: I don’t want it to block while the data is sent to one of the GenServers in the array. But I would also like to check for errors, so using GenServer.cast is not an option.

What options do I have? Does it make sense to use spawn and GenServer.call for each incoming request coming to the dispatcher?

Does it make sense to use spawn and GenServer.call for each incoming request coming to the dispatcher?

Erlang’s rpc module implements asynchronous calls exactly like this: see otp/lib/kernel/src/rpc.erl at commit 613b5f890a5bc13aaf64cc31b535262a40eba721 in the erlang/otp repository on GitHub.

So yes, I think it’s the way to go.

That doesn’t make the dispatcher asynchronous - and this tactic is largely a defensive move by a client process forced to make a synchronous call when it doesn’t want to be blocked.

But even though it avoids being blocked, the client process still has to:

  • eventually process the result from the terminating proxy-process
  • monitor that proxy-process in case it terminates before sending the result
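
As a minimal sketch of that client-side "spawn to call" tactic (not taken from the original question; SlowServer, ProxyCall, async_call/2 and await/2 are hypothetical names used only for illustration), the client can start a monitored proxy process that performs the blocking call and sends the result back:

```elixir
defmodule SlowServer do
  # Hypothetical server that only offers a synchronous API.
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  @impl true
  def init(:ok), do: {:ok, nil}

  @impl true
  def handle_call({:send, data}, _from, state) do
    Process.sleep(50)
    {:reply, {:ok, data}, state}
  end
end

defmodule ProxyCall do
  # Starts a monitored proxy process that makes the blocking call and
  # forwards the result to the caller, tagged with a correlation reference.
  def async_call(server, request) do
    caller = self()
    ref = make_ref()

    {_pid, mon} =
      spawn_monitor(fn ->
        send(caller, {ref, GenServer.call(server, request)})
      end)

    {ref, mon}
  end

  # Waits for either the result or the proxy's :DOWN message.
  def await({ref, mon}, timeout \\ 5_000) do
    receive do
      {^ref, result} ->
        # Purge the :DOWN message left behind by the proxy's normal exit.
        Process.demonitor(mon, [:flush])
        {:ok, result}

      {:DOWN, ^mon, :process, _pid, reason} ->
        {:error, reason}
    after
      timeout ->
        Process.demonitor(mon, [:flush])
        {:error, :timeout}
    end
  end
end
```

Between async_call/2 and await/2 the client is free to do other work, but it still owns both chores listed above: collecting the result and watching the proxy.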

Given that you are implementing the dispatcher you might as well cut out the “proxy-process”:

  • Have the client process cast/2 the request to the dispatcher (include a reference from Kernel.make_ref/0 in the request that the dispatcher can return in the response information to make it easier to correlate a response to a request).
  • Have the client process Process.monitor/1 the dispatcher.
  • If the client receives a :DOWN message (via handle_info/2) with the dispatcher-monitor-reference and dispatcher PID, inspect the reason:
      • :noproc - there is no process with pid
      • :noconnection - can’t reach specified node
      • :normal - no worries, sent result is still in the queue (provided a result was sent)
      • other non-normal exit reasons
  • When the dispatcher has completed the request have it cast/2 the response information (including the correlation reference) back to the client process.
  • When the client receives the response information from the dispatcher, immediately Process.demonitor/2 the dispatcher-monitor-reference (specifying [:flush] for opts to purge the :DOWN message that could still be in the message queue) and then process the response information.
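
The steps above can be sketched roughly as follows. This is an illustration under assumptions, not the actual dispatcher from the question: the module names, the message shapes ({:request, client, ref, data}, {:response, ref, result}) and the notify PID used to report outcomes are all hypothetical, and the dispatcher's real work is replaced by a placeholder.

```elixir
defmodule Dispatcher do
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  @impl true
  def init(:ok), do: {:ok, nil}

  @impl true
  def handle_cast({:request, client, ref, data}, state) do
    # Placeholder for handing the data to one of the channel GenServers.
    result = {:ok, data}
    # Cast the response back, including the correlation reference.
    GenServer.cast(client, {:response, ref, result})
    {:noreply, state}
  end
end

defmodule Client do
  use GenServer

  def start_link(dispatcher), do: GenServer.start_link(__MODULE__, dispatcher)

  # `notify` is a hypothetical PID that is told about the outcome.
  def request(client, data, notify), do: GenServer.cast(client, {:send, data, notify})

  @impl true
  def init(dispatcher), do: {:ok, %{dispatcher: dispatcher, pending: %{}}}

  @impl true
  def handle_cast({:send, data, notify}, state) do
    ref = make_ref()
    mon = Process.monitor(state.dispatcher)
    GenServer.cast(state.dispatcher, {:request, self(), ref, data})
    {:noreply, %{state | pending: Map.put(state.pending, ref, {mon, notify})}}
  end

  def handle_cast({:response, ref, result}, state) do
    case Map.pop(state.pending, ref) do
      {{mon, notify}, pending} ->
        # Drop the monitor and purge any :DOWN message already queued.
        Process.demonitor(mon, [:flush])
        send(notify, {:done, result})
        {:noreply, %{state | pending: pending}}

      {nil, _pending} ->
        # Unknown or already handled reference: ignore.
        {:noreply, state}
    end
  end

  @impl true
  def handle_info({:DOWN, mon, :process, _pid, reason}, state) do
    # The dispatcher went away before answering the request tied to `mon`.
    {failed, pending} =
      Enum.split_with(state.pending, fn {_ref, {m, _notify}} -> m == mon end)

    for {_ref, {_m, notify}} <- failed, do: send(notify, {:failed, reason})
    {:noreply, %{state | pending: Map.new(pending)}}
  end
end
```

In this sketch each request gets its own monitor so that a :DOWN message maps back to exactly one pending request; a single long-lived monitor on the dispatcher would also work if the client tracks its pending references separately.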

See also: The need for monitoring

That doesn’t make the dispatcher asynchronous

Could you explain to me why?

Thanks for your detailed answer.

The intent of call/3 is to implement a synchronous protocol on top of a natively asynchronous one.

call boils down to the client sending a request and then blocking until a reply for that exact request arrives - ignoring any other messages that may arrive in the meantime.
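
To make that concrete, here is a simplified model of such a call built from plain processes. It is not the actual GenServer implementation; ToyCall, the {:call, from, request} message shape and echo_loop/0 are made up for illustration. The point is the selective receive that matches only on the reference of this exact request:

```elixir
defmodule ToyCall do
  # Sends a request, then blocks until the reply tagged with `ref` arrives.
  def call(server, request, timeout \\ 5_000) do
    ref = Process.monitor(server)
    send(server, {:call, {self(), ref}, request})

    receive do
      # Only the reply carrying this exact reference is accepted;
      # unrelated messages stay in the mailbox untouched.
      {^ref, reply} ->
        Process.demonitor(ref, [:flush])
        reply

      {:DOWN, ^ref, :process, _pid, reason} ->
        exit(reason)
    after
      timeout ->
        Process.demonitor(ref, [:flush])
        exit(:timeout)
    end
  end

  # Toy server loop that answers every request with {:echo, request}.
  def echo_loop do
    receive do
      {:call, {from, ref}, request} ->
        send(from, {ref, {:echo, request}})
        echo_loop()
    end
  end
end
```

A message that was already sitting in the caller's mailbox before the call is still there afterwards, which is the "ignoring any other messages" part.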

Now typically the server processes the request immediately in handle_call/3 and responds with a :reply.

So the intent behind call/handle_call is to implement a synchronous protocol.

Now the server has some wiggle room, as it can finish handle_call/3 with :noreply to delay emitting a reply until later with GenServer.reply/2, but that doesn’t change the synchronous client experience (the server might do this to stop the same client from sending additional requests).
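
A small sketch of that deferred reply, assuming a hypothetical DeferredServer whose "work" is simulated with a timer message:

```elixir
defmodule DeferredServer do
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  @impl true
  def init(:ok), do: {:ok, nil}

  @impl true
  def handle_call({:work, data}, from, state) do
    # Return without replying; keep `from` so the reply can be sent later.
    Process.send_after(self(), {:finish, from, data}, 20)
    {:noreply, state}
  end

  @impl true
  def handle_info({:finish, from, data}, state) do
    # The delayed reply unblocks the caller waiting in GenServer.call/3.
    GenServer.reply(from, {:ok, data})
    {:noreply, state}
  end
end
```

The server is free to handle other messages in the meantime, yet the caller of GenServer.call(server, {:work, data}) stays blocked until GenServer.reply/2 runs or the call times out.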

The “spawn to call” tactic is used as a countermeasure by a client process which doesn’t wish to have the “synchronous experience” that call enforces.

Using “spawn to call” within the “service” (rather than on the client-side) may give the client process the illusion of asynchronous communication (while still leaving the burden of setting up the monitor and dealing with monitor messages) - but call still implements a synchronous protocol; for what purpose?

(My guess is that the dispatcher core has been designed/written in terms of handle_call, which means that the dispatcher implementation is in fact synchronous. Such an implementation could quite possibly, though not necessarily, be blocked while processing a single request rather than servicing other requests asynchronously.)

If the dispatcher is asynchronous and the clients need to deal with it asynchronously then simply have the dispatcher serve/implement cast/handle_cast instead (while the dispatcher casts its reply back to the client) to fully reveal the dispatcher’s asynchronous nature.