GenServer docs: "handle_cast ... should be used sparingly" Why?

During my recent query with regard to “Functional Web Development with Elixir, OTP, and Phoenix”, Lance Halvorson kindly directed my attention toward the GenServer Documentation:

  1. handle_call/3 must be used for synchronous requests. This should be the default choice as waiting for the server reply is a useful backpressure mechanism.
  2. handle_cast/2 must be used for asynchronous requests, when you don’t care about a reply. A cast does not even guarantee the server has received the message and, for this reason, should be used sparingly.

Can somebody please direct me towards some material that may explain how we arrived at these particular recommendations?

My current puzzlement is based on the observation that many modern architectures rely less and less on “synchronous” operations and are moving more into a pipelined model where {event,current_state} enters the processing pipeline and new_state pops out of the other end - and perhaps more importantly that the provider of event isn’t necessarily designed to be the consumer of new_state. As simple examples I would point to React’s Flux, The Elm Architecture, and perhaps to some extent ReactiveX (when done correctly).

In essence I would have thought that waiting for an “unnecessary response” adds an unnecessary interaction dependency that unnecessarily reduces concurrency - realizing this would ultimately lead to a design style where one would strive to make responses unnecessary wherever possible - at which point cast/2 / handle_cast/2 would become the de facto default interaction (rather than call/3 / handle_call/3).

“Designing for Scalability with Erlang/OTP; Chapter 4: Generic Servers - Message Passing - Asynchronous Message Passing” p.84

In some applications, client functions return a hardcoded value, often the atom ok, relying on side effects executed in the callback module. Such functions could be implemented as asynchronous calls.

On the topic of backpressure, “Designing for Scalability with Erlang/OTP” talks about backpressure/load regulation frameworks like jobs and Safetyvalve. There really isn’t any indication that trying to manage backpressure at the granularity of a single message exchange is a good idea.

“Designing for Scalability with Erlang/OTP; Chapter 15: Scaling Out - Load Regulation and Backpressure” p.421

Start controlling load only if you have to. When deploying a website for your local flower shop, what is the risk of everyone in town flocking to buy flowers simultaneously? If, however, you are deploying a game back end that has to scale to millions of users, load regulation and backpressure are a must.

That being said forcing call/3 based requests can be a legitimate tactic to prevent individual client processes from overwhelming a server with requests as described in Building Non Blocking Erlang Apps. Essentially a GenServer isn’t actually obliged to immediately return a reply in handle_call/3 but can choose to answer the request later with reply/2 - i.e. the server can keep the client blocked while not being blocked itself.
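A minimal sketch of the reply-later idea, in case it helps. The module name DeferredReply and its functions are made up for illustration and are not taken from the linked article; the only real API relied on is GenServer.reply/2, which may be called from any process. The server hands the slow work to a helper process and returns {:noreply, state}, so the caller stays blocked while the server keeps answering other requests.

```elixir
defmodule DeferredReply do
  use GenServer

  # Hypothetical client API, for illustration only.
  def start_link(_opts \\ []), do: GenServer.start_link(__MODULE__, %{})
  def slow_double(server, n), do: GenServer.call(server, {:slow_double, n})
  def ping(server), do: GenServer.call(server, :ping)

  @impl true
  def init(state), do: {:ok, state}

  @impl true
  def handle_call({:slow_double, n}, from, state) do
    # Hand the slow work to a helper process so the server is not blocked.
    spawn_link(fn ->
      Process.sleep(50)
      # The helper answers the original caller on the server's behalf.
      GenServer.reply(from, n * 2)
    end)

    # No reply yet: the caller stays blocked until the helper replies.
    {:noreply, state}
  end

  def handle_call(:ping, _from, state), do: {:reply, :pong, state}
end

{:ok, pid} = DeferredReply.start_link()

# One client blocks on the slow request...
task = Task.async(fn -> DeferredReply.slow_double(pid, 21) end)

# ...while the server still answers another request right away.
pong = DeferredReply.ping(pid)
doubled = Task.await(task)
```

The trade-off is that the server now has to think about what happens if the helper crashes before replying; here spawn_link simply takes the server down with it.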


Most of these recommendations/guidelines come from the fact that people usually don’t understand how a GenServer works, or even the consequences of concurrency. The first thing you must keep in mind is that a GenServer is a single process (an Erlang VM process); it exists to reduce the boilerplate code needed to handle processes. That being said, it doesn’t matter whether you use cast or call: on the other side there is only one process to handle all those messages.

Can somebody please direct me towards some material that may explain how we arrived at these particular recommendations?

This comes from the notion people have about async and how they think it works. Unless you have a particularly crafted GenServer, as I said above, no matter whether you use cast or call there will be only one process handling the messages sent to it. The point of recommending call is to block the caller and so reduce the number of requests in flight against that specific GenServer; otherwise you will be calling cast throughout your code on the same process, and that GenServer will become a bottleneck. So keep in mind the consequences of using cast.
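To make the single-process point concrete, here is a small sketch; the Sequential module is invented for this example. Each cast returns :ok immediately, but the messages still line up in one mailbox and are handled one at a time, so the call sent afterwards by the same process only gets its answer once every earlier cast has been processed.

```elixir
defmodule Sequential do
  use GenServer

  # Hypothetical server that records items in arrival order.
  def start_link, do: GenServer.start_link(__MODULE__, [])

  @impl true
  def init(log), do: {:ok, log}

  @impl true
  def handle_cast({:record, item}, log) do
    # Simulated work: the next message waits until this one is done.
    Process.sleep(5)
    {:noreply, [item | log]}
  end

  @impl true
  def handle_call(:log, _from, log), do: {:reply, Enum.reverse(log), log}
end

{:ok, pid} = Sequential.start_link()

# Every cast returns right away; nothing tells us when it is processed.
cast_results = for i <- 1..5, do: GenServer.cast(pid, {:record, i})

# The call queues behind the five casts sent by this same process.
log = GenServer.call(pid, :log)
```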

About the other things you argue about, most of them are covered by GenStage.


I suspected as much - but recommendations like this, at least in my experience, can easily turn into “dogma” (i.e. "handle_cast/2 is bad") that is all too readily wielded by the expert-beginners out there.

While novices want hard-and-fast rules, I think they need to face the fact that the call vs cast debate is simply an “it depends” issue. My current thinking is along the lines of:

  • Can I get away with just a cast?
  • If I think I need a call - can I reorganize the interaction patterns so that I can just use a cast? i.e. forward the “result” of the cast to an independent (concurrent, third) process to act on it?
  • Use a call if the previous two options result in “unreasonable” concessions.

all the while remaining aware of any bottlenecks in the system where tons of messages could be piling up in a mailbox somewhere. I feel that this thinking is quite far removed from "handle_call/3 should be the default choice".
In the end a successful developer needs to consciously realize that the code inside the process is utterly sequential (which makes it easy to reason about) while all the concurrency is happening between processes.

call can prevent a single process from dumping a million messages in your mailbox - it cannot prevent a million processes from putting a single message each in your mailbox.

Reminds me of CSP - certainly seems like the way to go once it has been established that backpressure is necessary.

Thank You for your feedback!


I guess you’re considering the Getting Started guide as documentation. The actual documentation for GenServer is this one:

The docs take an approach more like “it’s your responsibility to decide which to use” and even guide you to where you can learn more. The Getting Started guide expects someone learning Elixir/Erlang/OTP, and that’s the whole point of a Getting Started guide.

For sure, as I said, the guides are for people who are learning. That’s why it says to approach cast with caution. I agree that it could be clearer, but I think a person planning heavy use of GenServer will look at the docs and read more about it.

I believe that the second sentence of the second point you quoted provides the most important reason:

A cast does not even guarantee the server has received the message and, for this reason, should be used sparingly.

I usually advise that if you’re unsure which one you need, a call is likely a better default. Cast can hurt you in some strange ways. A message can become lost (and it’s completely unclear that it happened), or if the message queue builds up, the performance of the entire system might suffer, or you might run out of memory.

Call is more explicit here, because the client gets feedback, meaning you can find out whether the server has processed a message or crashed in the process. Moreover, a call bottleneck is less likely to affect the entire system (though it’s still possible of course).

Both are definitely valid options, and there are definitely cases where cast is a better choice. IIRC, I’ve even occasionally used 2-way casts in place of a call, but I’d have to search my memory for the exact reasons why I did it.


The current version of the documentation is actually rather noncommittal - it simply presents both options without any judgement or endorsement. However, one potential disadvantage of cast/2 is this:

This function always returns :ok regardless of whether the destination server (or node) exists

(I guess this is true for pid - typically with name there is a failure).

So if receipt of the message is important, call has to be used anyway, because the reply is necessary for acknowledgement (and, alternatively, the absence of a reply, i.e. a timeout, is what triggers failure handling).
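A quick way to see the difference for a pid, as a sketch you can paste into iex or a script: the target process is already dead, cast still returns :ok, while call exits with a :noproc reason. The variable names are just for illustration.

```elixir
# Start a throwaway process and wait until it is certainly gone.
pid = spawn(fn -> :ok end)
ref = Process.monitor(pid)

receive do
  {:DOWN, ^ref, :process, ^pid, _reason} -> :ok
end

# The cast gives no hint that nobody is listening.
cast_result = GenServer.cast(pid, :anything)

# The call fails loudly; catch the exit so the script keeps running.
call_result =
  try do
    GenServer.call(pid, :anything)
  catch
    :exit, reason -> {:exit, reason}
  end
```

In real code you would normally not catch the exit; letting the caller crash is often exactly the failure handling you want.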

My main concern with handle_call/3 as the default choice, together with Interface functions, is that it is going to let many developers stay with a blocking-procedure-call mindset (Convenience Over Correctness) rather than designing servers that acknowledge their messages quickly, even if it potentially means delaying result dispatch.
I feel that there needs to be a very deliberate choice of designing interface functions as commands (“do this”) or queries (“I need this value”) - rather than pretending that interface functions are just like any other in-process function.


Somehow that description failed to evoke the picture of the destination server being entirely absent. I think I got distracted by the backpressure justification in handle_call.

Having had some time to think about it I can see a definite use for an interaction pattern like

  • call client request with an acknowledgement reply from server
  • Followup cast from server to client with the actual “result”
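A rough sketch of that two-step interaction, under some assumptions: the AckThenResult module, the :accepted atom and the {:result, ref, value} message shape are all invented here, and the follow-up is a plain send/2 to a client that is an ordinary process (a GenServer client would receive it via GenServer.cast/2 and handle_cast/2 instead).

```elixir
defmodule AckThenResult do
  use GenServer

  def start_link, do: GenServer.start_link(__MODULE__, nil)

  # Blocks only until the server acknowledges receipt. Returns a reference
  # that tags the result message the server sends later.
  def submit(server, n) do
    ref = make_ref()
    :accepted = GenServer.call(server, {:square, n, self(), ref})
    ref
  end

  @impl true
  def init(state), do: {:ok, state}

  @impl true
  def handle_call({:square, n, reply_to, ref}, from, state) do
    # Step 1: acknowledgement, the client unblocks here.
    GenServer.reply(from, :accepted)

    # Stand-in for the real work.
    Process.sleep(20)

    # Step 2: follow-up message carrying the actual result.
    send(reply_to, {:result, ref, n * n})
    {:noreply, state}
  end
end

{:ok, server} = AckThenResult.start_link()
ref = AckThenResult.submit(server, 7)

# The client is free to do other things, then picks up the result.
result =
  receive do
    {:result, ^ref, value} -> value
  after
    1_000 -> :timeout
  end
```

The reference matters: without it the client could not tell which request a late result belongs to.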

Thank You for chiming in!


You can reply early from a call if you just want a receipt of arrival.

def handle_call(:foo, from, state) do
  GenServer.reply(from, :ok)

  {:noreply, state}
end

Interesting approach.

def handle_call(:foo, from, state) do
  GenServer.reply(from, :ok) # !!! I almost missed this part !!!

  # The caller is already unblocked; the work below no longer delays it.
  new_state = do_work()

  {:noreply, new_state}
end

To me this further illustrates that one needs to be very clear on what the reply really “means” and what the reply is meant to accomplish.

  • In the simplest and most common case one is simply waiting for a result that one needs to continue - though that opens up other questions that could be worth consideration:
    A) Should I be waiting for a result or should I be re-organizing my logic to work in a GenStage kind-of-way?
    B) Do I really want to be blocked for the result?
  • Waiting for an :ok reply can mean
    a) I just want to make sure you got the message - past that if something goes wrong it’s entirely your (and your supervisor’s) problem.
    b) I really need to know that my request was processed to completion - if it wasn’t, I need to “let my supervisor know” (i.e. I need to fail).

Building Non Blocking Erlang Apps is a nice tactic for a server to remain responsive even when dealing with slow services/resources. But what if a client of such a server doesn’t want to be blocked either?

This is where I thought the “call-with-acknowledge/cast-result-back-to-client” interaction pattern might come in handy. But that would mean having to rework the server’s interface. It probably would be simpler and better for the client to spin off a proxy process (Task.start[_link]?) for the sole purpose of handling one request with that server - it doesn’t matter if the proxy process is blocked - because that’s its job. The client process can go off and do its thing, until the proxy reports back with a result or error (and terminates).

A one-off proxy process may also be better when dealing with timeouts because stale messages wouldn’t have a mailbox to go to, so the “real” client isn’t burdened with clearing out stale messages.
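Something like the following is what I have in mind, as a sketch only. SlowServer is a made-up stand-in for the real server, and I am using Task.async/1 with Task.yield/2 and Task.shutdown/1 here rather than Task.start_link, simply because they already handle reporting back to the client. On a timeout the proxy is shut down, so a late reply from the server goes to a dead process instead of the client's mailbox.

```elixir
defmodule SlowServer do
  use GenServer

  # Hypothetical server that answers after a configurable delay.
  def start_link, do: GenServer.start_link(__MODULE__, nil)

  @impl true
  def init(state), do: {:ok, state}

  @impl true
  def handle_call({:echo, value, delay_ms}, _from, state) do
    Process.sleep(delay_ms)
    {:reply, value, state}
  end
end

{:ok, server} = SlowServer.start_link()

# The proxy blocks on the call; the client does not.
proxy = Task.async(fn -> GenServer.call(server, {:echo, :hello, 10}) end)
# ... the client can do other work here ...
fast = Task.yield(proxy, 1_000) || Task.shutdown(proxy)

# Same pattern, but the client gives up long before the server answers.
slow_proxy = Task.async(fn -> GenServer.call(server, {:echo, :late, 200}) end)
slow = Task.yield(slow_proxy, 20) || Task.shutdown(slow_proxy)
```

Here fast is an {:ok, reply} tuple and slow is nil, which the client can treat as its own timeout without ever having to clean up a stale reply.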

The point being process interaction isn’t simply limited to cast and call but can be customized with process links, monitors and other (short lived) processes; so it would be a big mistake to mentally equate a call to a process to a function invocation when in fact process interactions can be handled in many and more nuanced ways.

Yes - I know it looks like I’m overthinking (over-engineering) this but

In preparing for battle I have always found that plans are useless, but planning is indispensable. Dwight D. Eisenhower

i.e. it’s important to know the full spectrum of your available options - or in my case before choosing the simplest possible approach I want to assess all available options first so that I can legitimately select the simplest option as the appropriate and best option - rather than just selecting something “by default”.


The early-reply trick depicted above, and the reply-later trick used in the linked article, just blew my mind.

I don’t know why neither ever occurred to me - I’ve known about the pieces to implement both strategies for a while.

10/10 would be amazed again


Can anyone share the linked article? Sadly can’t find it on