GenServer Call timeout when connected in circle


I want to implement something like this. Lets say I have 10 genserver nodes connected in the circle.
How are they connected ? Every genserver contains “PID” of its successor genserver.
Every genserver contains in the state some random “KEYS” as well. Lets say I start the entire system.
Now all the 10 genservers are up and running. Now I select one genserver and ask for the key using “” [I do this from iex shell] . The current genserver can contain the key or not. If it does not contain the key then the GenServer makes a call on the successor genserver using “”. This is how the cycle goes. Once the keys is found the answer is returned.

But my leads to a timeout. Let’s say I increase the timeout time for 10 nodes. But what if my system contains “1000000” nodes.

How to solve this problem ? I am very new to elixir.

Welcome to Elixir! I’m glad to see you’re jumping in and looking at GenServers. Unfortunately, the way you’re using them here has a few issues, one of which you’re running into now (another is that you’re copying the data N times).

Can you describe at a high level what you want to achieve? It sort of sounds like you want a sharded Key / Value store, is that right?


Yes, I want to create a distributed key/value store. The keys are distributed among ‘N’ genservers.

Is this for your learning purposes or is this for a production situation? If you’re trying to build something for production I’d just use Cachex, it has a distributed mode.

If this is for learning purposes, then let’s dig into what’s going on here a bit. First, can you provide some code? While daisy chaining GenServers isn’t a great approach, it still shouldn’t timeout with just 10. I wonder if you’re creating a deadlock somewhere.


This is a big point here, you don’t want to daisy-chain access, you’d want them each to manage a ‘shard’ of the total key space (like a hash of it is common) and name them based on the shard they manage, so then you can just directly query the one genserver that would have the data without needing to touch any of the others.


No, I am doing it for learning purposes only.

Yes, it seems to work for 10 gen_servers, but what if I have got 10M servers and every server has got huge ‘key/value’ table. That is when it is a problem.


Ah yes, that’d be a huge problem. This is why the daisy chain approach doesn’t work. As noted by @OvermindDL1 what you want is to have a hashing function that tells you which GenServer to ping, and then you ping just 1 genserver.

Here’s another issue with the dailsy chain approach. If it takes N calls to get the key, that’s N genservers that are all blocked waiting for a response. If multiple clients are trying to get keys, they’re going to block each other alot, which kills the point of using multiple GenServers in the first place instead of just 1.


What you are saying is totally right.

But my main question is “Is it possible to make genserver “calls” from one genserver to another genserver?”

Aren’t you already doing that?

Yes, I am doing it. But I am getting a timeout. If my implementation is right and assuming there is no deadlock, it should work right ?

If you’re getting a timeout, what did you mean by:

Yes, it seems to work for 10 gen_servers

Can you provide code? We can’t offer detailed help without it. If you aren’t building things with a deadlock, and if you aren’t doing something particularly slow in each GenServer, then it should not timeout with just 10 genservers.

1 Like

I will try to create a dummy project and will share the repo with you in some time.

I truly appreciate your help. Thank you so much for bearing with me.

1 Like

Well a quick dummy project that passes a value around and gets something that has it might be like:

defmodule Blah do
  use GenServer

  def start_link({range, next}), do: GenServer.start_link(__MODULE__, {range, next}, [])

  def set(pid, key, value), do: GenServer.cast(pid, {:set, key, value})

  def get(pid, key), do:, {:get, key})
  def init({range, next}), do: {:ok, {%{}, range, next}}

  def handle_cast({:set, key, value}, {values, myrange..fullrange = range, next})
  when rem(key, fullrange) == myrange do
    values = Map.put(values, key, value)
    {:noreply, {values, range, next}}
  def handle_cast({:set, key, value}, {_, _, next} = state) do
    set(next, key, value)
    {:noreply, state}
  def handle_call({:get, key}, _from, {values, myrange..fullrange = range, _next} = state)
  when rem(key, fullrange) == myrange do
    value = Map.get(values, key)
    {:reply, value, state}
  def handle_call({:get, key}, _from, {_, _, next} = state) do
    value = get(next, key)
    {:reply, value, state}

nop = spawn(fn -> receive do msg -> IO.inspect(msg, label: UnhandledHash) end end)
pid = Enum.reduce(9..0, nop, fn i, next -> Blah.start_link({i..10, next})|>elem(1) end)

Blah.get(pid, 42)
Blah.set(pid, 42, :boop)
Blah.get(pid, 42)

If every server has a huge key/value table then you want to fail fast, or make the call as parallel as possible.

One idea for failing fast is to have a range index, when the server receives the request it checks if it might have the key. If not it passes it on to the next server. If yes then does the lookup. This implies some ordering of the keys though.

One idea for parallelizing the request, when the server receives the request it spawns a Task which asks the next server in the daisy-chain. It then proceeds to check its own cache using another Task. Should it find it, it informs the next server to stop the search.

All in all, using a daisy-chain is O(n) and will get slower the more servers you add. This is unavoidable.

1 Like

How large is the data structure you are searching. The default time out for a is 5s. As long as your 10 servers take longer than 0.5s for the search each the initial will timeout.


I won’t go into a discussion about how to structure a service like this as others have already done that.

You had a few other questions as well:

  • Any process can make a, the only requirement is that the process you call is a GenServer[*]
  • The is synchronous and has a default timeout of 5 sec. You can set the timeout with a third argument which is the timeout in msec. You can also give it the value :infinity in which case the caller will wait forever.

You have to be careful with synchronous communication as it makes it very easy to hang the system.

[*] Or to be precise obeys the GenServer API. You can write your own if you wish, it is
an interesting exercise.


Hi @benwilson512 @tty @OvermindDL1 , Thank you so much for helping. It means a lot.

I just checked my code. After hours of debugging, I found out that I am actually creating a deadlock. I cant share the entire code here because I have to submit my project. My project was to implement a chord protocol. My genservers are arranged in the circle, and all of GenServers are connected to not only their neighbours but also other GenServers according to some specifications ( I will not bore you by getting into the nitty gritties of the algorithm).

That is why when I use “” , then the deadlock occurs and the all the servers hung up.

I think the best way to acheive this is to use Task.async, in order to query different GenServers accordingly as suggested by @tty

1 Like

Yes, that is why is happening to me. Please have a look here - GenServer Call timeout when connected in circle

This is the actual problem I am facing.

Thank you for the reply. It means a lot.

1 Like