nikody
GenServer blocking in async task
I have something similar to the following code:
genserver_name = "cool_name"
task = Task.async(fn -> MyModule.start() end)
:timer.sleep(500)
GenServer.cast(genserver_name, {:save, first_payload})
GenServer.cast(genserver_name, {:save, second_payload})
GenServer.cast(genserver_name, {:save, third_payload})
response = Task.await(task)
MyModule.start/0 is a normal function (MyModule is not a GenServer) that starts a GenServer with the name of "cool_name" (the name iss always different, but I don’t think this is important for the problem).
Additionally, after starting the GenServer, MyModule.start/0 is blocked and waits for the GenServer to receive messages from other processes. It checks the GenServer state every second with a GenServer.call call and if it has the expected data, it unblocks and continues with processing it.
It normally receives the messages that update its state as expected in a few minutes, but I wanted to write a test for it, so I used the aforementioned code.
What I find strange is that if I remove the :timer.sleep/1 in the code above, the Task.await/1 times out and the GenServer.cast calls don’t get called at all and the GenServer state is always empty, the MyModule.start/0 function checks it every second with a GenServer.call and it is always empty. If I add the :timer.sleep/1, it is empty, for instance, only on the first check, then the casts run, the state gets populated, and everything is fine, the Task.await/1 does not raise an exception.
I wonder what is the reason for this blocking without the :timer.sleep/1 and why the addition of the timer solves it. I’m guessing it has something to do with the GenServer.call that checks the state every second, but I don’t think I can find the explanation with my current knowledge - it is called once every second and immediately returns the state (which in this case is just an empty map).
I am running this in an ExUnit test, but I think this is also not related to this specific issue.
Marked As Solved
jtsummers
Because you’re spawning a task that then spawns the server you’re hitting a race condition. The named server doesn’t exist yet. Rather than using a timer (which either wastes time or cuts it too close and has a chance of failure), you can try GenServer.whereis. If it returns nil then the named server hasn’t been started yet. You could also have the spawned task send a message back to this process signaling that it has started the server, then use a receive in this process to wait before casting. That last one is what I would probably do.
Also Liked
jtsummers
ityonemo
One question that left me scratching my head: How were you able to get a GenServer named with a binary? Typically names are only allowed to be atoms, {:global, term}, or {:via, module, term}.
I feel like there’s something unidiomatic going on under the hood which is causing these issues. If you don’t mind a few suggestions:
- Unless you have a really really really good reason to, you should typically be launching a GenServer from a function with the
startorstart_linkof its own module. You might not do it if you have some sort of abstraction generator (I really don’t want to say “factory”, because, yuck). - Whether or not it’s coming from its own module, Conventionally a
startorstart_linkfunction should probably output something that looks like{:ok, pid}, which you can then use to do yourGenServer.cast/calls - Generally my rule of thumb is “you shouldn’t be using
GenServer.castunless you really know what you’re doing”. as you see herecasts will silently fail; ifcalltimes out or drops an error, that might be saying something about your plans. When you deploy to prod, a failing call will be quite noisy and so you can track those and fix your code (or not, if it’s rare/innocuous/expected enough not be worth the effort) - GenServers are fundamentally an async- sort of thing, so wrapping a spawning a GenServer inside a
Task.asyncin a test seems like a red flag; the test should probably be launching the GenServer itself. The pattern gives you a lot of power;startandstart_linkblock on the completion ofinit, and should contain bootstrapping bits that live in the calling process; andinitcontains the bootstrapping bits that live in the child process. If you need to unblock the caller early before bootstrapping is done, and some bootstrapping things on the child process’ time, you can issue a:continuedirective. There’s also tools likeConnectionlibrary for other special cases.
Not to be too prescriptive, but It seems like you should be able to refactor your modules to conform to these parameters.
nikody
@lud @ityonemo Thank you for your responses!
@lud yes, you are right, but it knows the correct name. In the actual use case, this module receives various parameters and generates the GenServer name based on them. I just wanted to simplify the example and didn’t specify this, I’m sorry about that. Then, the other processes that update the GenServer state (which are actually controller actions), also receive a parameter that lets them know the name of the GenServer.
@ityonemo your observations are correct. In simplifying the code for my question, I made a few errors. We use {:global, "different_id"} for the GenServer name, and the GenServer itself is started with ServerName.start_link({:global, "different_id"} and in start_link/1 it has GenServer.start_link(__MODULE__, %{}, [name: name]). I hope this answers the first half of your response.
About the call and cast - I think you have a point. It would be safer with call, there is no concrete reason to use cast I think, although so far we have had no problems with it. I did not originally write this functionality, but I am trying to refactor it at the moment, and this suggestion is good, thank you!
As for the test itself, well, the case itself is a bit strange. I want to the test the module that starts the GenServer, not the GenServer itself. So in the test I call the module and it starts the GenServer, but as I pass the parameters to the module, I know the GenServer name.
Normally, this module waits for HTTP requests to an endpoint in the same application - the GenServer.cast calls are made in the controller once per request. Once all requests have been received, the GenServer’s state is complete and the other module unblocks and continues processing these responses. With these casts in the test I try to simulate that. Actually, it might be better if I mock the requests themselves and call the controller in the test, but I don’t find it necessary as I’m not aiming for an integration test, just a test for one specific module.
I know the blocking part while waiting for the GenServer state to get update from elsewhere might seem strange and has a downside, I will probably refactor this too at some point, but it will requite a lot of changes to other parts of the application too, that is why for the time being I want to test the current functionality.
In addition, @jtsummers, thank you fro the diagram and the suggestions!







