I have a process structure like this:
SystemManager (GenServer)
  RegistrationsSupervisor
    children: [Registration (GenServer), Registration (GenServer) ... ]
SystemManager is a top-level worker that starts a new registration whenever a user visits the registration page. Each registration is supervised and holds the user's progress through the registration (it's a wizard).
This setup works pretty well so far; however, I get random test failures. Usually it's all green, but occasionally I get:
1) test should initialize a registration by spawning a new registration and return a token (HC.Core.SystemManagerTest)
test/core/system_manager_test.exs:12
** (EXIT from #PID<0.222.0>) an exception was raised:
** (MatchError) no match of right hand side value: {:error, {:already_started, #PID<0.217.0>}}
(core) lib/core/registrations_supervisor.ex:9: HC.Core.RegistrationsSupervisor.start_link/0
(core) lib/core/system_manager.ex:17: HC.Core.SystemManager.init/1
(stdlib) gen_server.erl:328: :gen_server.init_it/6
(stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3
I don't start the application when running ExUnit at all, i.e. I added this to my mix.exs:
aliases: [test: "test --no-start"]
This prevents the :core application from booting. I start the SystemManager manually in the setup block of the test suite:
{:ok, _pid} = SystemManager.start_link()
SystemManager starts RegistrationsSupervisor in its init callback.
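To make the structure concrete, here is a minimal sketch of the two processes. It is not my actual code: the Sketch.* module names are hypothetical, and I am assuming the supervisor is a DynamicSupervisor registered under its module name, which is what the :already_started error suggests.

```elixir
defmodule Sketch.RegistrationsSupervisor do
  use DynamicSupervisor

  # Registered under the module name, so only one instance can exist at a time.
  def start_link do
    # Assumed shape: a strict match here fails with a MatchError when a
    # previous instance is still registered, as in the stack trace above.
    {:ok, pid} = DynamicSupervisor.start_link(__MODULE__, :ok, name: __MODULE__)
    {:ok, pid}
  end

  @impl true
  def init(:ok), do: DynamicSupervisor.init(strategy: :one_for_one)
end

defmodule Sketch.SystemManager do
  use GenServer

  def start_link, do: GenServer.start_link(__MODULE__, :ok)

  @impl true
  def init(:ok) do
    # The supervisor is started from init, so it is linked to this GenServer.
    {:ok, sup} = Sketch.RegistrationsSupervisor.start_link()
    {:ok, %{supervisor: sup}}
  end
end
```

Starting a second Sketch.SystemManager while the first supervisor is still registered makes init fail, which is the failure I see between tests.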
Okay, so there is a race condition in this setup. I think the problem is that whenever a test case process terminates, it brings down all linked processes, including my SystemManager. I want that, it's cool. The same thing then happens one level down: when SystemManager terminates, it brings down all its linked processes, including RegistrationsSupervisor. I want that too, since I want to clean up the whole process tree I started in setup.
But the whole killing-children thing (sic!) happens in parallel, i.e. it's asynchronous. The next test already starts and tries to start SystemManager, which tries to start its own instance of RegistrationsSupervisor from its init, and that can happen before the cleanup from the previous test case has finished.
I am currently dealing with this by using a :timer.sleep(10) in my setup blocks. But maybe there is a way in Erlang/Elixir to wait for a process and all its children to be fully terminated instead?
If not, any tips on how people test such scenarios?
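For reference, the kind of wait I have in mind would look roughly like the sketch below. It monitors the registered process and blocks until the :DOWN message arrives. Sketch.TestHelpers and await_down are hypothetical names, and the sketch assumes the supervisor is registered under a known name. As far as I understand, a supervisor shuts down its children before it exits itself, so waiting on the supervisor should cover the Registration processes too.

```elixir
defmodule Sketch.TestHelpers do
  # Blocks until the process registered under `name` is down.
  # Returns :ok immediately if nothing is registered; raises after `timeout` ms.
  def await_down(name, timeout \\ 1_000) do
    case Process.whereis(name) do
      nil ->
        :ok

      pid ->
        # If the process dies between whereis and monitor, the monitor
        # still delivers a :DOWN message (with reason :noproc).
        ref = Process.monitor(pid)

        receive do
          {:DOWN, ^ref, :process, ^pid, _reason} -> :ok
        after
          timeout -> raise "#{inspect(name)} still alive after #{timeout} ms"
        end
    end
  end
end
```

In a setup block this would replace the sleep, e.g. calling Sketch.TestHelpers.await_down(HC.Core.RegistrationsSupervisor) before SystemManager.start_link(), assuming that is the name the supervisor registers under.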