lessless
How to stop OTP processes started in ExUnit setup callback?
Hello,
The module under test depends on three OTP processes, so they are started in the test setup callback:
setup do
  accounts = TestAccounts.accounts()
  {:ok, scheduler} = Enum.map(accounts, &Map.get(&1, :name)) |> Scheduler.start_link() # GenStage
  {:ok, acc_supervisor} = AccountsSupervisor.start_link() # Supervisor
  {:ok, provisor} = Provisor.start_link() # GenServer
  {:ok, accounts: accounts}
end
I thought that they would be killed automatically once each test case completes, but it looks like that is not the case: once every 3-4 runs, a wild ** (MatchError) no match of right hand side value: {:error, {:already_started, #PID<0.2161.0>}} error appears.
I managed to catch it both for Scheduler and for AccountsSupervisor.
The application supervision tree is:
workers = [
  supervisor(Registry, [:unique, Postman.Registry]),
  supervisor(AccountsSupervisor, []),
  worker(Provisor, []),
  worker(Scheduler, [Enum.map(accounts, &Map.get(&1, :name))])
]
My first idea (confirmed by googling) was to stop those processes in an on_exit callback:
setup do
  accounts = TestAccounts.accounts()
  {:ok, scheduler} = Enum.map(accounts, &Map.get(&1, :name)) |> Scheduler.start_link() # GenStage
  {:ok, acc_supervisor} = AccountsSupervisor.start_link() # Supervisor
  {:ok, provisor} = Provisor.start_link() # GenServer
  on_exit fn ->
    Supervisor.stop(acc_supervisor)
    GenServer.stop(provisor)
    GenStage.stop(scheduler)
  end
  {:ok, accounts: accounts}
end
That led to a whole new bunch of other errors/complaints:
- Supervisor.stop(acc_supervisor) produces
** (exit) exited in: :sys.terminate(#PID<0.572.0>, :normal, :infinity)
** (EXIT) shutdown
I think this is just a notification message, but I would really like to avoid capturing errors in every test where a Supervisor has to be stopped.
- GenServer.stop(provisor) produces
** (exit) no process: the process is not alive or there's no process currently associated with the given name, possibly because its application isn't started
- GenStage.stop(scheduler) produces
** (exit) no process: the process is not alive or there's no process currently associated with the given name, possibly because its application isn't started
Here you can see some clear contradictions (race conditions): sometimes the processes are still running, sometimes they are not.
Lastly, I wrote a function to work around that problem; it kills a process only if it is alive:
def kill_if_alive(pid) do
  case Process.alive?(pid) do
    true -> Process.exit(pid, :kill)
    _ -> :ok
  end
end
After that, an even stranger race condition in one of the tests started to appear.
test "start all accounts", ctx do
  assert Supervisor.which_children(AccountsSupervisor) |> length() == 0
  assert Provisor.start_all_accounts(ctx.accounts) |> length() == length(ctx.accounts)
  assert Supervisor.which_children(AccountsSupervisor) |> length() == length(ctx.accounts)
end
Assertion with == failed
code: Supervisor.which_children(AccountsSupervisor) |> length() == length(ctx.accounts())
left: 1
right: 2
stacktrace:
test/processor/provisor_test.exs:26: (test)
Provisor.start_all_accounts spawns a bunch of supervisors under AccountsSupervisor, and thus they should be stopped together with AccountsSupervisor.
This situation is utterly confusing and I hope somebody can clarify what’s going on and how to properly stop those processes.
Most Liked
josevalim
There is no need for a mini-project.
This is how ExUnit works.
@lessless the processes you start in setup are linked to the test process. This means that, when the test finishes, those processes will asynchronously terminate since the link between those processes and the test process is broken.
That’s why you have races: there is no guarantee those linked processes will terminate before the next test starts. Also, because on_exit runs after the test process exits, the linked processes may still be running or may have already died, which is why Supervisor.stop and friends may or may not fail.
Overall, it is the same race condition. The processes you spawn may or may not have exited by the time you run on_exit or the next test starts.
That said, all you need to guarantee is that those processes are DOWN in the on_exit callback, making sure you have a clean slate for the next test run. Since Process.monitor/1 won’t fail if you give it a dead process, it fits the bill perfectly. You should add this function to your codebase:
defp assert_down(pid) do
  ref = Process.monitor(pid)
  assert_receive {:DOWN, ^ref, _, _, _}
end
And call it for every named process to have a beautifully green test suite.
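For illustration, here is a minimal sketch of how assert_down might be wired into on_exit. The named Agent and the name :sketch_named_process are hypothetical stand-ins for Scheduler, AccountsSupervisor, and Provisor, which are not shown in full in this thread; the only point is that both tests can register the same name without hitting :already_started.

```elixir
ExUnit.start()

defmodule AssertDownSketchTest do
  use ExUnit.Case

  setup do
    # Hypothetical stand-in for a named OTP process such as Scheduler.
    # It is linked to the test process, so it goes down after the test exits.
    {:ok, pid} = Agent.start_link(fn -> 0 end, name: :sketch_named_process)

    # on_exit runs after the test process has exited; block here until the
    # linked process is really gone, so the next test can reuse the name.
    on_exit(fn -> assert_down(pid) end)

    :ok
  end

  test "first test sees a fresh process" do
    assert Agent.get(:sketch_named_process, & &1) == 0
  end

  test "second test can register the same name again" do
    assert Agent.get(:sketch_named_process, & &1) == 0
  end

  # Monitoring an already dead pid still delivers a :DOWN message,
  # so this does not fail when the process exited before on_exit ran.
  defp assert_down(pid) do
    ref = Process.monitor(pid)
    assert_receive {:DOWN, ^ref, _, _, _}
  end
end
```

In the original setup, the same on_exit call would list scheduler, acc_supervisor, and provisor instead of the single Agent pid.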
We are in the process of making this simpler for Elixir v1.5 by starting a supervisor per test and allowing you to start processes under the test supervisor. This means we can cleanly shut everything down at the end of the test without user intervention. Stay tuned.
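As a side note on the kill_if_alive helper from the question: Process.exit/2 only sends an exit signal and returns immediately, so by itself it leaves the same race open. A rough sketch of a variant that waits for the :DOWN message is below; the module name StopSketch and the function stop_and_wait/2 are hypothetical and not part of the original code.

```elixir
defmodule StopSketch do
  # Hypothetical helper: kill a process and wait until it is confirmed dead.
  def stop_and_wait(pid, timeout \\ 1_000) do
    # Monitor first, so the :DOWN message cannot be missed.
    ref = Process.monitor(pid)

    # Sending an exit signal to a dead pid is harmless and still returns true.
    Process.exit(pid, :kill)

    receive do
      {:DOWN, ^ref, :process, ^pid, _reason} -> :ok
    after
      timeout -> {:error, :timeout}
    end
  end
end

# Usage with a throwaway, unlinked Agent.
{:ok, pid} = Agent.start(fn -> 0 end)
:ok = StopSketch.stop_and_wait(pid)
false = Process.alive?(pid)

# Calling it on an already dead pid also returns :ok, because monitoring
# a dead process delivers :DOWN right away (with reason :noproc).
:ok = StopSketch.stop_and_wait(pid)
```

No Process.alive?/1 check is needed here, since both the monitor and the exit signal tolerate a process that is already gone.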
lessless
Thank you @josevalim, you saved the day once again! I believe this should get on Elixir Radar, because there is a chance that this behavior wasn’t explained anywhere before.
LostKobrakai
There’s start_supervised, which makes the process managed for the lifecycle of the running test.
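A minimal sketch of what that could look like, assuming Elixir 1.5 or later where ExUnit provides start_supervised/2. The named Agent and its child spec are hypothetical stand-ins for the processes in the question; for real modules that define child_spec/1, something like start_supervised(Provisor) would presumably be enough.

```elixir
ExUnit.start()

defmodule StartSupervisedSketchTest do
  use ExUnit.Case

  setup do
    # Hypothetical child spec for a named process; the map form lets us pass
    # the :name option through to Agent.start_link/2.
    spec = %{
      id: :sketch_agent,
      start: {Agent, :start_link, [fn -> 0 end, [name: :sketch_supervised_process]]}
    }

    # The process is started under the per-test supervisor and is shut down
    # by ExUnit before the next test starts, so no on_exit is needed.
    {:ok, _pid} = start_supervised(spec)

    :ok
  end

  test "first test sees a fresh process" do
    assert Agent.get(:sketch_supervised_process, & &1) == 0
  end

  test "second test can register the same name again" do
    assert Agent.get(:sketch_supervised_process, & &1) == 0
  end
end
```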