Hello,
The module under test depends on three OTP processes, so they are started in the test setup callback:
    setup do
      accounts = TestAccounts.accounts()
      {:ok, scheduler} = Enum.map(accounts, &Map.get(&1, :name)) |> Scheduler.start_link() # GenStage
      {:ok, acc_supervisor} = AccountsSupervisor.start_link() # Supervisor
      {:ok, provisor} = Provisor.start_link() # GenServer
      {:ok, accounts: accounts}
    end
I thought they would be killed automatically after each test case finishes, but that does not seem to be the case. About once in every 3-4 runs, this error appears:
    ** (MatchError) no match of right hand side value: {:error, {:already_started, #PID<0.2161.0>}}
I managed to catch it for both Scheduler and AccountsSupervisor.
The application supervision tree is:
    workers = [
      supervisor(Registry, [:unique, Postman.Registry]),
      supervisor(AccountsSupervisor, []),
      worker(Provisor, []),
      worker(Scheduler, [Enum.map(accounts, &Map.get(&1, :name))])
    ]
My first idea (confirmed by googling) was to stop those processes in an on_exit callback:
    setup do
      accounts = TestAccounts.accounts()
      {:ok, scheduler} = Enum.map(accounts, &Map.get(&1, :name)) |> Scheduler.start_link() # GenStage
      {:ok, acc_supervisor} = AccountsSupervisor.start_link() # Supervisor
      {:ok, provisor} = Provisor.start_link() # GenServer
      on_exit fn ->
        Supervisor.stop(acc_supervisor)
        GenServer.stop(provisor)
        GenStage.stop(scheduler)
      end
      {:ok, accounts: accounts}
    end
That led to a whole new batch of errors:
- Supervisor.stop(acc_supervisor) produces
    ** (exit) exited in: :sys.terminate(#PID<0.572.0>, :normal, :infinity)
        ** (EXIT) shutdown
  I think this is just a notification message, but I would really like to avoid capturing errors in every test where a Supervisor has to be stopped.
- GenServer.stop(provisor) produces
    ** (exit) no process: the process is not alive or there's no process currently associated with the given name, possibly because its application isn't started
- GenStage.stop(scheduler) produces
    ** (exit) no process: the process is not alive or there's no process currently associated with the given name, possibly because its application isn't started
Here you can see a clear contradiction (a race condition): sometimes the processes are still running, and sometimes they are not.
Finally, I wrote a function to work around the problem; it kills a process only if it is still alive:
    def kill_if_alive(pid) do
      case Process.alive?(pid) do
        true -> Process.exit(pid, :kill)
        _ -> :ok
      end
    end
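One likely gap in this helper: Process.exit/2 only sends an exit signal and returns immediately, so the process may still be alive, and still hold its registered name, when the next test's setup runs. A minimal sketch of a variant that waits for the process to actually go down, using a monitor. StopHelper and kill_and_wait/2 are hypothetical names, and the Agent is only a stand-in for the real processes:

```elixir
defmodule StopHelper do
  # Hypothetical helper, not part of the code above.
  # Kills `pid` and blocks until its :DOWN message arrives, so the caller
  # knows the process is gone by the time this function returns.
  def kill_and_wait(pid, timeout \\ 5_000) do
    ref = Process.monitor(pid)
    Process.exit(pid, :kill)

    receive do
      {:DOWN, ^ref, :process, ^pid, _reason} -> :ok
    after
      timeout ->
        Process.demonitor(ref, [:flush])
        {:error, :timeout}
    end
  end
end

# Usage with a throwaway named process standing in for Provisor.
# Agent.start/2 (no link) is used so the :kill does not affect the caller.
{:ok, pid} = Agent.start(fn -> 0 end, name: :stop_helper_demo)
:ok = StopHelper.kill_and_wait(pid)
false = Process.alive?(pid)
nil = Process.whereis(:stop_helper_demo)

# A pid that is already dead is fine too: the monitor fires with :noproc.
:ok = StopHelper.kill_and_wait(pid)
```

No Process.alive?/1 check is needed here, because monitoring a dead process delivers the :DOWN message right away. That also removes the window between the check and the kill.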
After that, an even stranger race condition started to appear in one of the tests:
    test "start all accounts", ctx do
      assert Supervisor.which_children(AccountsSupervisor) |> length() == 0
      assert Provisor.start_all_accounts(ctx.accounts) |> length() == length(ctx.accounts)
      assert Supervisor.which_children(AccountsSupervisor) |> length() == length(ctx.accounts)
    end
    Assertion with == failed
    code:  Supervisor.which_children(AccountsSupervisor) |> length() == length(ctx.accounts())
    left:  1
    right: 2
    stacktrace:
      test/processor/provisor_test.exs:26: (test)
Provisor.start_all_accounts spawns a bunch of supervisors under AccountsSupervisor, and thus they should be stopped along with AccountsSupervisor.
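That expectation holds for an orderly stop, but probably not for a :kill. Supervisor.stop/1 returns only after the supervisor has shut down its children, whereas :kill cannot be trapped, so a supervisor killed that way never runs its shutdown sequence and its children go down through their links some time later. A minimal sketch of the orderly case, assuming map-style child specs (Elixir 1.5 or later); the Agents and their ids are hypothetical stand-ins for the per-account supervisors:

```elixir
# Agents stand in for the per-account supervisors; all ids are hypothetical.
children = [
  %{id: :account_a, start: {Agent, :start_link, [fn -> :a end]}},
  %{id: :account_b, start: {Agent, :start_link, [fn -> :b end]}}
]

{:ok, sup} = Supervisor.start_link(children, strategy: :one_for_one)

child_pids = for {_id, pid, _type, _modules} <- Supervisor.which_children(sup), do: pid
2 = length(child_pids)

# Supervisor.stop/1 is synchronous: it returns once the supervisor has
# terminated its children and exited itself.
:ok = Supervisor.stop(sup)

false = Process.alive?(sup)
false = Enum.any?(child_pids, &Process.alive?/1)
```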
This situation is utterly confusing and I hope somebody can clarify what’s going on and how to properly stop those processes.
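One option that may be worth checking, assuming Elixir 1.6 or later is available: ExUnit's start_supervised!/1 starts a process under a per-test supervisor and is documented to guarantee that the process has exited before the next test starts. A minimal sketch, with a named Agent as a hypothetical stand-in for Scheduler, AccountsSupervisor and Provisor. It only addresses processes started in setup; if the application supervision tree is also running during tests, the same names would already be taken there.

```elixir
# In a Mix project, test_helper.exs already calls ExUnit.start(); autorun is
# disabled here only so the sketch can run as a standalone script.
ExUnit.start(autorun: false)

defmodule StartSupervisedSketchTest do
  use ExUnit.Case

  setup do
    # A named Agent stands in for the real processes (hypothetical name).
    spec = %{
      id: :sketch_counter,
      start: {Agent, :start_link, [fn -> 0 end, [name: :sketch_counter]]}
    }

    pid = start_supervised!(spec)
    {:ok, counter: pid}
  end

  test "increments from zero", ctx do
    assert Agent.get(ctx.counter, & &1) == 0
    Agent.update(ctx.counter, &(&1 + 1))
    assert Agent.get(:sketch_counter, & &1) == 1
  end

  test "also starts from zero under the same name", ctx do
    assert Agent.get(ctx.counter, & &1) == 0
    Agent.update(ctx.counter, &(&1 + 1))
  end
end

# Run the two tests explicitly and keep the summary map.
result = ExUnit.run()
IO.inspect(result.failures, label: "failures")
```

Both tests register the same name, so an {:already_started, pid} error in setup would show up as a failure here regardless of the order in which the tests run.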