I’m currently writing integration tests for a large Phoenix app in which multiple processes communicate with each other. During the integration tests I discovered a bug, but the test run was still reported as successful because the application supervisor restarted the crashing process. I want to ensure that no processes crash during my tests. I have tried several approaches, including using the new `:auto_shutdown` option from Elixir 1.15, setting `:max_restarts` in the test environment, and implementing a custom monitor to track all processes globally using `:erlang.trace_pattern`. Is there a native way to achieve this? While I understand the importance of fixing the bug and writing a unit test, I’m also looking for ways to prevent crashes in the first place.
If the crash is logged to stderr, I’d try using `capture_log` and checking that there were 0 error logs during the run of my tests… it wouldn’t catch crashes that happen in between tests, but maybe it’s good enough?
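A minimal sketch of that idea, assuming the crash report is logged at `:error` level and is emitted before the captured function returns. `ErrorLogs` and the `Logger.info/1` call standing in for the real scenario are hypothetical names for illustration; `capture_log/2` with the `:level` option comes from `ExUnit.CaptureLog`.

```elixir
# In a Mix project this call normally lives in test/test_helper.exs.
ExUnit.start()

defmodule ErrorLogs do
  import ExUnit.CaptureLog

  # Runs `fun` and returns everything logged at :error level or above
  # while it was running, as a single string.
  def capture(fun) when is_function(fun, 0) do
    capture_log([level: :error], fun)
  end
end

defmodule NoErrorLogsTest do
  # Log capture is global, so keep this test out of the async pool.
  use ExUnit.Case, async: false

  require Logger

  test "the scenario logs no errors" do
    log =
      ErrorLogs.capture(fn ->
        # Hypothetical stand-in for the real integration scenario.
        Logger.info("scenario ran")
      end)

    assert log == ""
  end
end
```

A process that crashes after the captured function has returned will not show up in `log`, which is the gap mentioned above.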
That’s also a suggestion ChatGPT gave me, and I guess it’s fine enough for CI. However, yes, it still suffers from the problem you mentioned.
I am not sure if this will work, but I will just throw it out there.
I think you could list the existing pids with `Process.list/0` during the `setup` of your tests, then in `on_exit` run `Process.list/0` again, compare the results, and make sure all the pids from the first run are still present when the test exits.
It may not work, depending on your supervision tree and which process dies.
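Roughly, that could look like the sketch below. `PidSnapshot` is a hypothetical helper name. The test process itself is left out of the snapshot because `on_exit` callbacks run after it has exited, and the test is not async so that other tests' processes don't end up in the comparison.

```elixir
# In a Mix project this call normally lives in test/test_helper.exs.
ExUnit.start()

defmodule PidSnapshot do
  # Returns the pids from `before` that are no longer running.
  def missing(before) when is_list(before) do
    before -- Process.list()
  end
end

defmodule PidSnapshotTest do
  use ExUnit.Case, async: false

  setup do
    # Snapshot every local pid except the test process, which is
    # always gone by the time on_exit callbacks run.
    before = Process.list() -- [self()]

    on_exit(fn ->
      missing = PidSnapshot.missing(before)
      assert missing == [], "processes died during the test: #{inspect(missing)}"
    end)

    :ok
  end

  test "placeholder for a real integration scenario" do
    assert true
  end
end
```

Any unrelated short-lived process that happens to be alive during `setup` will also trip this check.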
That seems really fragile: there are going to be background processes in other dependencies that may come or go, and that isn’t really related to the behavior under test at all.
Are you testing against your app’s supervision tree or are you spawning a copy of that tree for each test? I generally recommend the latter because it makes it far easier to test the processes, and you get to keep doing
It’s a bit like the “tree falls in the forest” question: if a process crashes during a spec run but nobody notices, did it really crash?
One way I’ve seen this manifest is code in a spec that `GenServer.cast`s a message but doesn’t assert on any of the resulting side effects. The fix could be to use `:sys.get_state(target_pid)` after the `cast`, which will do two things:
- wait for the `cast` to be handled and the target GenServer to check its mailbox
- ensure that `target_pid` is still running
Just getting the state is sufficient for that purpose, but you could also assert further that whatever the `cast` was supposed to do actually changed the process’s state.
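Here is a small sketch of that pattern. `Counter` is a made-up GenServer for illustration, not anything from the app in question. Messages from one sender are handled in order, so the `:sys.get_state/1` call only returns after the earlier `cast` has been processed, and it exits (failing the test) if the process has died.

```elixir
# In a Mix project this call normally lives in test/test_helper.exs.
ExUnit.start()

defmodule Counter do
  use GenServer

  def start_link(initial), do: GenServer.start_link(__MODULE__, initial)

  # Fire-and-forget: the caller gets :ok back immediately.
  def increment(pid), do: GenServer.cast(pid, :increment)

  @impl true
  def init(initial), do: {:ok, initial}

  @impl true
  def handle_cast(:increment, count), do: {:noreply, count + 1}
end

defmodule CounterTest do
  use ExUnit.Case, async: true

  test "the cast is handled and the server is still alive" do
    pid = start_supervised!({Counter, 0})

    Counter.increment(pid)

    # Synchronous call: returns once the cast above has been handled,
    # and exits if `pid` is no longer running.
    assert :sys.get_state(pid) == 1
  end
end
```

Asserting on the returned state is the optional extra step; calling `:sys.get_state(pid)` on its own already covers both bullet points.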