Ensuring no process crashes when running ExUnit


I’m currently writing integration tests for a large Phoenix app. The app involves multiple processes communicating with each other. During the integration tests, I discovered a bug. However, the test run was still reported as successful because the application supervisor restarted the crashing process. I want to ensure that no processes crash during my tests. I have tried several approaches, including using the new :auto_shutdown option from Elixir 1.15, setting :max_restarts in the test environment, and implementing a custom monitor to track all processes globally using :erlang.trace_pattern. Is there a native way to achieve this? While I understand the importance of fixing the bug and writing a unit test, I’m also looking for ways to prevent crashes in the first place.

If the crash is logged to stderr, I’d try using capture_log and checking that there were no error logs during the run of my tests. It wouldn’t catch crashes that happen in between tests, but maybe it’s good enough?
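
A rough sketch of what I mean, assuming the crash report is emitted at error level (which is how OTP normally reports a GenServer crash). The module name and the placeholder body are made up for illustration, so swap in your real scenario:

```elixir
# Only needed when running this file as a standalone script;
# in a Mix project test/test_helper.exs already calls it.
ExUnit.start()

defmodule NoErrorLogsTest do
  # Log capture is global, so keep this module synchronous.
  use ExUnit.Case, async: false

  import ExUnit.CaptureLog

  test "no error-level logs are emitted while the scenario runs" do
    log =
      capture_log([level: :error], fn ->
        # Hypothetical placeholder: run the integration scenario here.
        :ok
      end)

    # Any crash report logged during the scenario makes this fail.
    assert log == ""
  end
end
```

This only sees logs written while the function passed to capture_log is running, so a crash that gets reported after it returns goes unnoticed.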

That’s also a suggestion ChatGPT gave me, and I guess it’s fine for CI. However, yes, it still suffers from the problem you mentioned.

LOL I’m a bot

I am not sure if this will work, but I will just throw it out there.
I think one option is to list the existing pids with Process.list/0 during the setup of your tests, run Process.list/0 again in on_exit, and compare the results to make sure all the pids from the first run are still present when the test exits.

It may not work, depending on how your supervision tree is structured and which process dies.
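
Roughly something like this, as an untested-in-anger sketch. The module name and the placeholder test are made up; the one thing I am relying on is that a restarted process comes back with a new pid, so the old pid shows up as missing:

```elixir
# Only needed when running this file as a standalone script;
# in a Mix project test/test_helper.exs already calls it.
ExUnit.start()

defmodule PidSnapshotTest do
  use ExUnit.Case, async: false

  setup do
    # setup runs in the test process, which is gone by the time
    # on_exit runs, so it has to be excluded from the comparison.
    test_pid = self()
    before = MapSet.new(Process.list())

    on_exit(fn ->
      now = MapSet.new(Process.list())

      missing =
        before
        |> MapSet.difference(now)
        |> MapSet.delete(test_pid)

      # Raising inside on_exit is reported as a failure of the test.
      if MapSet.size(missing) > 0 do
        raise "processes alive at setup are gone: #{inspect(MapSet.to_list(missing))}"
      end
    end)

    :ok
  end

  test "scenario leaves pre-existing processes alive" do
    # Hypothetical placeholder: run the integration scenario here.
    assert true
  end
end
```

Any short-lived process that happens to be alive at setup time and exits normally during the test will also trip this check, so expect false positives.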

That seems really fragile. There are going to be background processes in other dependencies that may come and go, and that isn’t really related to the behavior under test at all.

Are you testing against your app’s supervision tree or are you spawning a copy of that tree for each test? I generally recommend the latter because it makes it far easier to test the processes, and you get to keep doing async: true.
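
To illustrate the per-test copy, here is a minimal sketch. It uses a plain Agent as a stand-in for one of your own processes (that substitution and the module name are my assumptions), starts it with start_supervised!, and monitors it so a crash of that exact pid is visible to the test:

```elixir
# Only needed when running this file as a standalone script;
# in a Mix project test/test_helper.exs already calls it.
ExUnit.start()

defmodule IsolatedProcessTest do
  # Each test owns its processes, so async: true is safe here.
  use ExUnit.Case, async: true

  test "the per-test process does not go down" do
    # Started under the test's own supervisor and stopped when the test ends.
    pid = start_supervised!({Agent, fn -> 0 end})
    ref = Process.monitor(pid)

    # Stand-in for the scenario; Agent.update/2 is a synchronous call.
    Agent.update(pid, &(&1 + 1))
    assert Agent.get(pid, & &1) == 1

    # A :DOWN message would mean the process crashed at some point,
    # even if a supervisor restarted it afterwards.
    refute_received {:DOWN, ^ref, :process, _, _}
  end
end
```

refute_received only looks at messages already in the mailbox, so it should come after something synchronous (here the Agent.get call) that guarantees the work has been processed.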

If a tree falls in the forest (er, a process crashes during a spec run) but nobody notices, did it really crash?

One way I’ve seen this manifest is code in a spec that GenServer.casts a message but doesn’t assert on any of the resulting side-effects. The fix could be to use :sys.get_state(target_pid) after the cast, which will do two things:

  • wait for the cast to be handled and the target GenServer to check its mailbox
  • ensure that target_pid is still running

Just getting the state is sufficient for that purpose, but you could also assert further that whatever the cast was supposed to do actually changed in the process’s state.
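
A small sketch of that pattern, with a made-up Counter GenServer standing in for the real target process:

```elixir
# Only needed when running this file as a standalone script;
# in a Mix project test/test_helper.exs already calls it.
ExUnit.start()

defmodule Counter do
  # Hypothetical GenServer used only for this illustration.
  use GenServer

  def start_link(initial), do: GenServer.start_link(__MODULE__, initial)

  @impl true
  def init(initial), do: {:ok, initial}

  @impl true
  def handle_cast(:increment, count), do: {:noreply, count + 1}
end

defmodule CounterTest do
  use ExUnit.Case, async: true

  test "the cast is handled and the process survives it" do
    pid = start_supervised!({Counter, 0})

    GenServer.cast(pid, :increment)

    # :sys.get_state/1 is a synchronous call, so it is served after the
    # cast that was sent before it. If pid died while handling the cast,
    # the call exits and the test fails.
    assert :sys.get_state(pid) == 1
  end
end
```

Because the call targets the original pid, a supervisor restart does not hide the crash: the restarted process has a different pid.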