I am working for ArangoDB, a multi-model database implemented in C++. We have a huge number of integration/system tests that are written in JavaScript using a custom fully synchronous V8 implementation (no nodejs), and the testing framework itself is a huge mess. There are historical reasons for this situation, but they are not relevant here…
Going forward I was thinking about building a new testing framework that would hopefully make life easier for all the developers. The framework has to start, manage and monitor deployments (single server and cluster). If any of the servers unexpectedly crashes or becomes unhealthy, we want to abort all tests. When the test process terminates, we also have to make sure that all of our started processes are terminated, etc. However, some tests need to test resilience and deliberately stop or crash servers, some tests need to run concurrent operations that have to coordinate, and much more…
A while ago I started to work on a PoC in Python - mainly because Python is already commonly used in the company. However, monitoring external processes and distributing crash notifications across multiple concurrently running threads proved to be quite complex. Not to mention the workarounds needed for pytest regarding state management…
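To make the difficulty a bit more concrete, here is a minimal sketch of the kind of monitoring described above. It is not the actual PoC: the names `CrashMonitor`, `expect_exit` and `spawn` are made up for illustration, and the "server" is just a short-lived Python child process. It assumes one watcher thread per child process and a shared `threading.Event` as the crash notification.

```python
import subprocess
import sys
import threading


class CrashMonitor:
    """Hypothetical helper: watches child processes and flags unexpected exits."""

    def __init__(self):
        self.crashed = threading.Event()  # set once any watched process exits unexpectedly
        self.failed_pid = None            # pid of the first process that triggered the event
        self._expected = set()            # pids that a test stops or crashes on purpose
        self._lock = threading.Lock()

    def expect_exit(self, proc):
        """Mark a process as deliberately stopped (e.g. by a resilience test)."""
        with self._lock:
            self._expected.add(proc.pid)

    def watch(self, proc):
        """Start a daemon thread that blocks until the process exits."""
        thread = threading.Thread(target=self._wait, args=(proc,), daemon=True)
        thread.start()
        return thread

    def _wait(self, proc):
        proc.wait()
        with self._lock:
            expected = proc.pid in self._expected
            if not expected and self.failed_pid is None:
                self.failed_pid = proc.pid
        if not expected:
            self.crashed.set()


def spawn(code):
    """Stand-in for starting a server: runs a small Python snippet as a child process."""
    return subprocess.Popen([sys.executable, "-c", code])
```

Every test thread then has to check `monitor.crashed.is_set()` between steps, or call `monitor.crashed.wait(timeout)` instead of sleeping. The event does not interrupt a thread that is blocked in a request or a lock, which is a large part of why this gets complicated once many threads are involved.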
I actually like Elixir much more than Python, so eventually I asked myself if Elixir might be a better fit here (and it would give me an excuse to promote Elixir a bit ;). With the erlexec library I have a powerful tool to manage the external processes and can then link everything so that tests are aborted when a server crashes. I have already built a first PoC and it looks quite promising. It can pretty much do what the Python PoC did, but it is much shorter and I would argue easier to understand.
I am curious what other people think, based on the description above. Has anyone built something similar in Elixir?
I am wondering if ExUnit would be the right test library here, or whether Common Test would be more appropriate. For example, one issue that I ran into with ExUnit is that when a server has crashed I cannot simply skip the remaining tests. I only learned about Common Test yesterday, so I don’t know much about it yet, but it seems that using it from Elixir is rather painful.
The main design goal is usability - it should be easy to use (run tests; analyze results) and to add new tests.






















