How do you write unit tests for interactions between nodes?

Right now I am writing an Elixir application that is meant to run on multiple nodes. For manual testing I start the application in separate iex sessions with “iex --sname a@localhost -S mix” and call GenServers on other nodes (for example, one named Server) with “GenServer.call({Server, :b@localhost}, {:write, 1, 2})”. This works for now but is tedious. I would like to write unit tests to automate this process, but I can’t find any information on how to do so. What is the best way to go about this?

Check out https://github.com/bitwalker/ex_unit_clustered_case by @bitwalker. It’s fairly new, but it lets you easily test multi-node scenarios.

Look at the Erlang slave module.
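
As a rough sketch of what that can look like from ExUnit: the script below starts one extra node with :slave and talks to a named process on it. The node names (:primary, :sketch_b) and the process name :kv are made up for illustration, and it assumes epmd is on your PATH and that "localhost" resolves. An Agent stands in for the Server from the question so the script runs on its own. Note that :slave is deprecated from OTP 25 onwards in favor of :peer, so treat this as a starting point rather than the recommended way.

```elixir
# Calling ExUnit.start/0 again is harmless if test_helper.exs already did it.
ExUnit.start()

defmodule RemoteNodeSketchTest do
  use ExUnit.Case

  setup_all do
    # Turn the test VM into a distributed node unless it already is one.
    if not Node.alive?() do
      # Node.start/2 needs a running epmd; starting it twice is harmless.
      System.cmd("epmd", ["-daemon"])
      {:ok, _} = Node.start(:primary@localhost, :shortnames)
    end

    :ok
  end

  test "a write reaches a named process on another node" do
    # Reuse the host part of our own node name so :slave starts the node locally.
    host = node() |> Atom.to_string() |> String.split("@") |> List.last() |> String.to_charlist()

    # start_link/2 ties the new node to the test process, so the node goes
    # away when the test finishes or crashes.
    {:ok, remote} = :slave.start_link(host, :sketch_b)

    # :slave boots a bare Erlang VM; hand it our code paths so Elixir modules load.
    :ok = :rpc.call(remote, :code, :add_paths, [:code.get_path()])

    # Stand-in for the application's GenServer: a named Agent holding a map.
    {:ok, _pid} = :rpc.call(remote, Agent, :start, [Map, :new, [], [name: :kv]])

    # Same {name, node} addressing as GenServer.call({Server, :b@localhost}, ...).
    :ok = Agent.update({:kv, remote}, Map, :put, [1, 2])
    assert Agent.get({:kv, remote}, Map, :get, [1]) == 2
  end
end
```

In a Mix project you would presumably start your own application on the new node instead, for example with :rpc.call(remote, Application, :ensure_all_started, [:my_app]) where :my_app is a placeholder, and then use GenServer.call({Server, remote}, {:write, 1, 2}) in place of the Agent calls.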

There are some pain points with :slave, which is ultimately why I wrote ex_unit_clustered_case. For simple cases it works fine, and a few of my libraries were basically built around testing via :slave, but for more complex applications the challenges become really noticeable.

The biggest thing you run into is that just spinning up a few nodes isn’t enough. You generally want to test a lot of scenarios, and in a clustered application many of those scenarios will clobber other tests, so you need to orchestrate clusters, not just nodes. Another problem is that nodes you started may not die when your tests fail, leaking BEAM processes. You end up building a mini framework just to handle things like initializing nodes with different initial states for each test.
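
To make the orchestration point concrete, here is a minimal sketch of the kind of helper you end up writing by hand on top of :slave. ClusterSketch and with_nodes/2 are hypothetical names, not part of ex_unit_clustered_case or any other library, and the sketch assumes epmd is on your PATH. It gives every call its own uniquely named nodes and always stops them afterwards.

```elixir
defmodule ClusterSketch do
  # Hypothetical helper for illustration only.

  # Start `count` fresh nodes, run `fun` with their names, and always stop them.
  def with_nodes(count, fun) when is_integer(count) and count > 0 do
    ensure_distributed()
    nodes = Enum.map(1..count, fn _ -> start_node() end)

    try do
      fun.(nodes)
    after
      # Runs even when `fun` raises, so failed tests do not leak nodes.
      Enum.each(nodes, &:slave.stop/1)
    end
  end

  defp ensure_distributed do
    if not Node.alive?() do
      # Node.start/2 needs a running epmd; starting it twice is harmless.
      System.cmd("epmd", ["-daemon"])
      {:ok, _} = Node.start(:cluster_sketch@localhost, :shortnames)
    end

    :ok
  end

  defp start_node do
    # Reuse the host part of our own node name so :slave starts the node locally.
    host = node() |> Atom.to_string() |> String.split("@") |> List.last() |> String.to_charlist()

    # A unique name per node keeps one scenario's cluster apart from another's.
    name = String.to_atom("n#{System.unique_integer([:positive])}")

    # start_link/2 also takes the node down if the calling process dies
    # before the `after` clause gets a chance to run.
    {:ok, node} = :slave.start_link(host, name)

    # Hand the bare Erlang VM our code paths so Elixir modules load there.
    :ok = :rpc.call(node, :code, :add_paths, [:code.get_path()])
    node
  end
end
```

A test body would then look something like ClusterSketch.with_nodes(3, fn [a, b, c] -> ... end). Seeding each node with a different initial state, partitions, and so on would all need to be layered on top of this.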

Ultimately, rebuilding all of that again and again is why I ended up throwing a library together. Ideally something would be baked into Elixir for this, but it’s definitely a lower priority compared to other things going on. In the meantime, let me know if my library covers your use cases; I’d like to make it a general solution to the problem.

Something else I’ve been working on, off and on, is the ability to reorder RPCs between processes on different nodes. It’s not quite ready to see the light of day, but I’ve been doing this with a shim module that all RPCs are serialized through, plus PropEr’s statem modules to control interleavings and failure scenarios. I think there’s some promise there, and it’s at least based on some sound research.
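
To illustrate the shim idea only (this is not the actual module described above): every cross-node call is queued instead of being sent right away, and the test chooses the delivery order. RpcShim, enqueue/4 and deliver/1 are hypothetical names, and the order is picked by hand here rather than generated by PropEr. The usage at the bottom targets the local node so it runs without any cluster.

```elixir
defmodule RpcShim do
  # Hypothetical shim: calls are queued here so a test controls their ordering.
  use Agent

  def start_link(_opts \\ []) do
    Agent.start_link(fn -> [] end, name: __MODULE__)
  end

  # Queue a call and return a reference that identifies it.
  def enqueue(node, mod, fun, args) do
    id = make_ref()
    Agent.update(__MODULE__, fn queue -> queue ++ [{id, {node, mod, fun, args}}] end)
    id
  end

  # Deliver the given queued calls in exactly the order of `ids`,
  # returning their results in that order.
  def deliver(ids) do
    queued =
      Agent.get_and_update(__MODULE__, fn queue ->
        {queue, Enum.reject(queue, fn {id, _call} -> id in ids end)}
      end)

    for id <- ids, {^id, {node, mod, fun, args}} <- queued do
      :rpc.call(node, mod, fun, args)
    end
  end
end

# Usage: two writes to the same register, delivered in reverse order.
{:ok, _} = RpcShim.start_link()
{:ok, register} = Agent.start_link(fn -> nil end)

first = RpcShim.enqueue(node(), Agent, :update, [register, fn _ -> :first end])
second = RpcShim.enqueue(node(), Agent, :update, [register, fn _ -> :second end])

# The write that was issued first is delivered last, so it ends up winning.
RpcShim.deliver([second, first])
final_value = Agent.get(register, & &1)
```

A property-based test would generate the list passed to deliver/1 (and possibly drop entries to simulate lost messages), then check an invariant on the resulting state.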

An even more interesting concept would be to use something like lineage-driven fault injection: https://people.ucsc.edu/~palvaro/molly.pdf. I suspect you’d have better luck exploring the state space this way. I’ve been thinking about how to implement something like this in Elixir but haven’t put any real effort into a solution yet.