Feature Requests - ExUnit

I’d like to discuss two proposals to improve ExUnit and make it easier to write alternative testing frameworks which cooperate nicely with ExUnit.

Feature Request #1 - Access to the list of sync and async test modules

There should be a way of accessing the list of sync and async modules in the ExUnit.Server genserver.

Basically, the equivalent of the following code:

:sys.get_state(ExUnit.Server)

Motivation

Currently, the only way of telling ExUnit which modules to run is by adding modules to the stateful ExUnit.Server, using the add_sync_module/1 and add_async_module/1 functions.

This API is slightly inconvenient but very easy to use. However, as far as I can tell, there is no way of retrieving the sync and async modules that have been registered with ExUnit.Server. Those modules are registered when the test suite is compiled, and are “consumed” from the genserver when they are run. After the test suite has run, the genserver is empty and must be filled again. One way of refilling the genserver is to compile the test files again, but that has two problems:

  1. It’s very slow - sometimes running a test suite can take milliseconds and compiling the same test suite can take 1 or more seconds

  2. Compiling the same module repeatedly can cause the BEAM’s literal allocator to consume too much memory and crash the BEAM (it happens after ~250 recompilations in my case)

I’m having both of these problems in my mutation testing framework. I can generate hundreds of mutations in the same module, which means running the test suite must be very fast and can’t saturate the literal allocator.

Exposing this list doesn’t change the semantics of ExUnit.

Implementation

This feature as it stands is trivial to implement.

One needs only a new genserver call in the ExUnit.Server module.

Criticisms

ExUnit.Server is probably meant to be an implementation detail.

I’m not sure we are meant to meddle with it or depend on it being present.

Feature Request #2 - Stop the test suite right after a test fails

Currently we can configure ExUnit to stop after a number of failures, using the :max_failures option. However, this option doesn’t do what one might think it does. Instead of stopping the test suite (including modules that are being tested in parallel) just after the test has failed, it will only stop the test suite after the current test case has run.

I think the test suite should stop after the number of failures has been reached instead of the current behavior.

Motivation

Same motivation as above. I’m writing a mutation testing framework which has to run a test suite hundreds or thousands of times. If I can abort the tests as soon as a single test has failed, it can save a lot of time.

This is also true for other situations, like continuous integration pipelines.

Implementation

There are several ways of implementing this. The obvious one is to spawn all processes related to the test suite from a single master process and send a kill signal to that process, but there might be others.

Criticisms

This feature probably requires some radical refactoring of ExUnit’s code base. It also changes the semantics of ExUnit and I don’t think it can be done without at least changing the map returned by ExUnit.run(). Currently the map has the following shape:

%{excluded: 0, failures: 86, skipped: 0, total: 161}

We would have to add a new key to the map to describe tests that weren’t run because one of the tests failed.


Pretty good proposal overall. I’ll ask a couple questions where things were unclear for me.

Feature Request #1

Could you keep track of the modules yourself? For example, tests could choose to use fuzzing with something like:

defmodule MyTest do
  use ExUnit.Case
  use Fuzzing
end

And then, in the after compile callback, Fuzzing could keep track of the modules itself. Then it could presumably resubmit the modules to ExUnit. Might still need a proposal since add_sync_module/1 and add_async_module/1 aren’t public API.

Feature Request #2

Why not put them in skipped?


Yes, but I think it would be better if I could run the tests from the “outside” instead of injecting my own code. But just making the add_*_module() functions public would probably be enough.

They could be put in skipped. It’s the same to me.

EDIT: OTOH, maybe the best solution is to fork ex_unit into Darwin, place it under the Darwin namespace and turn it into my own test runner! No need to change ExUnit.


I’d like to have a way to determine whether the test failed or not in the ‘on_exit’ callback.


@tmbb Did you settle on a different approach or are you still trying to get something like what you proposed into ExUnit?

I’m still trying to make this work, but unfortunately I can’t compile Elixir on my machine, supposedly because of memory issues, but probably because of something else (my machine has plenty of memory), and at the moment I’m not interested in trying to solve the compilation problems. This means I won’t submit any PRs to ExUnit for the time being.

But I’m still trying to find a way of making the test suite stop when the first test fails. I’m running Darwin on the Elixir Enum module (actually a copy of the Enum module called Enom, because if I mutate the real Enum module – which I can do – I break the language). Many mutations cause the functions in the Enum module to enter infinite loops, which is very inconvenient. If I have 60 tests which fail with a timeout of 3s, it’s ~3mins until all the tests finally fail. This means I waste 3mins killing that particular mutation (Darwin finds ~300 mutations in the Enum module). If I could fail the whole test suite after the first test, I could kill the mutation in 3s…


On a more positive note, I’ve just noticed I can detect whether a test fails using a Formatter. So, if there’s a way of killing the test suite “remotely”, I could do that from the formatter… I have to look into it.
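A minimal sketch of what such a formatter could look like. ExUnit formatters are GenServers that receive events via cast, including `{:test_finished, test}`, where a failed test carries `state: {:failed, _}`. The module name `FailureWatcherSketch` and the `:notify` option are hypothetical, and how that option would reach `init/1` in a real run is left open here.

```elixir
defmodule FailureWatcherSketch do
  # Hypothetical formatter: counts failed tests and optionally forwards
  # a message to another process. Not part of ExUnit or Darwin.
  use GenServer

  @impl true
  def init(opts) do
    # :notify is an assumed option holding the pid to tell about failures.
    {:ok, %{notify: Keyword.get(opts, :notify), failures: 0}}
  end

  @impl true
  def handle_cast({:test_finished, %{state: {:failed, _}} = test}, state) do
    # A failed test: forward its name (if someone is listening) and count it.
    if state.notify, do: send(state.notify, {:test_failed, test.name})
    {:noreply, %{state | failures: state.failures + 1}}
  end

  # Every other event (suite started, module finished, ...) is ignored.
  def handle_cast(_event, state), do: {:noreply, state}
end
```

Such a module would be registered with something like `ExUnit.configure(formatters: [FailureWatcherSketch])`. Because events arrive by cast, the formatter learns about a failure only some time after it happened.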


On a negative note again, it’s NOT possible to kill the test suite inside the formatter because the formatter is async… This means I can’t guarantee I’m closing the right test suite. It’s possible that Darwin is already on the next test suite, and killing it causes errors.

This is my conclusion from running a very primitive “test suite killer” in practice…


Have any of the ExUnit developers looked at this in the meantime? I don’t think ExUnit as it is written now can be changed into something that supports my Feature Request Nr. 2. Given that no one else is asking for this feature, I guess I should write my own test framework then. Pinging @josevalim

I don’t think you want Feature Request Nr. 2 in practice. The only way to stop a test is by killing it. However, killing a process can affect other parts of the system, for example other linked processes. So if your goal is to stop the suite to start another run, the next run may start from a different state because of the sudden killing of the processes.

Since you are doing your own mutation thing, what I would do is to have a “mutation server”. Once a mutation run fails, notify the mutation server. Before you try each mutation, ask the mutation server if you should continue running. Have you considered this approach?


What you propose is not really applicable, because I want to run the test suite for all mutations (that’s the general goal of mutation testing). The moment where I want to interrupt the test suite is within each mutation, after one of the tests has failed.

But you’ve given me an idea: I could have a central mutation server that keeps track of whether any test has failed. Then, I could rewrite the test macro in such a way that instead of generating a function that runs the test, it could generate a function that asks the server whether the test should run, and only run it in that case.

For example:

test "name" do
  # test body
end

should expand into something like this:

def name() do
  if MutationServer.at_least_one_failure?() do
    # Don't do anything
    :ok 
  else
    # test body
    :ok
  end 
end

The overhead it adds is minimal: a single function call plus a GenServer call or even an ETS lookup. The server can be updated after each test has been run using a Formatter or something like that.
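A minimal sketch of the ETS variant of such a server, to make the cost concrete. The name `MutationServerSketch` and the functions `start/0`, `reset/0` and `report_failure/0` are hypothetical; only `at_least_one_failure?/0` mirrors the call used above. The table is owned by whichever process calls `start/0`, so that process must outlive the test run.

```elixir
defmodule MutationServerSketch do
  # Hypothetical failure flag backed by a public named ETS table.
  @table :mutation_server_sketch

  # Creates the table and clears the flag. Call once before the first run.
  def start do
    :ets.new(@table, [:named_table, :public, :set, read_concurrency: true])
    reset()
  end

  # Clears the flag, e.g. before running the suite for the next mutation.
  def reset do
    :ets.insert(@table, {:failed, false})
    :ok
  end

  # Records that some test has failed for the current mutation.
  def report_failure do
    :ets.insert(@table, {:failed, true})
    :ok
  end

  # A single ETS lookup; this is what each generated test would call.
  def at_least_one_failure? do
    :ets.lookup(@table, :failed) == [{:failed, true}]
  end
end
```

In this sketch, `report_failure/0` would be called from wherever failures are observed, and `reset/0` between mutations.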

That way I can make it work without any changes to ExUnit. The new test macro could be injected with something like:

defmodule MyTest do
  use Darwin.Case, async: true
  
  # The `test` macro is now Darwin.test/2
  test "my test name" do
    # ...
  end
end

This works beautifully! Test time for the (customized) Enum module has decreased from > 3h to < 2min. Thanks @josevalim!!! If I can’t kill the test suite prematurely, I just have to guarantee that the tests run after the first failure don’t actually run any code (they just return the :ok atom). The disadvantage is that now the user has to use Darwin.TestCase instead of use ExUnit.Case, but that’s an acceptable tradeoff at this stage.

Another question: Is there a way of skipping a test at runtime, say, with a special exception? I’d like to be able to skip tests after the first failure, instead of marking them as successes. This is purely cosmetic, though. I’m already pretty happy with what I have so far.


Glad to hear!

You may be able to by returning {:ok, skip: true} in your setup. If not, we could make that work.


Btw, I just realized that if the skip approach works, then you don’t need to hijack the test macro, you just need your own setup injected there.

Maybe, but I’ll have to use Darwin.TestCase anyway, right? I can’t inject a custom setup from outside the test case.

Either that or have them:

import Darwin.Setup
setup :mutation_filter

Or similar.

Revisiting Feature 1 and Feature 2

Feature 1

This feature is still very useful. I’m now injecting custom code in my test cases (use Darwin.TestCase), so in theory I could keep track of the modules myself, but it’s a huge duplication of work and I haven’t managed to make it work well in practice (ExUnit’s architecture is quite complex, and for a good reason, but it makes it very hard to run tests when we want to…).

I’ll see if I can get around these limitations in a sane way. Now that I’ve decided I’ll be injecting Darwin into the test modules, I have other possibilities I didn’t have before.

Feature 2

Things are working out pretty well without feature 2, so I can definitely do without it. OTOH, I now have a way of implementing this feature (at least conceptually). ExUnit would just have to add some extra logic to tests and spawn a central server that collects test failures until the cutoff is reached. Before running each test, it asks the central server whether the test should run, and doesn’t run it if the cutoff has been reached. This should be done “outside” the test, unlike my solution which works “inside” the test, so that the setup blocks aren’t run (they might take a long time in the case of integration tests).

The catch is that I don’t know if this will be generally useful, and I can’t justify implementing this myself for my own use case, because I have managed to work around it. So unless something else needs this feature (mostly other test framework writers, I guess), I won’t be working on this one.

Feature 1 would be ok but I am not sure when you would call it? You can’t call it after the suite, because we don’t keep them around. There is no hook for you to call before the suite starts. So you would have to find a way to keep them but I am almost sure it is a breaking change. If you can contribute an API, such as ExUnit.loaded_modules, in a way it doesn’t break ExUnit’s suite, then we will gladly accept it.

Feature 2 should be trivial to implement. Store all of the “module runners pid” and as soon as we reach max failures, you send all module runners a message. Before running each test, a module runner should check if it received said message or not. PR also welcome.
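A rough sketch of the check described above, outside of ExUnit. The module `ModuleRunnerSketch`, the message `:max_failures_reached` and the `:skipped` result are all hypothetical names chosen for illustration; ExUnit’s real runner is structured differently.

```elixir
defmodule ModuleRunnerSketch do
  # Hypothetical runner: `tests` is a list of {name, zero-arity fun}.
  # Returns a list of {name, result} in the original order.
  def run(tests) do
    tests
    |> Enum.map_reduce(false, fn {name, fun}, stopped? ->
      # Once the stop message has been seen, stay stopped for the rest.
      stopped? = stopped? or stop_requested?()
      result = if stopped?, do: :skipped, else: fun.()
      {{name, result}, stopped?}
    end)
    |> elem(0)
  end

  # Non-blocking mailbox check: `after 0` returns immediately when
  # no stop message is waiting.
  defp stop_requested? do
    receive do
      :max_failures_reached -> true
    after
      0 -> false
    end
  end
end
```

Whoever counts failures would `send(runner_pid, :max_failures_reached)` to every stored runner pid. A test that is already running finishes normally; only the tests after it are skipped, so no process has to be killed.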


You don’t need a special hook. You can just do this (adapted from real code that works in Darwin):

def add_modules_to_ex_unit_server() do
  test_files = Path.wildcard("test/**/*_test.exs")
  Kernel.ParallelCompiler.compile(test_files)
end

This adds the modules into the ExUnit.Server without running the test suite.

It doesn’t have to be a breaking change… The state could be something like:

def init(:ok) do
  state = %{
    loaded: System.monotonic_time(),
    waiting: nil,
    async_modules: [],
    sync_modules: [],
    modules_but_these_are_not_consumed: %{
      async_modules: [],
      sync_modules: []
    }
  }

  {:ok, state}
end

One would have to adapt the genserver calls accordingly, but from the point of view of the user, the API doesn’t change.
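To illustrate what adapting the calls might involve, here is a self-contained sketch of a registry that keeps a consumed list and a retained copy side by side. It is a stand-in, not ExUnit.Server’s actual code: the module name `ModuleRegistrySketch`, the `pending`/`retained` keys and the functions `take_modules/1` and `loaded_modules/1` are assumptions made for this example.

```elixir
defmodule ModuleRegistrySketch do
  # Hypothetical stand-in for a server that tracks test modules.
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  def add_sync_module(server, mod), do: GenServer.call(server, {:add, :sync_modules, mod})
  def add_async_module(server, mod), do: GenServer.call(server, {:add, :async_modules, mod})

  # Consuming read: returns the pending modules and empties them.
  def take_modules(server), do: GenServer.call(server, :take)

  # Non-consuming read: returns every module registered so far.
  def loaded_modules(server), do: GenServer.call(server, :loaded)

  @impl true
  def init(:ok) do
    empty = %{async_modules: [], sync_modules: []}
    {:ok, %{pending: empty, retained: empty}}
  end

  @impl true
  def handle_call({:add, kind, mod}, _from, state) do
    # Every registration goes into both the consumed and the retained list.
    state =
      state
      |> update_in([:pending, kind], &[mod | &1])
      |> update_in([:retained, kind], &[mod | &1])

    {:reply, :ok, state}
  end

  def handle_call(:take, _from, state) do
    empty = %{async_modules: [], sync_modules: []}
    {:reply, state.pending, %{state | pending: empty}}
  end

  def handle_call(:loaded, _from, state) do
    {:reply, state.retained, state}
  end
end
```

Existing callers that add and take modules would see the same behaviour as before; only the extra read is new.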

I’d love to contribute to ExUnit, but I’m stuck using WSL on Windows, and Elixir compilation fails with memory issues I can’t do anything about (the problem happens when compiling the Unicode module).