Muex - the library for mutation testing

Thanks to Brian (@bglusman), who contributed a very valuable PR to this library, I am happy to introduce Muex, a mutation-testing library for BEAM languages.

Why another one, when we already have Darwin and Exavier? Well, because I like to write code (a more formal comparison can be found here).

This library is also part of the Language Agnostic Code Audit SaaS effort and is MIT-licensed. Here is the repo:


Example outcome:

❯ mix muex --files "examples/cart/lib/cart/*.ex" --verbose
Loading files from examples/cart/lib/cart/*.ex...
Found 2 file(s)
Analyzing files for mutation testing suitability...
  ✓ examples/cart/lib/cart/product.ex (score: 100)
  ✓ examples/cart/lib/cart/shopping_cart.ex (score: 100)
Selected 2 file(s), skipped 0 file(s)
Generating mutations...
Applying mutation optimization...
Original mutations: 1541
Optimized mutations: 31
Reduction: 1510 (-98.0%)
Average impact score: 6.9
Testing 31 mutation(s)
Analyzing test dependencies...
Running tests...

×××·×××××·×××
---×××·×××××××××××


Mutation Testing Results
==================================================
Total mutants: 31
Killed: 3 (caught by tests)
Survived: 25 (not caught by tests)
Invalid: 3 (compilation errors)
Timeout: 0
==================================================
Mutation Score: 9.68%


Survived Mutations:
--------------------------------------------------
examples/cart/lib/cart/product.ex:144
  Boolean: and to or

…

** (Mix) Mutation score 9.68% is below threshold 100%
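
For anyone checking the arithmetic: the reported score is consistent with killed mutants divided by total mutants (3 / 31), with the invalid mutants still counted in the denominator. That formula is inferred from the numbers above, not taken from the Muex source, and the module name below is made up for the illustration.

```elixir
defmodule ScoreSketch do
  # Hypothetical helper: percentage of mutants killed, rounded to 2 decimals.
  def score(killed, total) when total > 0 do
    Float.round(killed / total * 100, 2)
  end
end

# Numbers from the run above: 3 killed out of 31 mutants.
score = ScoreSketch.score(3, 31)
IO.inspect(score)
```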

Thanks for open-sourcing this, I’m playing with it now. It’s great to have this in Elixir.


v0.6.0 is out, all kudos go to @bglusman

Fixes:

  • classify sandbox compilation errors as :invalid, not :killed
  • auto-discover mutators + StatementDeletion & ReturnValue mutators

Features:

  • parallel cross-file mutation testing with sandbox isolation

Very interesting project! Thank you. I left an issue on GitHub because it seems to have a problem compiling my project, which uses macros.


Yeah, thanks! I am on it already.


I had Claude Code leave you a comment too, which may or may not be useful :smiley:


Nah, the thing is I overestimated my ability to deal with a dozen projects faster than my AI assistant (I am kinda competing with her on that matter :slight_smile:)

Yesterday I messed up merging new features and, in a rush, did not validate it 102%. Stay tuned, a working version will be there soon.


v0.6.1 is out, thanks @lud for testing and reporting the bug.

Fixes:

  • Less aggressive optimization without premature Code.compile_string/2

I’ll try as soon as possible :slight_smile:


Hello,

So I let it run with mix muex --optimize --optimize-level aggressive --verbose and it still takes a lot of time. I believe JSV was a very bad match because there are 7000 tests. I’ll try to move the tests so I can target only the right subset (for now I have a generated/ tests directory inside test/jsv/, so I cannot target "all files except those in generated"). But it would be nice to be able to pass exclusion flags to mix test to exclude a @moduletag.

Then about the results I’m not so sure how to interpret them.

Survived Mutations:
--------------------------------------------------
lib/jsv/vocabulary/v7/applicator.ex:86
  ReturnValue: replace return value of validate_keyword with nil

If I return nil from the function defined I have 52 test failures when running manually.

lib/jsv.ex:374
  StatementDeletion: delete statement 63 of 138

This is a @doc group @doc_group. I also have results about removing a @doc false statement. Those should be ignored.

lib/jsv/resolver.ex:276
  StatementDeletion: delete statement 36 of 78

I’m not sure what statement 36 is:

   283	  defp scan_subschema(list, parent_id, nss, meta, path, acc) when is_list(list) do
   284	    list
   285	    |> Enum.with_index()
   286	    |> EnumExt.reduce_ok(acc, fn {item, index}, acc ->
   287	      scan_subschema(item, parent_id, nss, meta, [index | path], acc)
   288	    end)
   289	  end

Some of the reported file:line locations point to a defstruct statement. The app does not compile if I remove them.

So in the end it’s not really helpful for now with all those "delete statement X of Y" results, because I don’t know what that statement is. It would be really helpful to have a report file with patches, so we can see what the mutation was, reproduce it, and write a test to invalidate it.
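
To make the request concrete, here is a rough sketch of what a patch-style entry could look like. This is not something Muex produces today; the source snippets are invented, and only the standard library's List.myers_difference/2 is used.

```elixir
# Invented example: original vs. mutated source, split into lines.
original = String.split("def total(a, b) do\n  a + b\nend", "\n")
mutated = String.split("def total(a, b) do\n  a - b\nend", "\n")

# Turn the line-level diff into patch-like lines with "  ", "- " and "+ " prefixes.
patch =
  original
  |> List.myers_difference(mutated)
  |> Enum.flat_map(fn
    {:eq, lines} -> Enum.map(lines, &("  " <> &1))
    {:del, lines} -> Enum.map(lines, &("- " <> &1))
    {:ins, lines} -> Enum.map(lines, &("+ " <> &1))
  end)

IO.puts(Enum.join(patch, "\n"))
```

With something like this next to each "delete statement X of Y" entry, it would be obvious which statement was removed.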

I’m trying to see if the html report has this, but I can’t find a file to generate actual mutations. I set a “limit 30” but it only generates “invalid mutations”. I’ll try to re-run the full output (still with aggressive though) later.

But it was already helpful in some way, so far I know that I have to test a @derive {Inspect, ... statement that is not tested at all.

So I believe this is promising, but you need a way to prioritize the mutations.

Yes. And this is unavoidable. We (@bglusman and myself) put a lot of effort into sophisticated optimization, but Elixir is still a compiled language.

That’s exactly what’s planned next.

Absolutely! Thanks for helping to test it!

BTW, this report means:

  • mutation kind: ReturnValue
  • mutation itself: we returned nil instead of the original return value
  • failure: the tests still passed
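
As an illustration of that kind of mutation, here is a minimal sketch that rewrites a function body to nil at the AST level. It is not Muex's actual mutator; the Demo module and its validate/1 function are made up, and only standard Code and Macro functions are used.

```elixir
source = """
defmodule Demo do
  def validate(x), do: {:ok, x}
end
"""

ast = Code.string_to_quoted!(source)

# Replace the body of every `def` with nil, keeping the function head as is.
mutated =
  Macro.prewalk(ast, fn
    {:def, meta, [head, [do: _body]]} -> {:def, meta, [head, [do: nil]]}
    node -> node
  end)

# Compile the mutant in memory (the compiler may warn that `x` is now unused).
Code.compile_quoted(mutated)

result = Demo.validate(42)
IO.inspect(result)
```

If no test fails with the mutant loaded, the mutation is reported as survived.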

Well then there is a problem because I added nil at the bottom of the function and 52 tests broke, so I’m not sure why it reported it as a positive.

Is there a way to run muex and tell it to run only with that precise mutation?

Other tests, I suppose. The methodology is not perfect, and I know that. OTOH, the ultimate goal is to run it on the newly added code and the matching subset of tests.

It’s planned. Currently you might do mix muex --files "lib/my_module.ex" --test-paths "test/my_module_test.exs".

Also, passing --verbose might shed some light too.

Yes, I was running with verbose all the time. I need to move my tests to be able to target them better, but in the end JSV might be too test-heavy for mutation testing.

On the optimization side, could you describe your strategy? When running Claude for the bug report I saw it was debugging symlinks, so I guess you are copying the source code into a temporary dir, adding the mutation, symlinking each dependency's build to _build/test/lib/<each-dependency> and calling mix test?

Copying the source code is not the trick, the trick is to avoid recompilations whenever possible.

Unlike Ruby or Python, where you just change the code and run the mutation, recompiling an Elixir module might result in recompilation of the whole project. So we group tests by mutations in code, and we avoid mutating a whole bunch of file kinds, like behaviours, etc. See the Mutation Optimization page in the Muex v0.6.1 docs.
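
To give a feel for the "avoid recompiling the project" idea, here is a toy sketch that swaps a single module in memory and decides killed vs. survived. It is a simplification for illustration, not how Muex is actually wired; the Cart module and the check function standing in for a test suite are hypothetical.

```elixir
# Silence "redefining module" warnings for this demo only.
Code.compiler_options(ignore_module_conflict: true)

original = """
defmodule Cart do
  def empty?(items), do: items == []
end
"""

# A hand-made mutant: flip the comparison operator.
mutant = String.replace(original, "==", "!=")

# Hypothetical stand-in for the tests covering Cart.
check = fn -> Cart.empty?([]) == true end

Code.compile_string(original)
baseline = check.()

# Load only the mutated module; nothing else is recompiled.
Code.compile_string(mutant)
status = if check.(), do: :survived, else: :killed

# Restore the original module before moving on to the next mutant.
Code.compile_string(original)

IO.inspect({baseline, status})
```

The real cost in a project comes from compile-time dependencies between modules, which is why the file kinds that trigger wide recompilation are skipped.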
