Muzak & Muzak Pro - Mutation testing for Elixir

I’ve just made Muzak and Muzak Pro available!

Muzak is a mutation testing library for Elixir applications, and Muzak Pro is the full-featured version of Muzak designed for business use. You can find more information about Muzak and Muzak Pro in this announcement post: https://devonestes.com/announcing_muzak

Here are two videos. The first shows how to get started with Muzak: https://www.youtube.com/watch?v=3WU94iVhc9w
The second shows how to get started with Muzak Pro and some of the features included in it: https://www.youtube.com/watch?v=P301R28IuTI

I’d love to hear what folks think!

Great timing! I’ve just been toying with the idea of doing some mutation testing workshops/demos in various languages, and have been checking out the tools.

The randomness of the subset in the free version is an interesting tweak. If it’s generating the full 25, it’s hard to tell for sure whether you’ve killed a particular mutant. But if we don’t care too much about that, and just keep picking random mutants and killing them (which is my intended approach so far in the main part of the demo), we should eventually get down to under 25. :slight_smile:
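
A rough sketch of why a random subset makes it hard to confirm a particular kill. It assumes that each run draws 25 mutants uniformly at random from a larger pool; the pool of 200 integers is made up for illustration and says nothing about how Muzak actually selects mutations.

```elixir
# Hypothetical pool of mutants; real mutants would be source edits, not integers.
all_mutants = Enum.to_list(1..200)

# Assumption: each run draws 25 mutants at random from the pool.
sample_run = fn pool -> Enum.take_random(pool, 25) end

first_run = sample_run.(all_mutants)
second_run = sample_run.(all_mutants)

# The two runs usually overlap only partially, so a mutant that survived the
# first run may simply not be drawn again in the second.
overlap = Enum.count(first_run, &(&1 in second_run))
IO.puts("mutants common to both runs: #{overlap} of 25")
```

Under that assumption, a survivor disappearing from the report can mean either that it was killed or that it was not sampled this time.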

BTW there’s an active discussion on the mutation testing Discord channel about terminology, such as better terms for “killed” that won’t turn so many people off. (I @-mentioned you in a thread about it on Twitter, and I think Markus Schirp posted an invite code there.)

Thank you @devonestes for this tool!

I maintain an open-source protobuf library (protox) for which I do everything I can to make it 100% reliable. Hence my interest ;-).

I gave it a try, and as expected, it’s quite slow :wink:.

I have a question: the documentation mentions muzak --only path/to/file.ex. Should I apply it to a test file or to a library code file?

Edit: I deactivated properties tested with propcheck as it was much too slow (I killed mix muzak after ~30 minutes). It ended with

FFFFFFFFFFFFFFFFFFFFFFF..

     lib/protox/float.ex:30
     original: @positive_infinity_64 <<0, 0, 0, 0, 0, 0, 0xF0, 0x7F>>
     mutation: @positive_infinity_64 <<0, 0, 0, 0, 0, 0, 4_932_958_568, 0x7F>>

     […]

     lib/protox/float.ex:4
     original: defmacro __using__(_) do
     mutation: defmacro nhbysykbgz(_) do


Finished in 93.7 seconds
25 mutations run - 23 mutations survived

So, I’m not sure how I should interpret this? Does it mean that out of 25 runs, 23 failed because my tests didn’t detect the mutation?

Also, what’s the reason for mutating __using__() as it’s a standard name?

Edit2: now that I’ve taken a look at your videos, it’s much clearer to me. However, I don’t get why renaming __using__ doesn’t end up in a compilation failure, as I use this macro in several places :thinking:. Furthermore, when I apply one of the mutations manually (the first one in the extract above), my tests fail as expected. So, why did muzak say this mutation survived? What am I missing?

So, I’m not sure how I should interpret this? Does it mean that out of 25 runs, 23 failed because my tests didn’t detect the mutation?

That is what it’s saying, but I also think I see what might be the issue. I have a feeling that after we mutate and recompile lib/protox/float.ex, the other files that had called that macro don’t get recompiled, which is why it doesn’t end in a compilation error. I had thought that this wouldn’t happen, but I’ll put together a test and then push up a fix if that is indeed what’s happening.
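
The suspected behaviour can be reproduced outside Muzak. Below is a minimal sketch with hypothetical module names (SketchProvider and SketchConsumer are not taken from protox or Muzak): recompiling only the module that defines `__using__/1` leaves the consumer with its old macro expansion until the consumer itself is recompiled.

```elixir
# Silence "redefining module" warnings while recompiling in the same VM.
Code.put_compiler_option(:ignore_module_conflict, true)

provider_v1 = """
defmodule SketchProvider do
  defmacro __using__(_opts) do
    quote do
      def version, do: :v1
    end
  end
end
"""

consumer = """
defmodule SketchConsumer do
  use SketchProvider
end
"""

Code.compile_string(provider_v1)
Code.compile_string(consumer)
initial = SketchConsumer.version()

# "Mutate" the provider and recompile only the provider.
provider_v2 = String.replace(provider_v1, ":v1", ":v2")
Code.compile_string(provider_v2)
stale = SketchConsumer.version()

# Recompiling the consumer re-expands the macro and picks up the change.
Code.compile_string(consumer)
fresh = SketchConsumer.version()

IO.inspect({initial, stale, fresh})
```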

Ok, I’ve just published version 1.0.1 which should resolve that issue. Now when __using__/1 is mutated it should cause compilation errors.

Maybe this is really obvious from the source code, but do you recompile the source code for each mutation or have you found a way of avoiding recompilations somehow? I have my own mutation testing library where I’ve found a way of recompiling the code only once (resulting in faster times for the test suite) at the cost of not being able to test macros and making the resulting mutations slightly harder to understand for the user.

I recompile the file that’s been mutated and any compilation dependency on any module that’s been mutated. So, if A imports a macro from B, and we mutate B, then we’ll recompile B and then A. I’m just recompiling in those cases with Code.compile_string/1, and if any of those compilations fail then I consider the mutation discovered and pass.
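
As a rough illustration of that approach (not Muzak’s actual code), the sketch below recompiles a mutated module and then its compile-time dependent with Code.compile_string/1, and treats any error raised along the way as the mutation having been caught. The helper compile_chain/1 and the modules ChainA and ChainB are hypothetical names.

```elixir
# Silence "redefining module" warnings while recompiling in the same VM.
Code.put_compiler_option(:ignore_module_conflict, true)

defmodule CompileCheckSketch do
  # Hypothetical helper (not Muzak's API): compiles each source string in order
  # and reports whether the whole chain compiled.
  def compile_chain(sources) do
    Enum.each(sources, &Code.compile_string/1)
    :compiled
  rescue
    # Any error raised while compiling counts as the mutation being caught.
    _error -> :caught_by_compiler
  end
end

provider = """
defmodule ChainB do
  defmacro __using__(_opts) do
    quote do
      def from_b, do: :ok
    end
  end
end
"""

consumer = """
defmodule ChainA do
  use ChainB
end
"""

# A mutation in the style shown earlier in the thread: __using__ is renamed.
mutated_provider = String.replace(provider, "__using__", "nhbysykbgz")

unmutated = CompileCheckSketch.compile_chain([provider, consumer])
mutated = CompileCheckSketch.compile_chain([mutated_provider, consumer])

IO.inspect({unmutated, mutated})
```

Because ChainA is recompiled after ChainB, the renamed macro surfaces as an error while expanding `use ChainB` instead of silently surviving.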

Don’t you get memory allocation errors when you compile the same module say, 300 times or something like that? I used to get that problem when I was recompiling the code per mutation.

Nope! I do a bit of cleanup after each run (to clean up some ExUnit stuff and some compiler stuff that normally would leak out between runs). This might help:
https://hexdocs.pm/elixir/Code.html#purge_compiler_modules/0
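
A small sketch of what calling that function looks like after many recompilations. It covers only the compiler side of the cleanup mentioned above; the module name PurgeSketch and the count of 300 are illustrative, and whatever ExUnit cleanup Muzak performs is not shown.

```elixir
# Silence "redefining module" warnings while recompiling in the same VM.
Code.put_compiler_option(:ignore_module_conflict, true)

source = """
defmodule PurgeSketch do
  def answer, do: 42
end
"""

# Recompile the same module many times, roughly as a mutation run would.
for _ <- 1..300, do: Code.compile_string(source)

# Purge the temporary modules the compiler keeps around between compilations.
{:ok, purged} = Code.purge_compiler_modules()
IO.puts("purged #{purged} compiler modules")
```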

Thanks for the fix! I still have some questions :slightly_smiling_face:

  1. Quite often, I have a compilation error about a struct that has not been found, which interrupts the mix task. Shouldn’t it be caught by muzak?
  2. When I use the option --only (for instance on the aforementioned file lib/protox/float.ex), I always get 0 run - 0 mutations survived. I even tried to make it run in a bash while loop to see if it’s an effect of randomness, but I always obtain the same result. Does it mean the file cannot be mutated?
  3. On my laptop, the protox test suite takes ~30s. But most of the time, even when I let mix muzak run for hours, I’ve got only one or two dots displayed. Do you think it means the mutation triggers some kind of infinite loop?

Hey, so I’ve looked at protox a bit, and I’ve found a few things:

  1. I was able to reproduce some of those compilation errors, and they should have been caught and counted as a successful run (since the mutation was caught), so I’ll push up a fix for that in 1.0.2 in a little bit - thanks for the report there!
  2. I wasn’t able to reproduce this locally when I cloned down protox to debug this. Is it possible you weren’t in the root directory when running that so the path was incorrect? This reminds me - I should probably check that the file exists when folks use --only so they can get some better feedback, so this is still super helpful!
  3. It looks like there’s a process that can be killed by some mutations, and when that process gets killed the tests don’t run. Here’s the mutation that caused the process to be killed:
%{
  line: 26,
  mutation: ["def", " ", "dvysgsomhx", "(", "", ":string", "", ")", ",", " ",
   "do:", "", " ", "\"", "", "\""],
  original: ["def", " ", "default", "(", "", ":string", "", ")", ",", " ",
   "do:", "", " ", "\"", "", "\""],
  path: "lib/protox/default.ex"
}
nonode@nohost: "Mutating file"
warning: clauses with the same name and arity (number of arguments) should be grouped together, "def default/1" was previously defined (nofile:13)
  nofile:27

nonode@nohost: "Mutating completed"
nonode@nohost: "No compile dependencies to recompile"
nonode@nohost: "Tests starting"
Excluding tags: [conformance: true, properties: true]


== Compilation error in file test/protox_test.exs ==
** (FunctionClauseError) no function clause matching in Protox.Default.default/1    
    
    The following arguments were given to Protox.Default.default/1:
    
        # 1
        :string
    
    (protox 1.2.2) nofile:13: Protox.Default.default/1
    (protox 1.2.2) lib/protox/parse.ex:295: Protox.Parse.get_kind/3
    (protox 1.2.2) lib/protox/parse.ex:212: Protox.Parse.add_field/5
    (protox 1.2.2) lib/protox/parse.ex:203: Protox.Parse.add_fields/4
    (protox 1.2.2) lib/protox/parse.ex:179: Protox.Parse.make_message/4
    (protox 1.2.2) lib/protox/parse.ex:164: Protox.Parse.make_messages/4
    (protox 1.2.2) lib/protox/parse.ex:128: Protox.Parse.parse_file/2
    (protox 1.2.2) lib/protox/parse.ex:108: Protox.Parse.parse_files/2
    (protox 1.2.2) lib/protox/parse.ex:18: Protox.Parse.parse/2
    (protox 1.2.2) expanding macro: Protox.__using__/1
    test/protox_test.exs:43: ProtoxTest (module)
    (elixir 1.11.0) expanding macro: Kernel.use/2
    test/protox_test.exs:43: ProtoxTest (module)
    (elixir 1.11.0) lib/kernel/parallel_compiler.ex:416: Kernel.ParallelCompiler.require_file/2
    (elixir 1.11.0) lib/kernel/parallel_compiler.ex:316: anonymous fn/4 in Kernel.ParallelCompiler.spawn_workers/7
nonode@nohost: "Tests finished"

08:12:28.398 [error] GenServer #PID<0.834.0> terminating
** (stop) killed
Last message: {:EXIT, #PID<0.833.0>, :killed}
State: %DynamicSupervisor{args: {:ok, %{extra_arguments: [], intensity: 3, max_children: :infinity, period: 5, strategy: :one_for_one}}, children: %{#PID<0.836.0> => {{GenServer, :start_link, :undefined}, :temporary, 5000, :worker, [GenServer]}, #PID<0.837.0> => {{GenServer, :start_link, :undefined}, :temporary, 5000, :worker, [GenServer]}}, extra_arguments: [], max_children: :infinity, max_restarts: 3, max_seconds: 5, mod: Supervisor.Default, name: {#PID<0.834.0>, Supervisor.Default}, restarts: [], strategy: :one_for_one}

I think this might actually be a really interesting piece of feedback that is being surfaced by muzak - that there’s a way to crash the application that puts it into a potentially unusable state. I’ll have to have a think about how best to capture this feedback and present it to the user so they can use it to improve their application. My first thought is to maybe allow the user to set a timeout for each mutation that, if passed, would mean something like “something has gone very wrong with your application so we’re going to print out a bunch of information about what happened while we were running mutation tests so you can figure out what happened to your application.”

How that output would look, however, is a tricky question that I’ll need some time to consider.
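
One possible shape for such a per-mutation timeout, as a sketch only: the helper run_with_timeout/2 is hypothetical and not part of Muzak, and it uses the standard Task.yield/2 plus Task.shutdown/2 pattern to give up on a run that never finishes.

```elixir
defmodule TimeoutSketch do
  # Hypothetical helper (not part of Muzak): runs `fun` in a separate process
  # and gives up after `timeout_ms` milliseconds.
  def run_with_timeout(fun, timeout_ms) do
    task = Task.async(fun)

    # Task.yield/2 returns nil on timeout; Task.shutdown/2 then kills the task
    # and returns a reply only if one arrived in the meantime.
    case Task.yield(task, timeout_ms) || Task.shutdown(task, :brutal_kill) do
      {:ok, result} -> {:finished, result}
      # Only reachable when the caller traps exits, since Task.async/1 links.
      {:exit, reason} -> {:crashed, reason}
      nil -> :timed_out
    end
  end
end

quick = TimeoutSketch.run_with_timeout(fn -> :tests_passed end, 1_000)
stuck = TimeoutSketch.run_with_timeout(fn -> Process.sleep(:infinity) end, 100)
IO.inspect({quick, stuck})
```

A :timed_out result would be the point at which the extra diagnostic information gets collected and printed.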

Ok, I’ve looked deeper into this as well, and the issue here is in how you’re using test_helper.exs. Because you’re doing some compilation in there of what are essentially test fixtures, and (at the moment) Muzak doesn’t re-require test_helper.exs before each test run for each mutation, those test structs aren’t being re-compiled with each mutation. They’re only being compiled once with the un-mutated code.

This is because the types of things I’ve seen put in test_helper.exs are either things that only need to be done once and aren’t affected by any possible mutations (like configuring ExUnit), or things that aren’t idempotent (like inserting test fixtures into a DB) and so can’t be run multiple times. That’s also because I mainly used applications and not libraries as tests for Muzak, and libraries have way more macros and compile-time stuff going on usually.

This might end up being another necessary configuration option, and I do think it will be good to support. I’ll need to do a bit more research and thinking first to come up with a better way to handle this.

Thank you for this detailed analysis! As soon as you have released 1.0.2, I’ll give it a try :slight_smile:.

I moved code out of test_helper.exs directly into the test files that really need it (it is much cleaner this way), but it doesn’t seem to change much.

  1. About --only: yes, I’m in the root directory of protox. I also checked the path with ls, which correctly sees the file.
  2. My library doesn’t spawn any processes, so does it mean a process launched by ExUnit gets stuck? Also, the sample you provided looks like a compilation error, so maybe once you fix point 1, it won’t happen anymore?
    Generally speaking, yes, I think it would be a great idea to have some kind of configurable timeout.

Anyway, if you need any more feedback or some kind of beta-tester, I’ll gladly help :wink:

I’ve just started trying Muzak (v1.0.2), and am having a problem. I’ve got some very simple code and a pretty much empty test suite. The code runs fine, the “test” runs fine when I do mix test, but when I do mix muzak I get (RuntimeError) cannot capture_log/2 because the :logger application was not started.

This is coming from ExUnit, not Muzak, but since the tests run fine without Muzak I’m thinking it may be something Muzak is doing or assuming – such as something that I, an Elixir semi-n00bie, didn’t know I was supposed to do. :wink: I’m guessing it’s due to the difference between having a whole application per se, with an OTP supervision tree including a logger app and all that, versus just some code, which is what I’m trying to do to keep it very simple.

Gory details:

  • Call stack:

      (ex_unit 1.11.2) lib/ex_unit/capture_log.ex:99: ExUnit.CaptureLog.add_capture/2
      (ex_unit 1.11.2) lib/ex_unit/capture_log.ex:70: ExUnit.CaptureLog.capture_log/2
      (muzak 1.0.2) lib/muzak/runner.ex:202: Muzak.Runner.run_silent/1
      (muzak 1.0.2) lib/muzak/runner.ex:37: Muzak.Runner.run_mutation/3
      (muzak 1.0.2) lib/muzak/runner.ex:15: anonymous fn/4 in Muzak.Runner.run_test_loop/2
      (elixir 1.11.2) lib/enum.ex:2181: Enum."-reduce/3-lists^foldl/2-0-"/3
      (muzak 1.0.2) lib/muzak/runner.ex:13: Muzak.Runner.run_test_loop/2
      (muzak 1.0.2) lib/muzak.ex:7: Muzak.run/1
    
  • Googling the error message gets me absolutely nothing other than the PR to ExUnit where that message was put in place. It seemed like too much digging to figure out what the error message may have been before that in order to Google it, but since the PR was merged 2017-10-07 I figured relevant results would be fairly likely to be recent.

  • I tried all the relevant-seeming things I could find or think of, mainly:

    • adding capture_log: false to the ExUnit.start() in test_helper.exs
    • putting config :conway, disable_logging: true in test.config
    • removing :logger from extra_applications in mix.exs
    • adding env: [ capture_log: false] in def application in mix.exs

    (Unfortunately I’m still not very familiar with the intricacies of mix.exs.)

  • The test suite contains only a meaningless test (test "nothing", do: assert true). I’m toying with the idea of “run an MT tool, pick a random survivor to kill, lather rinse repeat”.

  • The code is just a simple module with a few pattern-matching versions of the same function. Just in case it would be useful:

    defmodule Conway do
      def next_state(true, n) when n in [3,4], do: true
      def next_state(true, _), do: false
      def next_state(false, 3), do: true
      def next_state(false, _), do: false
    end
    
  • I’ve been learning Elixir for years now in my Copious Free Time, and one of the things putting a big dent in my CFT lately is… playing with mutation testing. :slight_smile:
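
For the “pick a random survivor to kill” loop described above, a first pair of real tests for the Conway module might look like the sketch below. The test names and the chosen counts are illustrative assumptions, and whether they kill any given mutant depends on which mutators Muzak applies.

```elixir
defmodule Conway do
  def next_state(true, n) when n in [3, 4], do: true
  def next_state(true, _), do: false
  def next_state(false, 3), do: true
  def next_state(false, _), do: false
end

ExUnit.start()

defmodule ConwayTest do
  use ExUnit.Case, async: true

  # Pins the values on both sides of the guard, so changing 3 or 4 is noticed.
  test "a live cell stays alive only for counts 3 and 4" do
    assert Conway.next_state(true, 3)
    assert Conway.next_state(true, 4)
    refute Conway.next_state(true, 2)
    refute Conway.next_state(true, 5)
  end

  # Pins the single count that revives a dead cell and its neighbours.
  test "a dead cell comes alive only for count 3" do
    assert Conway.next_state(false, 3)
    refute Conway.next_state(false, 2)
    refute Conway.next_state(false, 4)
  end
end
```

Each assertion fixes one boundary of the function clauses, which is the kind of check a mutant that alters a literal or a return value would be expected to trip.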

Hi Dave!

Is your code public somewhere? I can try debugging this issue if it is.

Cheers,
Devon

I’ve put it at

Thanks!

Thanks for that! I’ve just pushed v1.0.3 which fixes that issue.

Oh, so it wasn’t some n00bish mistake I was making after all? Whew! I’ll go check it out and pester you if I’m still having trouble. :slight_smile: Thanks again!

Just tried it. Works fine, though I was hoping for more mutants – maybe they’d be generated by the Pro version’s extra mutators. See you online tomorrow afternoon at the Meetup. :slight_smile: