Mutation testing - Mutating BEAM bytecode

tmbb · April 30, 2019, 6:03pm

I’ve decided to look into mutation testing. There are two ways of mutating Elixir code so that we can introduce random bugs in our test suites. The first one is by manipulating Elixir AST, but that has two disadvantages:

It doesn’t work with erlang modules
It’s hard to manipulate an Elixir module from the outside
It requires mutating and compiling Elixir AST a lot of times, which sounds really slow

The obvious solution is to mutate BEAM bytecode or maybe the BEAM abstract code directly. I can extract the BEAM abstract code from the appropriate BEAM chunks for the given module. My question is:

Is there a way of compiling the tuples in the abstract code section into a BEAM module and reload the code? Is this a good approach or is the BEAM abstract code an implementation detail on which I shouldn’t depend?

Maybe I should use the bytecode instead? The question of how to compile the mutated module also applies to the bytecode too.

hauleth · April 30, 2019, 9:29pm

Yes, there is such way. You can look into :cover module for example how it is done (because it is exactly how :cover works, it inserts additional code into AST to track which lines are executed and how much). I was thinking about writing something like that on my own though.

About “reloading” code, then as soon as you compile your module it will replace the old code.

tmbb · April 30, 2019, 10:40pm

I’m trying to insert or delete code at the level of the BEAM bytecode and compile that. Not at the level of the Elixir or Erlang bytecode (even though working on that level might be a better idea)

hauleth · April 30, 2019, 10:52pm

If you compile with debugging symbols then there is AST in the BEAM files as Abst chunk. Check out beam_lib documentation.

tmbb · April 30, 2019, 11:31pm

I know that, but that still doesn’t answer my question… I already know where to get the AST and the bytecode from. I want to be able to compile either the Erlang AST (available in the Abst chunk) or the BEAM byte code.

michalmuskala · May 1, 2019, 10:21am

FWIW compiling from the Erlang Abstract format won’t be any slower than compiling from the Elixir AST - there’s generally little overhead compared to other compiler passes.

I think there are overall 3 places that make the most sense for something like this to hook into:

Elixir AST - this is the closest to the real code and you’re guaranteed that by mutating it, you always produce code users could write. You can also turn the mutated code into source and display to the users for debugging.
Core Erlang - this is an intermediate compiler language designed specifically for ease of machine processing. It has all implicit branches explicit (like a final catch-all clause in case that raises the case clause error), variable scoping is explicit (this makes Erlang AST complex for machine processing - variable scoping rules are really weird), etc. It’s also relatively close to Erlang and Elixir so it would be possible to show some representation to the user that made sense for debugging.
Bytecode/assembly (assembly being a term representation of the bytecode) - this will be the fastest to produce “mutants”, but it’s also extremely easy to produce invalid code or code that you could never write in Elixir or Erlang. It’s also almost impossible to show a meaningful representation of the mutated code back to the user

For Core Erlang or Bytecode you can compile using the compile:forms/2 with from_asm or from_core options (or from Erlang AST without any options).

tmbb · May 2, 2019, 9:09pm

after some thought, I would add a 4th place: mutating the fully macroexpanded Elixir code. This has the disadvantage that one might be changing things unrelated to what the user would ever write (we’d be changing the macro exapansion of macros in unrelated modules), but it has a big advantage. It allows me to generate mutant schmata. An example in Java:

class MutateMe {
  public static boolean MUTANT_1;
  public static boolean MUTANT_2;
  public static boolean MUTANT_3;

  public boolean aMethod(int i) {
    if (MUTANT_1) {
      return i > 10;
    } else if (MUTANT_2) {
      return i == 10;
    } else if (MUTANT_3) {
      return i != 10;
    } else {
      // original code
      return i < 10;
    }
  }

}

This allows us to gather a lot of mutations in the same file. This reduces the time we need to compile the mutations. Compiling even a small Elixir module takes ~10ms on my machine. I haven’t tested it yet, but I bet compiling a module with 10 embedded mutations will take much less than compiling 10 modules with a single mutation.

One can use mutant schemata with Elixir code that hasn’t been completely expanded, but in that case one must be more careful when inserting case statements in places that expect a certain form of AST to work with.

This format is also really tempting because it’s probably easy to insert mutant schemata.

tmbb · May 2, 2019, 9:16pm

The problem with muant schemata is that they might cause compilation errors, and in that case it’s not obvious which mutant one must remove from the file. This means that “mutant case statements” should only be inserted in “safe places”. Maybe the best approach is a mix of “mutant case statements” and “normal” ast manipulation.

tmbb · May 2, 2019, 11:35pm

An example of a mutant schema for Elixir (from my experimental library named Darwin).

iex(87)> Darwin.mutations_for(1, quote(do: 1 + 2))

00:32:41.569 [debug]
(case(Darwin.ActiveMutation.get()) do
   1 ->
     &Kernel.-/2

   2 ->
     &Kernel.*/2

   3 ->
     &Kernel.//2

   _other ->
     &Kernel.+/2
 end).(
  case(Darwin.ActiveMutation.get()) do
    4 ->
      -1

    5 ->
      0

    6 ->
      2

    _other ->
      1
  end,
  case(Darwin.ActiveMutation.get()) do
    7 ->
      -1

    8 ->
      0

    9 ->
      1

    _other ->
      2
  end
)

This generates 9 different mutations. Some of them mutate the arithmetic operator, and some of them mutate the integer arguments. The function call Darwin.ActiveMutation.get() gets the index of the active mutation from an Agent (in this case an integer from 1 to 9). This allows us to pick the right mutation at runtime, without the need to compile the file more than once.

josevalim · May 3, 2019, 7:44am

Can you generalize this idea and pass anonymous functions around? The anonymous function would receive the whole environment and return the new environment. This way you have absolute control without having to compile the code again. Plus this kind of wrapping could be used could be used for things like cover (in cover you would execute the original code but count is as annotated). You just need to make sure that the code you are lifting out to an anonymous function does not define any variable, as that would change the semantics. I think the core pass already lifts all of the variables from the middle of the expressions.

tmbb · May 3, 2019, 7:14pm

I don’t understand your question. I’m already generating anonymous functions. The case statement returns an anonymous function, which I call on the arguments. Could you please explain what you mean in a different way?

So you’re saying I should work in Erlang Core? Currently I’m working with Elixir’s AST, which is much nicer, but Erlang Core has the advantage that it has fully qualified function names and it’s much easier to mutate in valid ways. In elixir, everytime I mutat something inside a macro I can cause compilation errors… And with this approach (mutant schemata), I can’t have any compilation errors… Is it easy to map line numbers from Core Erlang to Elixir? Is there a parser from which I can get the AST of Core Erlang? Or is that just a subset of the normal Erlang syntax? I can’t find the answers to these questions in the docs.

Even working with fully macro-expanded elixir is already much nicer in terms of mutations, and maybe there’s no need to go deep into Erlang Core.

I’ve added support for other mutations and made it so that the returned AST is simpler. You can see here an example from the test cases:

  test "operator: ||" do
    assert quote(do: x || y) |> Darwin.mutate() |> Darwin.format() == """
           (fn a, b ->
              case(Darwin.ActiveMutation.get()) do
                0 ->
                  a && b
                _other ->
                  a || b
              end
            end).(x, y)
           """
end

You can find the full test suite (so far; the tests don’t cover all mutations yet) here: https://github.com/tmbb/darwin/blob/master/test/darwin_test.exs.

The naïve way of doing this would be to recursively wrap every statement into it’s own function, which would notify a genserver somewhere that the line has been run. It oes have the problem you mention, which is the fact that variable definitions can get out of scope.

I guess the best solution would be to use blocks on the fully macroexpanded source code. Something like an AST transform that transforms this:

def f() do
  x1()
  x2()
  x3()
  x4()
end

into this:

def f() do
  (
    x1()
    CoverageServer.notify_line(line_1)
  )
  
  (
    x2()
    CoverageServer.notify_line(line_2)
  )

  (
    x3()
    CoverageServer.notify_line(line_3)
  )

  (
    x4()
    CoverageServer.notify_line(line_4)
  )
end

tmbb · May 3, 2019, 9:41pm

For both coverage and AST mutation, the hardest thing of working with Elixir AST is that macros can depend on the literal AST, and sprinkling case statements inside macros is not a good idea. The problem of using Core erlang or something lower level is that coverage reports become harder to generate and interpret…

Another problem is that it’s no very easy (if possible all) to get the fully expanded Elixir AST for a module from the outside of the module. The core erlang AST can be extracted from the BEAM files, and I can manipulate that, but that requires:

Learning to work with Erlang’s AST (which isn’t as nice as Elixir’s) and
Translating the changes from the Erlang code to Elixir

This last point is probably not too bad because the changes we use in mutation testing are supposed to be very simple. The PITest mutation testing framework deals with this by not showing the mutated line. It shows a textual description of what it has done instead. For example: “Negated condition”, “Changed conditional boundary” and stuff like this.

tmbb · May 4, 2019, 11:05pm

Let’s see an example of how simple it is to generate mutations on the Erlang AST. We want to replace the + operator in the expression a + b. To help generate Erlang AST expressions (something which is unreasonably complex using the default functions in the erlang compiler library), I’ve written the Darwin.expression!/2 function, which does exactly that.

iex(1)> expr = Darwin.ErlUtils.expression!("a + b.")
{:op, 1, :+, {:atom, 1, :a}, {:atom, 1, :b}}
iex(2)> Darwin.mutate(expr)

23:18:37.812 [debug] Darwin.mutate/1
>>> Input:
a + b


>>> Output:
fun (_darwin_a@1, _darwin_b@1) ->
        case
          'Elixir.Darwin.Mutator.Helpers':do_get_active_mutation()
            of
          1 ->
              'Elixir.Darwin.Mutator.Helpers':do_arith_sub(_darwin_a@1,
                                                           _darwin_b@1);
          2 ->
              'Elixir.Darwin.Mutator.Helpers':do_arith_mul(_darwin_a@1,
                                                           _darwin_b@1);
          3 ->
              'Elixir.Darwin.Mutator.Helpers':do_arith_div(_darwin_a@1,
                                                           _darwin_b@1);
          _ ->
              'Elixir.Darwin.Mutator.Helpers':do_arith_add(_darwin_a@1,
                                                           _darwin_b@1)
        end
end(a, b)


{:call, 0,
 {:fun, 0,
  {:clauses,
   [
     {:clause, 0, [{:var, 0, :_darwin_a@1}, {:var, 0, :_darwin_b@1}], [],
      [
        {:case, 0,
         {:call, 0,
          {:remote, 0, {:atom, 0, Darwin.Mutator.Helpers},
           {:atom, 0, :do_get_active_mutation}}, []},
         [
           {:clause, [generated: true, location: 0], [{:integer, 0, 1}], [],
            [
              {:call, 0,
               {:remote, 0, {:atom, 0, Darwin.Mutator.Helpers},
                {:atom, 0, :do_arith_sub}},
               [{:var, 0, :_darwin_a@1}, {:var, 0, :_darwin_b@1}]}
            ]},
           {:clause, [generated: true, location: 0], [{:integer, 0, 2}], [],
            [
              {:call, 0,
               {:remote, 0, {:atom, 0, Darwin.Mutator.Helpers},
                {:atom, 0, :do_arith_mul}},
               [{:var, 0, :_darwin_a@1}, {:var, 0, :_darwin_b@1}]}
            ]},
           {:clause, [generated: true, location: 0], [{:integer, 0, 3}], [],
            [
              {:call, 0,
               {:remote, 0, {:atom, 0, Darwin.Mutator.Helpers},
                {:atom, 0, :do_arith_div}},
               [{:var, 0, :_darwin_a@1}, {:var, 0, :_darwin_b@1}]}
            ]},
           {:clause, [generated: true, location: 0], [{:var, 0, :_}], [],
            [
              {:call, 0,
               {:remote, 0, {:atom, 0, Darwin.Mutator.Helpers},
                {:atom, 0, :do_arith_add}},
               [{:var, 0, :_darwin_a@1}, {:var, 0, :_darwin_b@1}]}
            ]}
         ]}
      ]}
   ]}}, [{:atom, 1, :a}, {:atom, 1, :b}]}

The returned expression is Erlang AST, which is of course pretty much unreadable. The interesting part happens in the logs, where I show the corresponding Erlang code. I reproduce it below for clarity:

fun (_darwin_a@1, _darwin_b@1) ->
        case
          'Elixir.Darwin.Mutator.Helpers':do_get_active_mutation()
            of
          1 ->
              'Elixir.Darwin.Mutator.Helpers':do_arith_sub(_darwin_a@1,
                                                           _darwin_b@1);
          2 ->
              'Elixir.Darwin.Mutator.Helpers':do_arith_mul(_darwin_a@1,
                                                           _darwin_b@1);
          3 ->
              'Elixir.Darwin.Mutator.Helpers':do_arith_div(_darwin_a@1,
                                                           _darwin_b@1);
          _ ->
              'Elixir.Darwin.Mutator.Helpers':do_arith_add(_darwin_a@1,
                                                           _darwin_b@1)
        end
end(a, b)

The above is an anonymous function which is called on a and b. The anonymous function uses 'Elixir.Darwin.Mutator.Helpers':do_get_active_mutation() to decide which branch to use. The do_get_active_mutation() function requests the current mutation number from a genserver (using an ETS table is probably slightly faster but it can lead to race conditions, so I’ll stick with a genserver for now). Depending on the mutation number, the anonymous function will pick a branch from the case statement.

The branches of the case statement are a little odd. For example, I’m using 'Elixir.Darwin.Mutator.Helpers':do_arith_sub/2 instead of simply a - b. These functions are just thin wrappers around the Erlang operators (+, -, * and /) because the AST is slightly easier to manipulate that way. Unlike Elixir, where {+, [], args} is a simple to manipulate as {f, [], args}, in Erlang the AST for these expressions is very different, and using named functions makes it simpler to treat operators and functions in the same way (everything is just a function…). Because there are no macros at this point (everything made from simple referentially transparent function calls), using a function or an operator is equivalent.

Sanity check

Now, the truth istaht I don’t have an end-to-end system yet. I’m working from the bottom up first generating mutations, then recompiling the code and then integrating the code into a test suite. The roeadmap is more or less the following:

Implement all kinds of commonly used mutations. This seems to be very simple, as Erlang is very easy to manipulate. In particular, it’s trivial to determine whether the + operator in Erlang is the actual plus sign from mathematics, because it does not depend on whether you’re importing some weird module or whether you’re using it insid e of a macro that actually compiles you code into Java or something like that. This means that mutating the + operator is always “safe” and won’t cause weird compiler errors.

Important question: Should mutations be deterministic? One mutation I want to have is to replace some values (constants, expressions and variables) by constant values. Maybe I should generate random values like StreamData does. Or maybe I should use seom simple values that often break code, like booleans, 1, 0, etc.

Traverse the Erlang AST in order to find places that can be mutated. This is tricky, because the Erlang AST is not as simple as Elixir’s. There are some good docs on the format, so it shouldn’t be too hard. Unfortunately, it seems like Erlang’s AST is actually an implementation detail, so you should be using the erl_syntax module to work with it. This module adds a pretty heavy layer of indirection, which can sometimes stand in the way of actually doing interesting things… I’ll work directly with the Erlang AST until everything is stable enough for me to switch to the supposedly stable erl_syntax module.
Map mutations to lines of code. This seems doable, although I haven’t tried it yet. The Erlnag AST is annotated with location information, so I guess I can make it work.
Integrate the mutation framework with a test suite. The obvious integration point is ExUnit, of course.
Rewrite the whole thing in Erlang, because there isn’t a good reason this should be in Elixir. Rewriting this in Erlang (except for the part that integrates with ExUnit, of course) would make sense, becaus eI’m not working with Elixir’s AST. This means that the quoting and unquoting mechanisms we all love in Elixir are not needed here. This leaves me with few reasons why this should be written in Elixir. The main reason is that the current iteration of Darwin uses some custom macros to cut down on pointless repetition (it cuts the code size into about 1/5 of what it would be without macros), but I can always compile those modules into Erlang and distribute the Erlang code

Regarding point 4 (integration with ExUnit), I’m thinking of something like this:

defmodule MyModuleTest do
  # Mutate MyModule and run it through the test suite
  use Darwin.Test, module: MyModule

  test "some test cases" do
    # ...
  end
end

The use Darwin.Test statement would define customized test macros, as well as other APi compatible versions of ExUnit functions and macros.

The test macro could expand into something like this:

# mutation indices are consecutive integers (they could be anything else, though...)
for mutation_index <- 0..mutation_max_index do
  # Activate the mutation
  Darwin.ActiveMutation.set(mutation)
  # Run the code inside the test case
  # The mutated code knows how to run the correct branch from the active mutation
  # No need to recompile anything!
end
# Turn off all mutations
Darwin.ActiveMutation.set(nil)

As you can see, this avoids recompiling code. It will generate some very large modules, as it must inject code for all mutaions, but it makes testing the mutations very fast! The runtime overhead for each mutation point (each of which can contain several mutations) is a genserver call. It can even cache the mutated modules somewhere so you don’t pay the compilation price for modules you haven’t changed.

Disadvantages

Working with the fully macro-expanded Erlang abstract format means it’s not very easy to map the changes back into Elixir code. For example, I’ll probably never be able to get coverage reports as explicit and informative as these. I’ll be able to report the information on the table to the right, though (line number and kind of mutation performed).

For some perspective, even very simple Elixir modules take about 10ms to compile. If I had to recompile a module each time I wanted to test a mutation, I wouldn’t be able to test more than 10 mutations per second, which is very slow for any reasonable codebase… This could be made faster by starting other Erlang nodes in parallel and distributing the work between the nodes, but it’s probably not worth the effort.

tmbb · May 5, 2019, 12:29am

Now, an example of mutating relational operators (<, <=, ==, …):

iex(8)> Darwin.ErlUtils.expression!("a < b.") |> Darwin.mutate()

01:26:31.883 [debug] Darwin.mutate/1
>>> Input:
a < b


>>> Output:
fun (_darwin_a@1, _darwin_b@1) ->
        case
          'Elixir.Darwin.Mutator.Helpers':do_get_active_mutation()
            of
          1 -> true;
          2 -> false;
          3 ->
              'Elixir.Darwin.Mutator.Helpers':do_relational_greater_than(_darwin_a@1,
                                                                         _darwin_b@1);
          4 ->
              'Elixir.Darwin.Mutator.Helpers':do_relational_greater_than_or_equal(_darwin_a@1,
                                                                                  _darwin_b@1);
          5 ->
              'Elixir.Darwin.Mutator.Helpers':do_relational_equal(_darwin_a@1,
                                                                  _darwin_b@1);
          6 ->
              'Elixir.Darwin.Mutator.Helpers':do_relational_not_equal(_darwin_a@1,
                                                                      _darwin_b@1);
          7 ->
              'Elixir.Darwin.Mutator.Helpers':do_relational_less_than_or_equal(_darwin_a@1,
                                                                               _darwin_b@1);
          _ ->
              'Elixir.Darwin.Mutator.Helpers':do_relational_less_than(_darwin_a@1,
                                                                      _darwin_b@1)
        end
end(a, b)

Besides replacing the operator, we also replace the value of the expression by true and false.

tmbb · May 5, 2019, 1:35am

Some unexpected challenges: instead of compiling a and b into A andalso B, elixir compiles a and b into something like:

case _a@1 of
  false -> _b@1;
  true -> true;
  __@1 -> erlang:error({badbool, 'or', __@1})
end

I can identify these patterns as corresponding to an and operator, but it seems hacky

After solving this problem, the next problem to solve is statement deletion. This is an important kind of mutation, but one we must use with care because we can’t delete statements that define variables! That would cause compilation errors, and we can’t have any compilation errors with this approach…

tmbb · May 5, 2019, 10:40am

The code in the above post corresponds to the or operator. My mistake.

I have slightly extended the mutation engine so that it recursively mutated the aguments of the operators. For example:

iex(27)> Darwin.ErlUtils.expression!("A + (B * C).") |> Darwin.mutate()

10:47:29.974 [debug] Darwin.mutate/1
>>> Input:
A + B * C


>>> Output:
fun (_darwin_a@1, _darwin_b@1) ->
        case
          'Elixir.Darwin.Mutator.Helpers':do_get_active_mutation()
            of
          4 ->
              'Elixir.Darwin.Mutator.Helpers':do_arith_sub(_darwin_a@1,
                                                           _darwin_b@1);
          5 ->
              'Elixir.Darwin.Mutator.Helpers':do_arith_mul(_darwin_a@1,
                                                           _darwin_b@1);
          6 ->
              'Elixir.Darwin.Mutator.Helpers':do_arith_div(_darwin_a@1,
                                                           _darwin_b@1);
          _ ->
              'Elixir.Darwin.Mutator.Helpers':do_arith_add(_darwin_a@1,
                                                           _darwin_b@1)
        end
end(A,
    fun (_darwin_a@1, _darwin_b@1) ->
            case
              'Elixir.Darwin.Mutator.Helpers':do_get_active_mutation()
                of
              1 ->
                  'Elixir.Darwin.Mutator.Helpers':do_arith_add(_darwin_a@1,
                                                               _darwin_b@1);
              2 ->
                  'Elixir.Darwin.Mutator.Helpers':do_arith_sub(_darwin_a@1,
                                                               _darwin_b@1);
              3 ->
                  'Elixir.Darwin.Mutator.Helpers':do_arith_div(_darwin_a@1,
                                                               _darwin_b@1);
              _ ->
                  'Elixir.Darwin.Mutator.Helpers':do_arith_mul(_darwin_a@1,
                                                               _darwin_b@1)
            end
    end(B, C))

tmbb · July 25, 2019, 1:59am

After reading @dnlserrano’s post about his new exavier library (which actually has an end-to-end system which you can run now, and a much cooler name IMO), I’ve decided to work a little more on Darwin. We must remember that Darwin is commited to using mutant schemata (as explained above). This means that, by design, Darwin has the following problems:

I might mutate code deep inside some macro in a way that’s very opaque to the user. I’m not even sure that it will be easy to detect erlang code expanded by macros as compared to code compiled from “normal elixir” code.
I can’t replace operators in guards (for very good reasons, erland doesn’t allow arbitrary functions inside guards). Guards often contain comparison operators and are critical to the control flow of the program.

Point 1 seems unsolvable, except by clever heuristics.

Point 2, on the other hand, seems solvable. Most (all) code with guards can be compiled to code without guards and with nested case statements replacing the guards. Because guards are all nice and functional, they can be manipulated in a safe way (probably killing performance, though). This seems tractable with only a moderate amount of effort.

tmbb · July 25, 2019, 7:46pm

New update:

As you might remember, I used to generate inline functions with large case statements inside the Erlang AST. That makes the generated code a little harder to follow and adds more code to the Erlang file than it has too. I’ve just “discovered” that I can just move all that complexity into Elixir functions, and then simply insert remote calls to those functions in the Elixir code.

So now the generated code is much simpler:

  test "operator: +" do
    # `mutate/1` is a helper that inserts code as if the module were named 'Elixir.MyModule'
    {abstract_code, ctx} = mutate("A + B.")
    # Assert that we generate the correct erlang code.
    # We could compare the AST instead, but visual inspection of the code is more informative.
    assert ErlUtils.pprint(abstract_code) == """
           'Elixir.Darwin.Mutators.Default.OpAddMutator':'__do_mutate__'('Elixir.MyModule',
                                                                         0, A, B)
           """

    # Assert the correct number of mutations is generated.
    # One could compare the set of mutations, but that seems a bit useless...
    # We'd be repeating an enormous amount of code for little benefit.
    # Maybe one day (when things are more stable we should add it)
    assert Context.nr_of_mutations(ctx) == 5
  end

You can see in the test above that we generate 5 mutations for the + operator, but all of them are contained “inside” the 'Elixir.Darwin.Mutators.Default.OpAddMutator':'__do_mutate__'/4 function. The arguments to the function are the following:

the module name of the module we’re mutating
the mutation count when this mutation was inserted; this is useful because it allows us to do something like: corrected_index = mutation_index - mutation_count, and match the corrected_index inside the case statement. That means we don’t have to create new case statements! We just have to call the function with the correct mutation count
the left operand for the + operator
the right operand

I have defined functions that mutate all arithmetic and comparison operators. It’s trivial for users to add new custom mutators or mutate the abstract code with a completely custom list of mutators. A mutator is a behaviour with a mutate() callback (well, I haven’t actually defined the behaviour as such, but that’s the idea).

I now need to implement the infrastructure to traverse the erlang abstract code looking for places to mutate. As things stand, I’m simply ignoring nodes my mutators don’t understand. I should be looking for places where I could insert mutations among their children. Fortunately, although Erlang’s AST is a little more complex than Elixir’s, It’s quite simple, so implementing this doesn’t seem too much work.

tmbb · August 3, 2019, 5:10pm

Mutating Elixir vs Mutating Erlang

I’m commited to the approach that Darwin should mutate Erlang abstract code. This has the advantage that Erlang abstract code is much easier to work with (all macros have been expanded at this point) and it also means Darwin can be used for mutation testing of Erlang code.

The exavier project (by @dnlserrano) is already a good option for people interested in mutating Elixir directly. And it has an important advantage over Darwin, which is that it actually works right now.

Mutators

The design of Darwin is meant to be completely modular. There will be a couple predefined mutators, but users can write their own. A mutator will be a behaviour, which implements the following callback: mutate(AbstractCode.t(), Context.t(), list(mutator())) :: {:ok, {AbstractCode.t(), Context.t()}} | :error

To mutate a module, you pass a list of mutators. Darwin attempts the mutators in order, from first to last. If the mutator returns a tuple of the form {:ok, {ac, ctx}}, then the context is updated and the abstract code ac is returned. If the mutator returns an :error, Darwin attempts the following mutator and so on.

Mutators receive the list of mutators as an argument, so that they can recursively apply the mutators along the abstract code. Why going this way instead of implementing some kind of generic tree traversal algorithm? If I implemented a generic algorithm to traverse the Erlang abstract code, there would be no simple way to have the “parent node” control how the children are mutated. My way is more complex, but it gives each mutator absolute control on how to mutate children

Identifying Some Common Elixir Macros

Th or operator is a macro that expands into the following:

iex> ExToErl.elixir_source_to_erlang_abstract_code("a or b")
{:case, 1, {:var, 1, :_a@1},
 [
   {:clause, [generated: true, location: 1], [{:atom, 0, false}], [],
    [{:var, 1, :_b@1}]},
   {:clause, [generated: true, location: 1], [{:atom, 0, true}], [],
    [{:atom, 0, true}]},
   {:clause, [generated: true, location: 1], [{:var, 1, :__@1}], [],
    [
      {:call, 1, {:remote, 1, {:atom, 0, :erlang}, {:atom, 1, :error}},
       [{:tuple, 1, [{:atom, 0, :badbool}, {:atom, 0, :or}, {:var, 1, :__@1}]}]}
    ]}
 ]}

Or, using the string representation:

iex> ExToErl.elixir_source_to_erlang_source("a or b") |> IO.puts()
case _a@1 of
  false -> _b@1;
  true -> true;
  __@1 -> erlang:error({badbool, 'or', __@1})
end

:ok

If I implement an or mutator, it should recognize this Erlang code and mutate it in intelligent ways. It should not mutate the case statement as if it were a normal case statement… Pinging @josevalim - should I depend on the fact that the Elixir ocmpiler generates this code? I believe that nowadays the Erlang output is mostly stable…

tmbb · August 5, 2019, 1:15am

I can finally traverse the full Erlang AST (and mutate the parts I care about, of course). There are still many useful mutators, missing, but the skeleton for the mutation part is pretty much done. There is no integration with ExUnit yet, though.

I’m now testing the mutated code for the first time. Up until now, I’ve just tested the generated abstract code, but I’lve never tested the dynamic properties of the code which is basically answering the question: “Does this mutated code do what it should?”.

An example of Darwin traversing a tuple and mutating the not operator inside the tuple:

    test "tuple" do
      {abstract_code, ctx} = mutate_elixir("{a, not b, c}")
      # Assert that we generate the correct erlang code.
      assert Erlang.pprint(abstract_code) == """
             {_a@1,
              'Elixir.Darwin.Mutators.Default.OpStrictNotMutator':do_mutate('Elixir.MyModule',
                                                                            0, _b@1),
              _c@1}
             """

      # Assert the correct number of mutations is generated.
      assert Context.nr_of_mutations(ctx) == 3

In the code above, I use my ExToErl project to compile the Elixir expression into Erlang code. Darwin actually recognizes the macroexpanded code for the not operator (as well as the and and or operator) and mutates it appropriately.

Darwin can now:

Mutate literal lists, tuples and maps (there might be some bugs with map updates)
Mutate strings, charlists and atoms
Traverse the Erlang AST and recursively apply user-defined mutators to specific nodes

The following is still missing:

Mutators for the arithmetic operators (+, *, /, -)
Mutators for comparison operators (<, <=, ==, etc)
Mutators for bitshift operators (&&&, >>>, etc)
Mutators for integers, floats and booleans (I think we should treat booleans differently from other atoms)
Integration with ExUnit (rerunning the test suite with different mutations, reporting errors, gathering statistics, etc)