Hey there @tmbb! Thanks so much for engaging. It means a lot.
To reduce the need to recompile a module lots of times, I’ve also tried to do it in a different way by mutating the Erlang AST in a way that allowed me to toggle mutations on and off without recompiling the code
That’s very smart! I might give that a go if I find it can speed up mutation testing significantly, which I’m thinking it will. Great suggestion!
From a cursory reading of the code, it looks like you mutate all operators in the file at the same time instead of one by one. Is that so? Shouldn’t one mutate one operator at the time to get more precise results?
You’re absolutely right, I do mutate all in one go. That by itself can be seen as a limitation, though it should be fairly easy to change. I’ve thought about it, and AFAIK mutation testing doesn’t say anything about the number of mutations each mutant should have. Still, you’re right that what I’m doing might make the output more verbose, and it might even make it harder to understand which change is needed. I’ll definitely consider this change.
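To make the "one operator at a time" idea concrete, here is a minimal sketch on plain Elixir quoted expressions. It is not how exavier works internally: the module name OneAtATime and its functions are hypothetical, and only the + to - swap is covered. It assumes Elixir 1.12 or later for the stepped range.

```elixir
defmodule OneAtATime do
  # Hypothetical helper, not part of exavier.
  # Returns one mutated AST per binary `+` found in `ast`,
  # each with exactly one `+` replaced by `-`.
  def mutants(ast) do
    count = count_plus(ast)
    # `//1` keeps the range empty when there is nothing to mutate.
    for target <- 0..(count - 1)//1, do: mutate_nth(ast, target)
  end

  # Counts binary `+` calls in traversal order.
  defp count_plus(ast) do
    {_ast, count} =
      Macro.prewalk(ast, 0, fn
        {:+, _meta, [_left, _right]} = node, acc -> {node, acc + 1}
        node, acc -> {node, acc}
      end)

    count
  end

  # Replaces only the `+` whose traversal index equals `target`.
  defp mutate_nth(ast, target) do
    {mutated, _index} =
      Macro.prewalk(ast, 0, fn
        {:+, meta, [left, right]}, ^target -> {{:-, meta, [left, right]}, target + 1}
        {:+, _meta, [_left, _right]} = node, index -> {node, index + 1}
        node, index -> {node, index}
      end)

    mutated
  end
end

ast = quote do: a + b + c
mutants = OneAtATime.mutants(ast)
Enum.each(mutants, fn mutant -> IO.puts(Macro.to_string(mutant)) end)
```

Each mutant then differs from the original in a single place, so a failing test points at exactly one change. The cost is one test run per mutant instead of one per file.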
Some more context regarding this second question and answer:
This approach of mutating all in one go was a trade-off I felt I could get away with for testing the feasibility of this PoC.
See, I have this problem: right now I’m not running each test ... do block individually, but instead I’m running the whole test module (e.g., HelloWorldTest). This has one clear disadvantage, which I’ll explain below with an example:
defmodule HelloWorld do
  def sum(a, b), do: a + b
  def divide(a, b), do: div(a, b)
end
defmodule HelloWorldTest do
  use ExUnit.Case

  test "when testing sum" do
    assert HelloWorld.sum(3, 0) == 3
  end

  test "when testing divide" do
    # div/2 is integer division, so div(5, 2) is 2
    assert HelloWorld.divide(5, 2) == 2
  end
end
If I change code to the following:
defmodule HelloWorld do
  def sum(a, b), do: a - b # changed from + to - via AOR1
  def divide(a, b), do: div(a, b)
end
I will be running both tests, instead of just the test for sum/2 (i.e., "when testing sum", which was the only one for which the corresponding source code changed). To maximise the number of mutations I can catch while running the entire test module, I mutate all in one go. Does that make sense? Maybe it doesn’t…
AFAIU, finding out what tests I should run per source code change is hard. But I might not be seeing something very obvious. Let me know.
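For what it’s worth, running a single test is something ExUnit can already do once you know which test you want, because every test is tagged automatically with its full name under the :test key. Below is a minimal sketch, written as a standalone script, that runs only "when testing sum" through include/exclude filters. It does not solve the hard part (mapping a source change to the tests that cover it), and it is an assumption on my side that this fits into how exavier drives ExUnit.

```elixir
# Start ExUnit without scheduling an automatic run at exit.
ExUnit.start(autorun: false)

defmodule HelloWorld do
  def sum(a, b), do: a + b
  def divide(a, b), do: div(a, b)
end

defmodule HelloWorldTest do
  use ExUnit.Case

  test "when testing sum" do
    assert HelloWorld.sum(3, 0) == 3
  end

  test "when testing divide" do
    assert HelloWorld.divide(5, 2) == 2
  end
end

# Exclude every test, then re-include the one we care about.
# ExUnit prefixes the description with "test ", hence the full name below.
ExUnit.configure(exclude: [:test], include: [test: "test when testing sum"])

# Runs the loaded test modules and returns a map of counters.
result = ExUnit.run()
IO.inspect(result, label: "ExUnit result")
```

From the command line, the equivalent is mix test with the --only option or a file:line argument.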
If you have some ideas on how to improve this aspect of exavier, or if you have a good heuristic or alternative, let me know as well @tmbb. Again, thank you for your kind comment. I also appreciate you challenging my design.
Let’s make it better together! 