What you’re describing here sounds an awful lot like property-based testing; more specifically, it sounds like model checking. I’m not sure whether you’re familiar with those techniques, but setting that aside, here are the questions that come to mind reading your example and description:
- You talk about “fully covering all possible input data sets”. I’m curious how you’re realistically going to do this. Most generative testing approaches only get you “close enough”, since it’s virtually impossible for them to explore the entire set of possible values for a sufficiently complex problem. There are approaches in formal methods that can do this, but even those tend to be computationally expensive. I’m not sure whether this is marketing speak or a real claim, but if it’s real I find it somewhat dubious (or maybe I’m cautiously optimistic).
- Something that’s implied but not stated is that when you find a failing test case you’ll be able to return a minimal counterexample. I suspect that attempting to save intermediate states and handle branching logic might interfere with your ability to “shrink” counterexamples down to usefully small sizes.
- My intuition is that the claim “runs faster than handwritten tests” just isn’t realistic, and it’s probably a non-goal for this sort of testing anyway. People use all sorts of cheats, such as mocks, to make handwritten tests run fast. The tradeoff for that speed is that such tests are inherently useless for proving correctness.
- “Automatic edge case detection: generate event properties from dynamic domains (see below).” I guess I’m missing this in your example, but I’m not seeing how you’re going to explore the state space efficiently. Guiding tests and stateful models to efficiently explore the state space for faults is a fairly active area of research right now; it’s generally regarded as a Hard Problem. One of the more recent additions to PropEr is simulated annealing, which tries to guide generators towards faults. There are other ideas, such as Peter Alvaro’s Lineage-Driven Fault Injection (my personal preference atm) or the system described in the Beginner’s Luck paper. Currently the best results come from manually tweaking generators to steer tests towards places where we suspect faults may be. As an aside, this is also why trying to generate test data from types is a bad idea: you need more control over the generators if you hope to find faults in your system.
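To make the coverage point concrete, here’s a tiny Python sketch (not your tool, not PropEr — just back-of-the-envelope illustration) of why exhaustive checking only works for trivially small domains:

```python
from itertools import product

# Toy property: integer addition is commutative.
def prop_commutative(a, b):
    return a + b == b + a

# Exhaustive checking is feasible here because the domain is tiny:
# all pairs of 4-bit integers is only 16 * 16 = 256 cases.
domain = range(16)
cases = list(product(domain, domain))
assert all(prop_commutative(a, b) for a, b in cases)
print(len(cases))  # → 256

# For a pair of 64-bit integers the space is 2**128 cases, far beyond
# anything you could enumerate. This is why generative tools sample
# rather than claim full coverage.
print(2 ** 128)
```

Any real system’s input space is closer to the second number than the first, which is why I’d want the “fully covering” claim spelled out.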
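On the shrinking point, here’s a toy sketch of what a shrinker does. The names (`prop`, `shrink_int`, `minimize`) are invented for illustration and aren’t any real library’s API; real shrinkers are much more sophisticated, but the greedy loop is the essence:

```python
import random

# Toy property that fails for all inputs >= 100.
def prop(x):
    return x < 100

def shrink_int(x):
    """Candidate 'smaller' values to try, most aggressive first."""
    return [0, x // 2, x - 1] if x > 0 else []

def minimize(x):
    """Greedily replace x with any smaller candidate that still fails,
    until no candidate fails; what's left is a local minimum."""
    while True:
        for c in shrink_int(x):
            if not prop(c):
                x = c
                break
        else:
            return x

random.seed(0)
# Find some large failing input by random generation...
failing = next(v for v in (random.randint(0, 10**6) for _ in range(1000))
               if not prop(v))
# ...then shrink it to the boundary.
print(minimize(failing))  # → 100, the minimal counterexample
```

The reason I raised the concern: shrinking relies on being able to re-run the property on cheaply-constructed smaller inputs. If each case drags along saved intermediate states and branching history, the candidates stop being cheap or even well-defined, and you end up reporting huge unminimized failures.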
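On the mocks point, a small illustration of the tradeoff I mean. Everything here (`RealStore`, `MockStore`, `cache_result`) is hypothetical, but the shape is the classic one: the mock buys speed by discarding exactly the behavior you’d need to verify.

```python
import time

# "Real" dependency: slow, and it actually enforces a contract.
class RealStore:
    def save(self, key, value):
        time.sleep(0.01)  # stands in for real I/O latency
        if not isinstance(key, str):
            raise TypeError("keys must be strings")

# Hand-rolled mock: instant, but accepts anything, so a test against it
# proves nothing about how the code behaves with the real store.
class MockStore:
    def save(self, key, value):
        pass

def cache_result(store, key, value):
    store.save(key, value)
    return value

# Against the mock this "passes" instantly, even with an integer key...
assert cache_result(MockStore(), 42, "v") == "v"
# ...while the same call against RealStore would raise TypeError.
print("mock test passed (and told us nothing)")
```

So a tool that’s genuinely exercising real behavior should expect to be slower than a mocked handwritten suite, not faster.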
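And on manually tweaking generators: here’s a rough sketch of what that looks like in practice. The fault (`buggy_abs` breaking only at 0), the 30% boundary weighting, and the helper names are all invented for illustration; real generator combinators (PropEr’s, propcheck’s) are richer, but the idea is the same — bias generation towards where you suspect faults live:

```python
import random

# Hypothetical buggy function: the fault only exists at one edge case.
def buggy_abs(x):
    if x == 0:
        return -1  # injected fault at the boundary
    return abs(x)

def prop(x):
    return buggy_abs(x) >= 0

def uniform_gen(rng):
    # Naive generator: uniform over a huge range.
    return rng.randint(-10**9, 10**9)

def tweaked_gen(rng):
    # Manually tweaked generator: 30% of the time, emit a suspected
    # boundary value instead of a uniform draw.
    if rng.random() < 0.3:
        return rng.choice([0, 1, -1, 2**31 - 1, -2**31])
    return rng.randint(-10**9, 10**9)

def trials_to_fault(gen, rng, limit=200_000):
    for i in range(1, limit + 1):
        if not prop(gen(rng)):
            return i
    return None  # budget exhausted without finding the fault

rng = random.Random(42)
# The uniform generator almost certainly never lands on 0 in the budget;
# the tweaked one typically finds the fault within a few dozen trials.
print(trials_to_fault(uniform_gen, rng))
print(trials_to_fault(tweaked_gen, rng))
```

This is also the argument against deriving generators purely from types: a type-derived generator is essentially `uniform_gen` — it has no idea which values are suspicious.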
That’s my feedback based on what you’ve written up. Currently I use PropEr (via the propcheck library) both at work and in my OSS work. IMO PropEr’s generators, shrinking, and stateful testing are currently best in class in Elixir if you can’t pay for EQC (EQC is truly in a league of its own). Any other offering would have to be very compelling to make us switch.