Questions about Property Testing / Stream Data

josevalim · October 19, 2017, 10:01am

Hello folks,

There have been some doubts regarding StreamData and property testing in Elixir, so we have decided to open a thread to answer common questions we have been asked in person and seen around.

What is property-based testing?

Generally when we write tests, we write example-based tests. We need to come up with values when writing our test cases:

assert String.contains?("foobar", "foo")

The limitation of example-based tests is that they depend entirely on us coming up with corner cases, and we often make mistakes or fail to see important ones. With property-based testing, we define properties and let the library generate random data for our tests:

check all left <- string(),
          right <- string() do
  assert String.contains?(left <> right, left)
  assert String.contains?(left <> right, right)
end

Now every time you run the property, 100 examples will be generated. Common corner cases, such as "", will be tested frequently and help you find bugs in your code. The tricky part of property-based testing is finding the properties we want our code to hold. Once we find such properties, we can use them to complement our example-based tests.
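For readers who want to run the property above, here is a minimal sketch of a standalone script. It assumes the stream_data package, fetched here with Mix.install/1 (which needs network access and a reasonably recent Elixir). It uses string(:printable) because, as far as I know, current stream_data releases expect a kind argument. The module and property names are made up for illustration.

```elixir
# Assumption: network access, so Mix.install/1 can fetch stream_data.
Mix.install([{:stream_data, "~> 1.0"}])

# Tests defined below run automatically when the script exits.
ExUnit.start()

defmodule ContainsPropertyTest do
  use ExUnit.Case, async: true
  # Brings in `property`, `check all` and the StreamData generators.
  use ExUnitProperties

  property "a concatenation contains both of its parts" do
    check all left <- string(:printable),
              right <- string(:printable) do
      assert String.contains?(left <> right, left)
      assert String.contains?(left <> right, right)
    end
  end
end
```

Saved as a .exs file and run with the elixir command, this should report one passing property.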

At ElixirConf US 2017, we announced that a property testing library will be part of Elixir v1.6. Our goal with this post is not to answer the technical questions behind StreamData but rather to explain why it is being added to the language. For more information on property testing per se, the first three chapters of Fred’s book are a great starting point. To learn more about StreamData itself, see its announcement.

Why did the core team decide to add property testing to Elixir?

There are usually two reasons why something is added to Elixir:

We need it for building Elixir itself
We believe it is an important concept/feature for the community

Property testing fits both.

For example, we had inconsistencies in Elixir’s standard library that would not exist if we had had properties when implementing those functions. In Elixir v1.1 we deprecated String.contains?/2 with an empty string as a pattern, such as String.contains?(string, ""), because we were unsure of how it should behave. Then we added it back in Elixir v1.2 because @ThomasArts showed us a property that revealed String.contains?/2 should return true for empty strings:

check all left <- string(),
          right <- string() do
  assert String.contains?(left <> right, left)
  assert String.contains?(left <> right, right)
end

Now imagine that right is "", then we get that:

  assert String.contains?(left <> "", left)
  assert String.contains?(left <> "", "")

If we had used properties since day one, we would have avoided this back and forth on the Elixir API. It then became clear to us that property-based tests would not only help us find bugs in our code but also improve the design of our APIs. They can help us and the whole community write better software.
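The empty-pattern case can be checked directly in current Elixir. A small sketch, with "foobar" as an arbitrary sample value:

```elixir
left = "foobar"

# Concatenating the empty string changes nothing...
IO.inspect(left <> "" == left)
# prints true

# ...so the property only holds if "" counts as contained.
IO.inspect(String.contains?(left <> "", ""))
# prints true
```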

Isn’t adding property testing to Elixir going to make it harder to learn?

Yes and no.

We should not expect all developers to learn property testing on their first day on the job. But, by adding it to the language, we are saying that if you want to be a proficient Elixir developer, then you should eventually learn property-based testing. We believe this is important because we strongly believe you will write better software if you have property-based testing in your toolbox.

Learning a new programming language and its ecosystem is a journey and we care a lot about this journey. We are making this journey a bit longer but the extra miles will be worth it.

We also understand there is a limited number of features we can add to the language before making the journey too long or the language too big. Adding something now means not including something else later. As an exercise, let’s see a counterexample of when we didn’t add something to the language: GenStage.

GenStage is a solution to a particular problem: interfacing with external systems. We don’t need it to build Elixir itself and we don’t believe all developers need to know GenStage unless they are facing the particular problem GenStage is meant to address. It is a tool you reach for. In fact, we even made GenStage less necessary in our day to day work by adding parallel processing of collections directly to Elixir with a single function called Task.async_stream/2.
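As a rough illustration of that single function, here is a sketch that squares a few numbers in parallel. The input range and the squaring function are arbitrary choices for the example:

```elixir
# Each element is processed in its own task; by default the results
# are emitted in the same order as the input.
squares =
  1..5
  |> Task.async_stream(fn n -> n * n end)
  |> Enum.map(fn {:ok, value} -> value end)

IO.inspect(squares)
# prints [1, 4, 9, 16, 25]
```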

Why have our own implementation of property testing instead of using an existing implementation?

The main reasons are:

Since we want to bundle it as part of Elixir, the code should be open source with an appropriate license
We wanted to add both data generation and property testing to Elixir. That’s why the library is called stream_data instead of something named after property tests. The goal is to reduce the learning curve behind property testing by exposing the data generation aspect as streams, which is a known construct to most Elixir developers. We had this approach in mind for a while and the first library we saw leveraging this in practice was @pragdave’s pollution
Finally, since the core team are taking the responsibility of maintaining property testing as part of Elixir for potentially the rest of our lives, we want to have full understanding of every single line of code. This is non-negotiable as it guarantees we can continue to consistently improve the code as we move forward
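To make the "data generation as streams" point concrete, here is a small sketch. It assumes the stream_data package, fetched with Mix.install/1 (network access required); the variable names are illustrative. Generators are enumerables, so the usual Enum functions work on them, and the values are random, so the output differs between runs.

```elixir
# Assumption: network access, so Mix.install/1 can fetch stream_data.
Mix.install([{:stream_data, "~> 1.0"}])

# A generator can be consumed like any other stream.
integers = StreamData.integer() |> Enum.take(5)

# Generators compose: short lists of non-empty alphanumeric strings.
names =
  StreamData.string(:alphanumeric, min_length: 1)
  |> StreamData.list_of(max_length: 3)
  |> Enum.take(3)

IO.inspect(integers)
IO.inspect(names)
```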

We understand rolling our own implementation has its downsides, especially since it lacks maturity compared to alternatives, but we balance it by actively seeking input from knowledgeable folks and by listening to the feedback that comes from the community, which we are very thankful for.

Finally, it is also important to add that Stream Data does not fully replace existing solutions. The first version of Stream Data provides only stateless properties. Other property testing libraries also include stateful testing. QuickCheck comes with even more advanced features such as a randomizing scheduler for the Erlang VM called Pulse which makes it great for finding race conditions in concurrent code.

Our hope is that property-based testing in Elixir also works as a stepping stone for developers looking for more complete solutions.

Your turn

I hope this initial discussion provides some insight into why stream data / property testing is being added to Elixir. It certainly was not a decision made on a whim, nor is it an attempt by the Elixir team to chase buzzwords. It has been an area of interest for a while and we are glad we are now finally able to work towards its inclusion in Elixir v1.6.

If you have questions, please let us hear them.

ThomasArts · October 19, 2017, 11:43am

Property-based testing

(Note that I call it property-based testing, rather than property testing, because it’s not the property you test, but you base your testing on properties. Moreover, that is the term we used when we invented this way of testing and I like to stick to that origin. Call it whatever you like.)

As @josevalim states in this post, there is every reason to assume that eventually you write better software if you use property-based testing. With many years of experience, I can state that it has changed my way of developing software. If I cannot think of at least a simple property that should hold for my code, then I cannot even start writing that code.

We have also experienced that it is harder for people to think in terms of general statements about their programs than to think in terms of examples. So my advice is to start with that example and then ask: why 20, why not 40 or 80? Generalize to arbitrary multiples of 20

     let n <- nat() do
          n * 20
     end

and use these random values as the basis of your tests. Start simple and extend.
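A rough StreamData counterpart of the generator above, as a sketch only. It assumes the stream_data package (fetched with Mix.install/1, network access required), and the upper bound of 1_000 is an arbitrary choice for this example, not something from the post.

```elixir
# Assumption: network access, so Mix.install/1 can fetch stream_data.
Mix.install([{:stream_data, "~> 1.0"}])

# Natural numbers up to an arbitrary bound, each mapped to a multiple of 20.
multiples_of_20 =
  StreamData.integer(0..1_000)
  |> StreamData.map(fn n -> n * 20 end)

# Values are random, so the printed list differs between runs.
samples = Enum.take(multiples_of_20, 5)
IO.inspect(samples)
```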

josevalim · October 19, 2017, 12:10pm

That’s nice to know! I do like “property-based testing” because it is a direct contrast to “example-based testing”, although I am guilty of using both terms interchangeably. We will make sure to use “property-based testing” in all documentation from now on.

JEG2 · October 19, 2017, 2:48pm

Thanks for sharing these criteria.

I assume this post is at least in part a reaction to the latest Elixir Fountain episode. When I was listening to that episode, I found myself wondering, “Why include a property-based testing library, but not a JSON parser?” (JSON parser is just a random example of a frequent need I find that I have.)

I don’t tell you this to question your judgement. I’m saying that hearing your reasoning helps people like me work out how such decisions are made. It also helps us if we’re going to lobby for the inclusion of some item in the future, because we know a little about what it would take to convince you.

That said, I do have a question related to this addition. Does this new library indicate a blessing of the core team to embrace the check all syntax “trick” for handling variable arguments? I call it a trick, because I had to play around with the code to figure out that all serves as a non-existent function call in order to “trick” Elixir into building the AST of property assignments. This surprised me a little.
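The trick can be seen without installing anything by quoting the expression and looking at the resulting AST. This is only a sketch of how the parser treats the syntax, not a description of how the library's macro is implemented; nothing inside quote is expanded, so check, all and string do not need to exist.

```elixir
ast =
  quote do
    check all left <- string(),
              right <- string() do
      assert String.contains?(left <> right, left)
    end
  end

# `check` receives two arguments: a call to `all` carrying the clauses,
# and the keyword list holding the do block.
{:check, _meta, [{:all, _all_meta, clauses}, [do: _body]]} = ast

# Every clause is a plain `<-` node with a left and a right side.
IO.inspect(length(clauses))
# prints 2
```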

I had assumed Elixir’s preferred construct for grouping expressions was a block. I guess I got this impression primarily from Kernel.SpecialForms.try/1. Although I guess Kernel.SpecialForms.with/1 is closer to this case. Anyway, I guess I had expected we would solve a problem like this with code closer to:

given do
  left <- string(),
  right <- string()
after  # I wanted `then` but I guess it's not legal:  https://hexdocs.pm/elixir/syntax-reference.html#content
  assert String.contains?(left <> right, left)
  assert String.contains?(left <> right, right)
end

Anyway, this is just a curiosity of mine. Thanks for your time!

AstonJ · October 19, 2017, 2:50pm

I love this. While people may come for the speed/scalability of Elixir, they may well stay because it not only makes it easy to follow decent coding principles, but actually encourages you to use them, helping you build reliable, resilient systems.

I know I’ve said it a million times before but this aspect of Elixir is sooooo appealing to me!

redrapids · October 19, 2017, 3:04pm

I think for this case I am in the camp of “the ends justify the means”. The reason is that if property tests are going to be pervasive, the syntax needs to be beautiful and accessible. I really like that syntax; it’s very simple and direct. The generators make sense as arguments for me.

I think explaining this feature as if it were a special form is just fine.

But I definitely appreciate the thought that these kinds of hacks should be used with extreme care.

josevalim · October 19, 2017, 3:49pm

Thanks @JEG2, those are all excellent points.

I don’t believe it is a problem to question my judgement. There is a thought process and you can disagree with the outcome of the thought process or even with the thought process itself.

I would, however, be unhappy if the main assumption is that there is little to no thought process. Part of this unhappiness is with myself, because it means I did not communicate as well as I should have, but also partially with the assumption in itself, since at this point I hope we have shown that we do better than that.

I would like to add a small footnote related to when we add things to Elixir: we also need to integrate the feature with the language “naturally”. For example, the Decimal package is extremely important for Elixir developers but we are unable to integrate it in the language in any meaningful way. If the is_number/1 guard cannot return true for decimals, adding it to the language will rather make things inconsistent. That’s why the focus on stream data is important, because that’s our integration point.

JSON, CSV and similar are necessities rather than concepts that are important for Elixir developers. URI is another necessity but one we needed in Elixir/Mix.

I am quite partial to this trick since I mentioned it at the first ElixirConf, even before 1.0, to support constructs such as stream for and parallel for in the language. The goal of check all is to mirror with and for, and that’s the best we can do without variable args.

I am not a big fan of the do/end approach because, although we use do/end blocks for grouping expressions, those expressions are typically executed sequentially unless there is a non local return. Generators would look fine inside do/end but filters would look too loose. For example:

given do
  left <- string()
  left != ""
  right <- string() 
  if right != left do
    ...
  end
after
  ...
end

AFAIK there is also nowhere in Elixir where we traverse the expressions in a block and rewrite them, which this approach would effectively require. But at the end of the day I concede most of it boils down to personal taste.

tmbb · October 19, 2017, 4:15pm

I think you’ve explained very well why stream_data is set to be included in Elixir. I think that including something like this in Elixir might discourage developers from trying different approaches (which is bad), but this disadvantage is dwarfed by the fact that you can use it to test Elixir. My only worry is that when stream_data is finally officially shipped with Elixir it might start moving “too slow”, but if anyone disagrees, they can just fork stream_data and publish their own version, so no harm done.

For example, I would like to play with Hypothesis’ bytestream-fuzzing approach and maybe release something based on that, but I support the inclusion of stream_data if it is useful to test the Elixir codebase.

OvermindDL1 · October 19, 2017, 4:18pm

Yeah this is the one part of stream_data that I’d change. I use stream_data a lot but honestly it would flow a lot better for me if I could do things like:

check do
  integer_stream <- integers() # grab an integer
  integer_stream >= 0 # skip on a boolean false
  someFuncCall(integer_stream) # fail if this raises or returns false, continue otherwise
  string_stream <- string() # grab a string
  anotherFuncCall(integer_stream, string_stream) = Integer.to_string(integer_stream) <> string_stream # fail if no match
  ...etc...other...tests...
end

Could even prefix things with assert and such too. But such multi-stage building would make some of my tests more clear as well.

Plus no hack of multi-arguments (I am really really anti-dynamic-arguments apparently, it just does not feel right on the BEAM at all, blocks are better for such things, like they would be for for and with too, but that is not this issue ^.^;)

But yes, stream_data is awesome other than that one ugly syntax quirk, I use it quite a bit now instead of the others.
Main ‘feature’ it really needs in my opinion is state testing and reduction now like some others have, that way we can test full-on processes instead of just simple functions.

It ‘could’ if the Decimal module was not a weird Struct type and instead was a tagged tuple, those can be tested entirely in-guards.

Yeah I still really really hate variable args. If you need variable args in something then pass in a list or tuple; just wrapping a macro around a pre-call or baking it into the language makes it look really out of place and unnatural. See, for example, my comp, a replacement for for:

  iex> comp do
  ...>   x <- list [1, 2, 3]
  ...>   x
  ...> end
  [1, 2, 3]
  iex> comp do
  ...>   x <- list [1, 2, 3]
  ...>   x * 2
  ...> end
  [2, 4, 6]
  iex> l = [1, 2, 3]
  iex> comp do
  ...>   x <- list [1, 2, 3]
  ...>   y <- list l
  ...>   x * y
  ...> end
  [1, 2, 3, 2, 4, 6, 3, 6, 9]

In benchmarks it also expands to code that is a lot more efficient than for (Elixir’s for is 1.34x slower on average) while supporting more types and combinations of types (combining list and binary comprehensions, for example). But the main point is that I find it a lot more readable, as it is in a block instead of having comma-droppings all over the place with very weird alignment. (The ‘type decorators’ such as list and map could be left out, but having them lets me generate more efficient code; you could always go with a generic Access instead for a slight speed hit, about on par with Elixir’s for, but still with a more readable and extendable syntax.)

But stream_data’s check all has a similar issue, it should not be a variable-arity-like call but should be a block, blocks are awesome and more readable (they would be awesome in erlang, even OCaml has blocks! ^.^).

josevalim:

I am not a big fan of the do/end approach because, although we use do/end blocks for grouping expressions, those expressions are typically executed sequentially unless there is a non local return. Generators would look fine inside do/end but filters would look too loose. For example:
given do
  left <- string()
  left != ""
  right <- string()
  if right != left do
    ...
  end
after
  ...
end

Yeah that style I would not like especially… The expressions and generators should be mixed, like how I do with my comp, as it seems to ‘flow’ down the block properly in the right order instead of relying on a variable-argument expression to do the same (when you should not normally rely on the order of argument expansion).

Eh, just my normal bits about variable-argument functions should NOT exist, even conceptually, on the BEAM (use lists). ^.^;

But still, love that stream_data is being added, hopefully it finishes up state testing features that other property testing frameworks have (never know, it could find bugs in one of the Task or Actor’s or so ^.^).

tmbb · October 19, 2017, 4:27pm

You do pretty much the same in ExSpirit (defrule), and the Expression Template-based rewrite will add variable arguments for pretty much everything (alt([p1, p2, p3]) becomes alt(p1, p2, p3), etc.). What’s the difference here?

They are mixed already… What’s the difference?

OvermindDL1 · October 19, 2017, 4:31pm

I still quite prefer it with lists, in my own code even when testing the new format I’d still use lists, I just know that normal PEG parsers would not usually take a list so it feels unnatural in those cases. ^.^

Currently check all is not mixed, the generators and filters are pre-defined then there is a body block. In some cases it can make sense to mix them like if you only need one generator sometimes unless certain conditions are met that warrant the next, that way it saves processing of always acquiring the second even when not always used.

OvermindDL1 · October 19, 2017, 4:35pm

Also a new article on why property based testing is awesome (popped up on my phone this morning): https://medium.com/@PolySync/how-i-learned-to-love-property-based-testing-62dce4fe6e8e

tmbb · October 19, 2017, 5:04pm

Currently check all is not mixed, the generators and filters are pre-defined then there is a body block. In some cases it can make sense to mix them like if you only need one generator sometimes unless certain conditions are met that warrant the next, that way it saves processing of always acquiring the second even when not always used.

I thought you were referring to the “block version” outlined above (the one that starts with given). But in Macro-land blocks and lists are basically interchangeable, so you can pick the one you like.

dimitarvp · October 19, 2017, 5:04pm

I agree with that worry. Plus language developers aren’t gods, they are motivated volunteers and they have personal lives – increasing the maintenance cost all the time would only burn them out.

@josevalim Maybe when property-based testing makes its way to Elixir, a generic contract module (behaviour / protocol) should also be included? To allow people to plug in their own (or 3rd party) implementations of property-based testing? Or maybe allow that via Mix configuration? Not sure, just throwing out an idea that might help people who aren’t satisfied with the speed of the future Elixir built-in property-based testing solution – or people who have very specific needs.

ericmj · October 19, 2017, 5:11pm

You can already run your own implementations of property-based testing. That’s what all existing libraries do today.

tmbb · October 19, 2017, 5:12pm

I don’t think you need any fancy contracts for property testing. Just write a couple of macros and you’re done.

If you don’t like macros you can even go with functions and keep basically the same API.

dimitarvp · October 19, 2017, 5:23pm

I meant pluggable 3rd party implementations that replace the future builtin version but @tmbb is completely right by saying that everybody can bridge their own through macros – and it’s pretty easy / low-effort, too.

Thank you guys.

josevalim · October 19, 2017, 6:30pm

In my experience having something in the standard library does not necessarily discourage others from trying different approaches. Case in point: just see @OvermindDL1’s post right below yours about him replacing Elixir’s comprehensions (and a thousand other things).

It may even be the opposite: more people are attracted to it, which leads to more experimentation.

Right but even if we used tuples (or if map access was allowed in guards), “overriding” is_number and the usual mathematical operators to work with decimals may negatively impact overall performance. So is_number was just an example, I believe the problem as a whole goes a bit deeper. I will be quite happy to discuss this but then please start another thread so we don’t go further off-topic on this one.

tmbb · October 19, 2017, 6:54pm

I hadn’t thought of it, and you’re probably right. I mean, I’ll experiment for sure with property-based testing even when stream_data is part of the standard library.

Nezteb · October 19, 2017, 8:41pm

You mention QuickCheck. Are Elixir’s new property-based testing capabilities intended to be an alternative to QuickCheck? I know we could still use QuickCheck regardless, but is there a reason for adding these features when QuickCheck exists?

To clarify, I love the idea of these features being added!