How Best to Use TDD for Random Generation

There is some code that I’m still not good at building out with a process of Test-driven Development. One of those challenges is random content generators.

For example, I used to play a great BBS game called TradeWars. A lot of what made it fun was the way it generated interesting maps. Here’s an image of one:

Some neat features of what has been generated are:

  • The dense cluster of interconnected sectors 1-7
  • Rings like 52, 66, 76, 77, 67, and 53
  • Choke points like 153

I might choose to force some features into every map, like the dense center. Then it makes sense to test for them. But what about things like chokepoints and rings? It’s possible that they wouldn’t randomly form. I guess I could randomly generate maps up to some selected cutoff, until I either find one or give up, but that feels like a super annoying Selenium test that just times out sometimes due to dumb luck.

I could lock a random seed, but that feels like it’s not exercising the generator very much. What if there are some random events that cause it not to reach the configured number of desired sectors? I wouldn’t find bugs like that I only ever checked known good inputs.

My instincts are to build a visualizer and just look at what it spits out until I have what I like. However, I’m really interested in challenging myself to see if I could use TDD to construct the code without this cheat.

Any thoughts are appreciated. Thanks in advance!

3 Likes

Not an expert, but I think this might get easier if you switch from “a map has a 1/N chance of having X” to “at least 1 map out of N will have X”. So you’d generate a sequence of expected predicates (1 * “has X” + (N - 1) * “does not have X”) and then do a random shuffle. You’d need some state to keep track of when you need to generate a new sequence.

1 Like

The starting point of TDD is to write a test that fails. Then you fix it. Then you make the code nicer.

So what would be your failing test in this case? To generate a map and not see one of the random features you’re expecting? However, if they’re generated at random, there’s no way to know that they should appear when you run the test.

The only simple way I can think of to fix your expectations is to mock your random number generator. Think of a super-simplified example where you have a function that has to return “A” 10% of the times, and “B” 90% of the times. This function will generate a random number between 0 and 1 and produce “A” if the number is less than 0.1, “B” otherwise. By mocking the random number generator and forcing it to return (for example) 0.09 you can fix the expectations and write a failing test, which you then can fix.

You’re not stressing the random number generator in this way, true. You’re assuming it will work, but your whole application is built on that assumption, so it’s fine.

The right testing approach is probably finer-grained, focused on testing the pieces that work together to generate a whole map. For instance, that map looks like it’s generated with a Monte Carlo process similar to crystal growth - something like:

  • start with a few sectors connected together as a “seed”. That might be where the strongly-connected “union” cluster comes from
  • try adding a new sector in a spot that’s not already occupied, connecting it randomly to the other ones.
  • evaluate the “energy” of the new configuration
  • using the “energy”, decide whether or not to keep the added sector. In real-world crystallization, this corresponds to an atom/molecule sticking to the crystal or bouncing off
  • keep trying sectors until the map is full enough
  • advanced version: the decision based on “energy” typically has a parameter equivalent to “temperature” that affects the probabilities (higher “temperature” → higher “energy” is more likely), so it’s possible to simulate the equivalent of forming crystals as a liquid cools

Most of the above steps are deterministic, other than picking the random location and making the final decision to keep or reject the addition. In particular, the “energy” function is a pure function of the configuration - and it’s where most of the complexity lives. In the case of physical crystals, it’s where all the science lives!

One way to ensure larger structures are generated in this kind of model would be to consider bigger “chunks” as individual parts to be randomly added - compare crystallizing a protein versus a pure element.

A newer method (that TradeWars likely doesn’t use because it wasn’t invented yet) is called “wavefunction collapse”:

I found this paper which recommends a three pronged approach:

  1. Write deterministic assertions about the output
  2. Repeat tests over a lot of seeds (similar to property tests)
  3. Save and rerun previously failed seeds to prevent regressions

2 and 3 are pretty clever. I like how they lean into the randomness and just try to cover a bigger portion of possibilities with the tests.

1 is basically the exercise left to the implementor. I do like the idea of writing tests for all the things I know to be true. It doesn’t allow for things that might be present, of course. Maybe there’s a lesson there though. If you really want something in there, the algorithm should probably enforce it.