Property Based Testing: Describing expected behaviour in terms of properties

March 19, 2018 by guidj

In my daily work, I often need to write data pipelines to produce metrics, create datasets for machine learning models, or just clean up logs. I used to test the traditional way, verifying that my code does what I expect by checking the normal and edge cases I could think of. But over the past couple of months, I have started using property based testing, and I feel like my code quality has improved dramatically.

In a nutshell, property based testing is about verifying code by checking that certain properties hold. But what does that mean? Let’s take a look at a quick example. Say you have a function that reverses a string:

// Reverses a string by prepending each character to an accumulator.
def reverse(msg: String): String = {
    msg.foldLeft("")((acc, next) => next.toString + acc)
}

Now, we could test this function by providing some simple cases, like jen => nej and ah => ha. While this is a trivial problem, in practice there can be hidden design choices in our code that create bugs we won’t notice or think to test for.

I won’t delve much into what property based testing is, but I highly recommend you find a good tutorial or description of it in your favorite language. What I will do is talk about the benefits by pointing out three problems I have identified in my own code, due to my own clumsiness, by using property based tests.

For one, property based tests help us test edge cases of data types by automatically generating inputs for them. For instance, for a function that takes a String as a parameter, a property test runner will execute n tests covering several kinds of strings, such as a null string, an empty string, really long strings, and single-character strings. In this case, length and being defined are the properties the tests are exercising.

Second, property based testing frameworks usually run many different tests each time, which increases the probability of finding problems we have yet to identify. In our string reverse function, for instance, a good property to test for would be: if we feed the function a non-empty string, the result should be a non-empty string of the same length. Another, slightly more complex property could be: the result should have the same character distribution as the input. By asserting these properties, we ensure the behavior of the function is consistent with our expectations.
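To make the two properties concrete, here is a minimal sketch that checks them with a hand-rolled runner built on scala.util.Random. The names genString, charCounts, sameLength, sameCharacters and holdsForAll are my own for illustration. A real framework such as ScalaCheck would take care of input generation for you and would also shrink failing inputs, which this sketch does not attempt.

```scala
import scala.util.Random

object ReverseProperties {
  // The function under test.
  def reverse(msg: String): String =
    msg.foldLeft("")((acc, next) => next.toString + acc)

  // Hypothetical generator: random printable strings of length 0 to maxLen,
  // so empty and single-character strings show up among the inputs.
  def genString(rng: Random, maxLen: Int = 50): String =
    List.fill(rng.nextInt(maxLen + 1))(rng.nextPrintableChar()).mkString

  // Character distribution: how many times each character occurs.
  def charCounts(s: String): Map[Char, Int] =
    s.groupBy(identity).map { case (c, cs) => (c, cs.length) }

  // Property 1: the output has the same length as the input.
  def sameLength(s: String): Boolean =
    reverse(s).length == s.length

  // Property 2: the output has the same character distribution as the input.
  def sameCharacters(s: String): Boolean =
    charCounts(reverse(s)) == charCounts(s)

  // Hand-rolled runner: check a property against `runs` generated inputs.
  def holdsForAll(prop: String => Boolean, runs: Int = 200, seed: Long = 42L): Boolean = {
    val rng = new Random(seed)
    (1 to runs).forall(_ => prop(genString(rng)))
  }

  def main(args: Array[String]): Unit = {
    println(s"same length: ${holdsForAll(sameLength)}")
    println(s"same characters: ${holdsForAll(sameCharacters)}")
  }
}
```

Neither property pins down the exact output on its own (a function that returns its input unchanged satisfies both), so in practice they sit alongside a few example cases or a stronger property, such as reversing twice giving back the original string.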

Case One: HyperLogLog Counter

HyperLogLog (HLL) counters are approximate data structures used for estimating the number of unique items. I was using one to count the number of unique instances of an event. But there was an issue.

Since an HLL (from Algebird) has to be initialized, I was initializing each one with an empty byte array before the count began, thinking that it would be ignored. In all of my tests, I always had at least one event to count for each HLL, and so all of my tests passed. After all, I wanted to see that the counter could count something. When I switched to a property-based test, it fed empty lists of events to the HLL, and when it computed the estimate, it returned a value above zero. This scenario was in fact realistic because I had one HLL for each fixed bucket of time, and so if there were no events in a given time bucket, it would be empty. And so, the HLL should not have been initialized with an empty byte array, but with the first observed event. Lesson learned: better to double check the implementation details of ADTs before using them. I refactored the code, and it worked nicely.
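The shape of this bug can be sketched without Algebird. Below, ToyCounter is a made-up stand-in that counts distinct payloads exactly, so it is not an HLL and none of its methods are Algebird's API. Because it is exact, the seeded version overcounts on every input, whereas with an approximate counter the extra item was easy to miss. The part that carries over is the property: an empty bucket should produce an estimate of zero.

```scala
object EmptyBucketSketch {
  // Toy stand-in for an HLL: it tracks distinct payloads exactly.
  // This is NOT Algebird's API; it only mirrors the shape of the bug.
  final case class ToyCounter(seen: Set[Vector[Byte]]) {
    def add(event: Array[Byte]): ToyCounter = ToyCounter(seen + event.toVector)
    def estimate: Int = seen.size
  }

  // Buggy version: seeds the counter with an empty byte array,
  // which then counts as one observed item.
  def countSeeded(events: List[Array[Byte]]): Int = {
    val seed = ToyCounter(Set(Vector.empty[Byte]))
    events.foldLeft(seed)((c, e) => c.add(e)).estimate
  }

  // Fixed version: the counter only exists once the first event is observed.
  def countFromFirst(events: List[Array[Byte]]): Int =
    events
      .foldLeft(Option.empty[ToyCounter]) {
        case (None, e)    => Some(ToyCounter(Set(e.toVector)))
        case (Some(c), e) => Some(c.add(e))
      }
      .map(_.estimate)
      .getOrElse(0)

  // Property: the count equals the number of distinct events,
  // and that includes the empty list of events.
  def matchesDistinct(count: List[Array[Byte]] => Int)(events: List[Array[Byte]]): Boolean =
    count(events) == events.map(_.toVector).distinct.size
}
```

Feeding Nil to countSeeded gives 1, while countFromFirst gives 0, which is the behaviour I actually wanted for a time bucket with no events.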

Case Two: Reduce

The second problem was related to the first. When merging the HLLs, I fed an iterable of them to a reduce function. Technically, reduce can only be performed on non-empty sequences. The issue was once again caught by the property based test, which fed empty sequences to the reduce operator and made it fail. Consider, for instance, two HLLs from two time buckets with no events being merged together. Since both were empty, they would be undefined and the iterable would be empty. This meant something had to change to reflect this reality. Either the reduce operation would have to return an Option[ReducedADT], or the input type would have to become a strictly non-empty sequence. I went with the former, but the latter would have worked just as well. Since Scala has a rich type system, expressing these constructs can be fairly easy.
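Both options can be sketched with the standard library alone. In the snippet below, Counter is a hypothetical stand-in (a plain Set merged by union) rather than a real HLL, and NonEmpty is an illustrative type I made up to show the second option. The reduce and reduceOption methods are the standard Scala collection operations.

```scala
object MergeSketch {
  // Hypothetical stand-in for a mergeable counter; sets merge by union.
  type Counter = Set[String]

  // Unsafe: `reduce` throws UnsupportedOperationException on an empty collection.
  def mergeUnsafe(counters: Iterable[Counter]): Counter =
    counters.reduce(_ ++ _)

  // Option one: `reduceOption` returns None when there is nothing to merge.
  def mergeAll(counters: Iterable[Counter]): Option[Counter] =
    counters.reduceOption(_ ++ _)

  // Option two: an illustrative non-empty input type, which makes the
  // empty case impossible to construct in the first place.
  final case class NonEmpty[A](head: A, tail: List[A]) {
    def reduce(f: (A, A) => A): A = tail.foldLeft(head)(f)
  }
}
```

With the first option the caller has to decide what an absent result means, for example a count of zero. With the second, the decision moves to whoever builds the input.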

Case Three: Path parser

I had written a parser that extracts a path from an input string. I was looking for two elements in it: (1) a prefix that described a container and (2) an object inside the container. I had naively assumed the presence of a prefix would always imply a non-empty container string. Needless to say, I was wrong. Parsing both as optional types, I was getting values of Some("") because the prefix wasn't followed by a container name. The test quickly identified this, and I had to make changes. Code corrected, functions behaving as expected.
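Here is a rough sketch of the kind of fix involved. The path shape, the "store://" prefix and the names Parsed, parse and noEmptySome are all assumptions made up for illustration, not the actual format or code I was working with. The idea is simply that a defined field should never hold an empty string.

```scala
object PathSketch {
  // Hypothetical path shape: "<prefix><container>/<object>",
  // e.g. "store://logs/2018/03.gz". The prefix is an assumption.
  val Prefix = "store://"

  final case class Parsed(container: Option[String], obj: Option[String])

  def parse(input: String): Parsed =
    if (!input.startsWith(Prefix)) Parsed(None, None)
    else {
      val rest = input.drop(Prefix.length)
      // Everything up to the first '/' is the container, the rest is the object.
      val (container, tail) = rest.span(_ != '/')
      // Filtering on nonEmpty turns Some("") into None.
      Parsed(
        Some(container).filter(_.nonEmpty),
        Some(tail.drop(1)).filter(_.nonEmpty)
      )
    }

  // Property: whatever the input, a defined field is never the empty string.
  def noEmptySome(input: String): Boolean = {
    val p = parse(input)
    !p.container.contains("") && !p.obj.contains("")
  }
}
```

A property test would run noEmptySome against generated strings, including ones that consist of the prefix and nothing else, which is the input my hand-written tests never covered.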

There are other problems I found in my code by using property based tests, mostly due to oversights here and there on my part. These tests helped me think about what goes into my functions and what comes out. I don’t find them useful for every single use case, but where they apply, they can bring immense value. They allow me to iterate through code more confidently by describing what it should do in terms of its properties. Try them out if you haven’t, and see for yourself!