1. 9

  2. 14

    A property test may assert that, for arbitrary input, the code under test won’t crash, throw exceptions, have assertions fail, etc. This is the most fuzzer-like (fuzziest?) side of property testing.

    A property test might instead assert that, for some arbitrary series of calls to a stateful API, the return values along the way make sense and the end state is reasonable. It might check that, for any arbitrary ordering, a sequence of interactions in a distributed system won’t violate any invariants. This looks less like fuzzing, more like a model checker. Failures found by these tests usually have a much clearer narrative than those found by a fuzzer.
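
    A minimal sketch of that stateful style, using only Python's standard library rather than any real property-testing framework. The `BoundedStack` class and the `run_random_calls` helper are hypothetical names made up for illustration; the model is just a plain list.

```python
import random


class BoundedStack:
    """Hypothetical system under test: a stack holding at most `capacity` items."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = []

    def push(self, x):
        if len(self._items) >= self.capacity:
            return False
        self._items.append(x)
        return True

    def pop(self):
        return self._items.pop() if self._items else None

    def size(self):
        return len(self._items)


def run_random_calls(seed, steps=50, capacity=3):
    """Drive the stack with a random call sequence and compare it to a list model."""
    rng = random.Random(seed)
    stack, model = BoundedStack(capacity), []
    for _ in range(steps):
        if rng.random() < 0.6:
            x = rng.randint(0, 9)
            expected = len(model) < capacity
            if expected:
                model.append(x)
            # The return value along the way must match the model's prediction.
            assert stack.push(x) == expected
        else:
            expected = model.pop() if model else None
            assert stack.pop() == expected
        # Invariants that must hold after every single step.
        assert 0 <= stack.size() <= capacity
        assert stack.size() == len(model)
    return True


# Try many independent random call sequences.
for seed in range(200):
    run_random_calls(seed)
```

    A real framework would also shrink a failing call sequence down to the shortest one that still fails, which is where the clearer narrative comes from.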

    Property-based testing overlaps with fuzzing, sure, but it’s not the same thing – each has different trade-offs and emphasizes different things. Property-based testing usually needs a bit more test-specific setup, but is usually better at shrinking bugs to minimal counter-examples (since input generation is better defined), and can focus generated input more on specific parts of the state space.

    Model checkers often have total coverage, and can test an abstract design rather than an implementation. Fuzzers usually make few or no assumptions about the input (and may be entirely directed by heuristics and coverage data). Property-based testing can fluidly switch styles per test. They’re all valuable, and understanding how they differ helps to apply each more effectively.

    1. 2

      “It might check that, for any arbitrary ordering, a sequence of interactions in a distributed system won’t violate any invariants. This looks less like fuzzing, more like a model checker.”

      If your language supports contracts, you can convert any invariant error into a crash so your fuzzer now does PBT, too! They’re definitely different things that have different purposes, but they share enough in common that both can learn a few tricks from each other.
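
      A rough sketch of that trick. Python has no built-in contracts, so a plain `assert` stands in for a postcondition here; `clamp_percent` (with its deliberate bug) and the `fuzz` loop are hypothetical, and the loop is an unguided toy rather than a real fuzzer.

```python
import random


def clamp_percent(n):
    """Hypothetical function under test, meant to clamp n into 0..100."""
    result = min(n, 100)  # deliberate bug: negative values are never raised to 0
    # Contract-style postcondition. Note that asserts are stripped under `python -O`.
    assert 0 <= result <= 100, f"postcondition violated for n={n}"
    return result


def fuzz(target, seed, runs=1000):
    """Feed random bytes to `target`; the only thing it can detect is a crash."""
    rng = random.Random(seed)
    for _ in range(runs):
        raw = bytes(rng.randrange(256) for _ in range(2))
        n = int.from_bytes(raw, "big", signed=True)
        try:
            target(n)
        except Exception as exc:
            # The broken invariant surfaces as an ordinary crash.
            return n, exc
    return None


crash = fuzz(clamp_percent, seed=0)
```

      The fuzzer knows nothing about percentages; it only sees an `AssertionError`, which is what lets the contract do the property's job.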

      1. 8

        Not necessarily any invariant, even though it could cover many.

        For example, a property test can use a model-based approach where you model the values of state machines on 3 different computers as a local data structure and take guided-random steps through operations. The model you use to validate the system’s output can take into account the perceived state of the entire system, as an external observer would.

        For example, suppose you have 2 books for sale and sell both. If the model considers 2 of the 3 nodes to hold one book each, but the local result from one of those nodes says it doesn’t have it (or another node has too many), then the local invariants of each machine may individually be respected while the system-wide one is broken.
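
        A toy version of the book scenario, sketched with the standard library only. `Node`, `Cluster`, and `check_run` are hypothetical names, and the bug (a transfer that copies stock instead of moving it) is planted on purpose so that every node's local invariant keeps holding while the global count drifts away from the model.

```python
import random


class Node:
    def __init__(self):
        self.stock = 0

    def local_invariant(self):
        return self.stock >= 0


class Cluster:
    """Hypothetical 3-node shop with a deliberate replication bug in transfer()."""

    def __init__(self):
        self.nodes = [Node(), Node(), Node()]

    def add_book(self, i):
        self.nodes[i].stock += 1

    def sell(self, i):
        if self.nodes[i].stock > 0:
            self.nodes[i].stock -= 1
            return True
        return False

    def transfer(self, src, dst):
        if self.nodes[src].stock > 0:
            self.nodes[dst].stock += 1  # bug: src is never decremented


def check_run(seed, steps=30):
    """Return a description of the first system-wide violation, or None."""
    rng = random.Random(seed)
    cluster, model = Cluster(), [0, 0, 0]
    for i in (0, 1):  # two books, held by two of the three nodes
        cluster.add_book(i)
        model[i] += 1
    for step in range(steps):
        if rng.choice(["sell", "transfer"]) == "sell":
            i = rng.randrange(3)
            expected = model[i] > 0
            if expected:
                model[i] -= 1
            if cluster.sell(i) != expected:
                return f"step {step}: node {i} disagrees with the model about a sale"
        else:
            src, dst = rng.sample(range(3), 2)
            if model[src] > 0:
                model[src] -= 1
                model[dst] += 1
            cluster.transfer(src, dst)
        # Each machine's own invariant is still respected...
        assert all(node.local_invariant() for node in cluster.nodes)
        # ...but only the external observer can check the global one.
        total = sum(node.stock for node in cluster.nodes)
        if total != sum(model):
            return f"step {step}: system holds {total} books, model expects {sum(model)}"
    return None


failures = [msg for msg in map(check_run, range(20)) if msg is not None]
```

        The local `assert` never fires in this sketch; every failure is reported by the comparison against the model, which is the part a per-function contract cannot see.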

        Property-based testing could detect such a case, but unless your language’s design by contract can handle such a global view, you’ll have trouble dealing with it. You’d get closer to it if what you fuzzed was a program that did model checking of the system under test, but then you’re making quite specific harnesses and getting pretty close to what property-based testing does already. I’d say you’re not necessarily using the tool the way it was intended, whereas that use case is a fairly direct one in Erlang’s QuickCheck (and related open source/free implementations).

    2. 6

      I was describing PBT to friends and coworkers as “contracts plus fuzzing” until I realized that most of them weren’t familiar with contracts or fuzzing.

      1. 4

        My recollection is that QuickCheck has code to generate minimal test cases from the input that goes awry, which is a cool feature compared to simply throwing random data at a function.

        1. 3

          QuickCheck generates from the types of the inputs to a function. Fuzzing is for anything that takes user input… so I guess it’s really any function taking that string of bytes.

          1. 5

            Ok, but once QuickCheck finds a problem, it tries to generate an example that’s as small as possible, which is kind of cool.
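
            To make the shrinking idea concrete, here is a hand-rolled sketch in plain Python. It is a simple greedy shrinker written for illustration, not QuickCheck's actual algorithm, and every name in it (`prop_all_small`, `shrink_candidates`, `shrink`, `find_counterexample`) is hypothetical. The property is deliberately false so there is something to shrink.

```python
import random


def prop_all_small(xs):
    """Deliberately false property: claims every element is below 100."""
    return all(x < 100 for x in xs)


def shrink_candidates(xs):
    """Yield simpler lists: drop one element, or move one element toward zero."""
    for i in range(len(xs)):
        yield xs[:i] + xs[i + 1:]
    for i, x in enumerate(xs):
        for smaller in (0, x // 2, x - 1):
            if 0 <= smaller < x:
                yield xs[:i] + [smaller] + xs[i + 1:]


def shrink(xs, prop):
    """Greedily swap a failing input for a simpler failing one until none is left."""
    improved = True
    while improved:
        improved = False
        for candidate in shrink_candidates(xs):
            if not prop(candidate):
                xs, improved = candidate, True
                break
    return xs


def find_counterexample(prop, seed, runs=100):
    """Return (original failing input, shrunk failing input), or None if all pass."""
    rng = random.Random(seed)
    for _ in range(runs):
        xs = [rng.randint(0, 1000) for _ in range(rng.randint(0, 10))]
        if not prop(xs):
            return xs, shrink(xs, prop)
    return None


result = find_counterexample(prop_all_small, seed=1)
```

            For this particular property the shrinker can only stop at `[100]`: any extra element can be dropped, and any larger value can be decremented while still failing. The random input that first failed is usually far noisier than that.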


            1. 5

              Haskell’s QuickCheck does that. The Erlang variants use combinators that you can write and compose, letting you guide the distribution of inputs you want rather than just deriving them from types.

              You can then, for example, decide that rather than sending any string, you’re going to take strings that contain 20% emoji, 5% ASCII, 10% sequences that include combining characters, and the rest made up of line breaks, escape sequences, and quotes.
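
              A sketch of what such a weighted generator could look like, written with Python's standard `random.choices` rather than the Erlang combinators. The names `CLASSES`, `pick_chunks`, and `nasty_string` and the sample characters are all assumptions for illustration. The weights apply per chunk, so a combining sequence counts once even though it spans several code points.

```python
import random

# (weight, pool) per input class; the weights mirror the percentages above.
CLASSES = {
    "emoji": (20, ["\U0001F600", "\U0001F680", "\U0001F389"]),
    "ascii": (5, [chr(c) for c in range(0x20, 0x7F)]),
    "combining": (10, ["e\u0301", "n\u0303", "a\u0308\u0323"]),
    # "The rest": line breaks, escape sequences, and quotes.
    "awkward": (65, ["\n", "\r\n", "\\n", "\\t", "\\", "'", '"']),
}


def pick_chunks(rng, n):
    """Pick n (label, text) chunks, choosing the class by weight."""
    labels = list(CLASSES)
    weights = [CLASSES[label][0] for label in labels]
    picked = rng.choices(labels, weights=weights, k=n)
    return [(label, rng.choice(CLASSES[label][1])) for label in picked]


def nasty_string(rng, max_chunks=20):
    """Build one test string out of 0..max_chunks weighted chunks."""
    chunks = pick_chunks(rng, rng.randint(0, max_chunks))
    return "".join(text for _, text in chunks)


rng = random.Random(0)
samples = [nasty_string(rng) for _ in range(5)]
```

              Because the generator is ordinary code, adjusting the mix for one test is a one-line change.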

              This turns out to give you an approach that, while definitely reminiscent of fuzzing, sits a bit closer to regular tests in terms of how you approach system design (you can even TDD with properties), whereas I’m more familiar with traditional fuzzers being used as a means of validation.

              1. 3

                “QuickCheck generates from the types of the inputs to a function”

                QuickCheck can generate input based on the types: it has a typeclass called Arbitrary, which provides an arbitrary function that we can think of as a “generic” random generator for those types which implement it (this typeclass is also where shrink is defined).

                We can also write completely standalone generators when we want something more specific, like evenGen :: Gen Int which only generates even Ints, and we can use these in properties using the forAll function, e.g. forAll evenGen myProperty.
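
                For readers who don't know Haskell, here is a loose Python analogue of a standalone generator passed explicitly to a property runner. It is not QuickCheck's API: `even_gen`, `any_int_gen`, `for_all`, and the sample property are hypothetical names that only mirror the shape of `forAll evenGen myProperty`.

```python
import random


def even_gen(rng):
    """Standalone generator that only ever produces even integers."""
    return 2 * rng.randint(-1000, 1000)


def any_int_gen(rng):
    """Generic integer generator, standing in for a type-driven default."""
    return rng.randint(-1000, 1000)


def for_all(gen, prop, runs=100, seed=0):
    """Check prop on `runs` generated values; return a counterexample or None."""
    rng = random.Random(seed)
    for _ in range(runs):
        value = gen(rng)
        if not prop(value):
            return value
    return None


def prop_halving_is_lossless(n):
    """True only for even n: halving and doubling gets the original back."""
    return (n // 2) * 2 == n


with_even_gen = for_all(even_gen, prop_halving_is_lossless)    # None: no failure
with_any_ints = for_all(any_int_gen, prop_halving_is_lossless)  # some odd number
```

                The property only makes sense for even numbers, so the choice of generator decides whether a failure is a real bug or just an input the property was never meant to cover.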

                There are two other things to consider as well:

                • Properties can have preconditions, which are implemented using rejection sampling. For example myProperty n = isEven n ==> foo will only evaluate foo if isEven n is True. If we generate an n which isn’t even, the test is skipped. If too many tests get skipped, QuickCheck tells us. We could achieve a similar thing with boolean logic, e.g. myProperty n = not (isEven n) || foo, but in this case we’re replacing skips with passes, which might give us false confidence in the results (e.g. we might get 100% of tests passing, but never actually generate an input which passes the precondition).

                • We can use newtype to give a different name and Arbitrary instance to existing types. QuickCheck comes with a NonEmpty wrapper for lists, a NonZero wrapper for numbers, etc. The important difference between using a newtype and using a standalone generator (like evenGen) is that we can ensure some invariant when shrinking: e.g. shrinking an even number shouldn’t give us odd numbers.
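
                The difference between skipping and vacuously passing can be shown with a small Python sketch. This is only an analogue of the idea: `DISCARD`, `implies`, and `check` are hypothetical, and QuickCheck's real bookkeeping (how many discards it tolerates before giving up) is not modelled. The generator is deliberately bad and never satisfies the precondition.

```python
import random

DISCARD = object()  # sentinel: precondition not met, so the case is skipped


def implies(precondition, conclusion):
    """Rough analogue of ==>: only evaluate conclusion() if the precondition holds."""
    return conclusion() if precondition else DISCARD


def check(prop, gen, runs=100, seed=0):
    """Run prop on generated inputs, counting passes, discards, and failures."""
    rng = random.Random(seed)
    counts = {"passed": 0, "discarded": 0, "failed": 0}
    for _ in range(runs):
        result = prop(gen(rng))
        if result is DISCARD:
            counts["discarded"] += 1
        elif result:
            counts["passed"] += 1
        else:
            counts["failed"] += 1
    return counts


def odd_gen(rng):
    """Badly chosen generator: it never produces an even number."""
    return 2 * rng.randint(0, 1000) + 1


def is_even(n):
    return n % 2 == 0


def foo(n):
    """Placeholder for the real property body."""
    return (n // 2) * 2 == n


def with_precondition(n):
    return implies(is_even(n), lambda: foo(n))


def with_boolean_or(n):
    return (not is_even(n)) or foo(n)


skipped = check(with_precondition, odd_gen)
# skipped == {"passed": 0, "discarded": 100, "failed": 0}
vacuous = check(with_boolean_or, odd_gen)
# vacuous == {"passed": 100, "discarded": 0, "failed": 0}
```

                Both runs exercised `foo` exactly zero times, but only the first report makes that visible.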

              2. 3

                So do many coverage-guided fuzzers, e.g. afl-tmin. Doesn’t really have much to do with the way the fault was discovered in the first place.

              3. 3

                You can do property-based testing with manual values. That’s how it started: with tests that were manually derived from formal specifications. Fuzzing, by contrast, was traditionally about throwing random data at software to see if it breaks. The recent tools overlap in using both properties and randomness. So, the people doing fuzzing are currently calling guided things fuzzing when it was previously unguided. If anything, the definitions seem to be changing. Now, with new definitions, property-based testing might be a subset of fuzzing or use fuzzing in combination with its spec/property-oriented focus.

                Of course, with all the siloing of knowledge, I might not have encountered the original definitions of fuzzing and be mistaken. It was always random in the stuff I read early on, though. They mostly focused on crashes, too, versus specific properties being violated. That was also beneficial given it found mistakes that a focus on known properties missed. So, I always thought of property-based testing and true (random) fuzzing as complementary.