1. 7

  2. 1

    The example of introducing Normal and Overflow types, with associated Arbitrary instances, etc. really made me cringe. It looks like it congealed in multiple stages, with nobody stepping back afterwards to ask “is this still a sensible way to be doing it?”; it could also have come from dogma (e.g. “everything should be a type”). Keep in mind that we can use forAll to specify the generator for a particular property (and forAllShrink to specify the shrinker as well), so we don’t need a custom implementation of Arbitrary; hence we don’t need a custom datatype for each test.
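
    As a rough sketch of what I mean (not the code from the talk): saturatingAdd, genNormal, genNearMax and shrinkNearMax below are hypothetical names I made up for illustration; only forAll, forAllShrink, choose, shrink and === are real QuickCheck functions. Each property picks its own generator, so no wrapper types or Arbitrary instances are needed.

    ```haskell
    import Test.QuickCheck

    -- Hypothetical function under test: addition that clamps at the
    -- bounds of Int instead of wrapping around.
    saturatingAdd :: Int -> Int -> Int
    saturatingAdd x y
      | y > 0 && x > maxBound - y = maxBound
      | y < 0 && x < minBound - y = minBound
      | otherwise                 = x + y

    -- Plain generators instead of a "Normal" and an "Overflow" newtype.
    genNormal :: Gen Int
    genNormal = choose (-1000, 1000)

    genNearMax :: Gen Int
    genNearMax = choose (maxBound - 1000, maxBound)

    -- A shrinker that stays inside the "near maxBound" range by
    -- shrinking the distance from maxBound rather than the value itself.
    shrinkNearMax :: Int -> [Int]
    shrinkNearMax x = [maxBound - d | d <- shrink (maxBound - x)]

    -- Small inputs never overflow, so we should agree with (+).
    prop_normalMatchesPlus :: Property
    prop_normalMatchesPlus =
      forAll genNormal $ \x ->
      forAll genNormal $ \y ->
        saturatingAdd x y === x + y

    -- Here x + y always exceeds maxBound, so the result must clamp.
    prop_overflowClamps :: Property
    prop_overflowClamps =
      forAllShrink genNearMax shrinkNearMax $ \x ->
      forAll (choose (1001, 2000)) $ \y ->
        saturatingAdd x y === maxBound
    ```

    Both can be run with quickCheck as usual; the generator choice lives next to the property that needs it, rather than in a type.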

    I’m not convinced that “one property to rule them all” is a particularly good idea. One of the nice things about testing is that it can be decomposed into small, easily understood chunks; indeed that’s one of the main reasons testing is so effective (implementations have to solve all requirements at once, which is hard; tests can check each one separately, which is easier).

    I remember being asked for advice about how someone could use QuickCheck for a unification function they’d implemented. They were struggling to think of a property which would capture the essence of unification. I told them to instead try decomposing it into a set of properties, each of which is necessary for correctness, but which don’t have to be sufficient. With that mental block removed, we could rattle off a whole bunch of properties in a few minutes:

    • Commutativity (unify x y == unify y x)
    • Free variables as left and right identities (unify (Var x) y == Just y and unify x (Var y) == Just x)
    • Reflexivity (unify x x == Just x)
    • Unification of constants as equality (isJust (unify (Const x) (Const y)) == (x == y))
    • Unification of subexpressions (isNothing (unify l1 l2) || isNothing (unify r1 r2) || isJust (unify (App l1 r1) (App l2 r2)))
    • etc.
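
    For concreteness, here is roughly how those could be written down. This is a sketch under assumptions: the Term type, the Unifier signature and genTerm are my own stand-ins (I don’t have the original code), and each property takes the unifier as an argument so that nothing here depends on a particular implementation. Depending on how a unifier names variables and tracks bindings, some of these may only hold up to renaming, or need weakening.

    ```haskell
    import Data.Maybe (isJust, isNothing)
    import Test.QuickCheck

    -- Assumed term representation; the real one may differ.
    data Term = Var String | Const Int | App Term Term
      deriving (Eq, Show)

    -- Assumed shape of the function under test.
    type Unifier = Term -> Term -> Maybe Term

    genName :: Gen String
    genName = elements ["x", "y", "z"]

    -- Small random terms; the size parameter bounds the nesting depth.
    genTerm :: Gen Term
    genTerm = sized go
      where
        go n
          | n <= 0    = oneof leaves
          | otherwise = oneof (leaves ++ [App <$> go (n `div` 2) <*> go (n `div` 2)])
        leaves = [Var <$> genName, Const <$> choose (0, 3)]

    prop_commutative :: Unifier -> Property
    prop_commutative unify =
      forAll genTerm $ \x -> forAll genTerm $ \y ->
        unify x y === unify y x

    prop_leftIdentity :: Unifier -> Property
    prop_leftIdentity unify =
      forAll genName $ \x -> forAll genTerm $ \y ->
        unify (Var x) y === Just y

    prop_rightIdentity :: Unifier -> Property
    prop_rightIdentity unify =
      forAll genTerm $ \x -> forAll genName $ \y ->
        unify x (Var y) === Just x

    prop_reflexive :: Unifier -> Property
    prop_reflexive unify =
      forAll genTerm $ \x -> unify x x === Just x

    prop_constants :: Unifier -> Property
    prop_constants unify =
      forAll (choose (0, 3)) $ \x -> forAll (choose (0, 3)) $ \y ->
        isJust (unify (Const x) (Const y)) === (x == y)

    prop_subexpressions :: Unifier -> Property
    prop_subexpressions unify =
      forAll genTerm $ \l1 -> forAll genTerm $ \l2 ->
      forAll genTerm $ \r1 -> forAll genTerm $ \r2 ->
        isNothing (unify l1 l2)
          || isNothing (unify r1 r2)
          || isJust (unify (App l1 r1) (App l2 r2))
    ```

    Usage would be along the lines of quickCheck (prop_reflexive myUnify), where myUnify is whatever implementation is being tested.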

    Hughes is right that it might become difficult to reason about coverage when we’re dealing with multiple properties, but it’s also important to identify what we consider “good enough”. I assume that developers (including me) are lazy but not malicious, so I write tests to prevent easy-but-wrong implementations; they’re “good enough” when a correct implementation is also the most obvious, lazy way to pass the tests. I don’t worry too much about complicated ways that the tests might be subverted, if that would be harder than just doing it correctly. I’ve found this level of reasoning to be pretty manageable.

    I’ve known about QuickCheck labels for a while, but haven’t really used them. I found Hughes’s argument convincing, so I might try using coverage conditions the next time I’m writing QuickCheck tests :)
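
    For anyone else who hasn’t tried them, a minimal sketch of labels and coverage conditions, assuming QuickCheck 2.12 or later (where cover takes a percentage as a Double and checkCoverage exists). The property, the bucket helper and the thresholds are my own illustration, not from the talk.

    ```haskell
    import Data.List (insert, sort)
    import Test.QuickCheck

    -- Hypothetical helper: coarse buckets so the label table stays short.
    bucket :: Int -> String
    bucket n
      | n == 0    = "0"
      | n < 10    = "1-9"
      | otherwise = "10+"

    -- Inserting into a sorted list should give the same result as
    -- sorting the list with the extra element added.
    prop_insertKeepsSorted :: Int -> [Int] -> Property
    prop_insertKeepsSorted x xs =
      -- Ask for at least 1% of cases with an empty list...
      cover 1 (null xs) "empty list" $
      -- ...and at least 5% where the element is already in the list.
      cover 5 (x `elem` xs) "element already present" $
      -- Report the distribution of input lengths alongside the result.
      label ("length " ++ bucket (length xs)) $
        insert x (sort xs) === sort (x : xs)
    ```

    Running quickCheck prop_insertKeepsSorted prints the label distribution and only warns about insufficient coverage; wrapping the property as quickCheck (checkCoverage prop_insertKeepsSorted) turns an unmet coverage condition into a test failure.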

    1. 1

      This was an awesome talk!