1. 49
  1.  

  2. 13

    Easily one of the best blog entries I’ve ever read, and the title is perfect: like a mnemonic, it makes the principle easy to remember. Although the code examples are in Haskell, the content is so well written that it is approachable no matter what your background is.

    1. 5

      Is this fundamentally the same idea as “make invalid states unrepresentable”?

      i.e. don’t have a corner case of your type where you could have a value which isn’t valid (like the empty list), instead tighten your type so that every possible value of the type is valid.

      Looked at this way, any time you have a gap (where your type is too loose and allows invalid states), every consumer of a value of that type needs to check for those states again. This burden is felt everywhere the type is used.

      If you do the check to ensure you haven’t fallen into the gap, take the opportunity to tighten the type to get rid of the gap. i.e. make the invalid state unrepresentable.
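
      A minimal sketch of that idea for the empty-list case, in Python. The names `NonEmpty`, `parse_non_empty` and `first` are made up for illustration; the point is only that the check happens once and the result has no empty value to fall into.

      ```python
      from dataclasses import dataclass
      from typing import Generic, Optional, Sequence, Tuple, TypeVar

      T = TypeVar("T")


      @dataclass(frozen=True)
      class NonEmpty(Generic[T]):
          """A sequence that always has at least one element (hypothetical helper)."""
          head: T
          tail: Tuple[T, ...] = ()


      def parse_non_empty(items: Sequence[T]) -> Optional[NonEmpty[T]]:
          """Check for emptiness once and return the tighter type, or None."""
          if not items:
              return None
          return NonEmpty(items[0], tuple(items[1:]))


      def first(items: NonEmpty[T]) -> T:
          # No emptiness check needed: a NonEmpty always has a head.
          return items.head
      ```

      Code that takes a `NonEmpty` never has to ask "what if it is empty?", whereas code that takes a plain list has to ask every time.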

      All sounds good, but I wonder how practical this is? In a mainstream language, this kind of work is often done in constructors. If I have a ‘User’ class with a ‘name’ string, and my User constructor requires a non-empty name, the rest of the code in the User class can assume it is non-empty (if the only way of making a User is via the constructor).

      Is that the same thing as what we’re doing here, or is there a material difference?

      1. 4

        I think it is related to “make invalid states unrepresentable”, but goes further to explain how to use types to achieve this. The crucial part is to understand the difference between validation and parsing, as explained in the article. Applying it to your example: validating a username to be non-empty is fine as long as the constructor is the only way to construct that object, but parsing it would mean narrowing it to a different type, say a NonEmptyString. Or perhaps a Username type now makes sense. Passing that type around means you don’t need to re-validate your assumptions, because now you have indeed made invalid states unrepresentable.

        1. 4

          “parsing it would mean narrowing it to a different type, say a NonEmptyString. Or perhaps a Username type now makes sense. Passing that type around means you don’t need to re-validate your assumptions because now indeed you have made invalid states unrepresentable.”

          As far as I understand, it is not practical with Python to apply that advice. Any ideas how to do it?

          1. 3

            Throw an exception in the constructor of Username if the passed-in string doesn’t meet the restrictions for a username.
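
            A rough sketch of what that could look like with a frozen dataclass. The `Username` class and its two rules (non-blank, at most 32 characters) are assumptions picked for the example, not rules from the article.

            ```python
            from dataclasses import dataclass


            @dataclass(frozen=True)
            class Username:
                """Hypothetical wrapper; the restrictions below are made up."""
                value: str

                def __post_init__(self) -> None:
                    # Runs on every Username(...) call, so a bad value never gets wrapped.
                    if not self.value.strip():
                        raise ValueError("username must not be empty")
                    if len(self.value) > 32:
                        raise ValueError("username must be at most 32 characters")


            def greet(user: Username) -> str:
                # No re-validation here: holding a Username means the checks passed.
                return f"Hello, {user.value}!"
            ```

            Python will not stop a determined caller from bypassing this (for example via `object.__setattr__`), so the guarantee is a convention backed by the constructor rather than something the language enforces.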

            1. 2

              I was thinking today that it should be perfectly possible to apply the “parse, don’t validate” approach in Go, and I think it is a viable thing to do in Python too.

              If I understand correctly, the blog post is advocating parsing a logical representation of your domain model out of the primitive data that you get from other systems (strings, lists, JSON, etc.), and pulling all of that into a separate step that is independent of the real logic. This gets you several benefits:

              • Fewer bugs, as you do not work with primitive objects that can have different meanings and representations.
              • A central place to handle all your external data transformations that is easy to reason about.
              • No half-complete operations. Say a 10-write process completes 5 writes, and the validation before the 6th write fails. How do you handle that?

              Doing all that in a language like Haskell that has an advanced type system is great, but I think we can do it in other languages even if we don’t have that 100% guarantee that the compiler is watching our back. For example, a parsed OO structure in a dynamic language composed of classes with a custom, domain-specific API is a lot better than the usual lists-in-dicts-in-more-lists-and-tons-of-strings spaghetti.
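
              Here is a small sketch of that separation in Python. `Transfer`, `parse_transfers` and `apply_transfers` are hypothetical names, and the input shape (a list of dicts with `account` and `cents`) is an assumption for the example.

              ```python
              from dataclasses import dataclass
              from typing import Any, Dict, List


              @dataclass(frozen=True)
              class Transfer:
                  """Hypothetical domain object; field names are assumptions."""
                  account: str
                  cents: int


              def parse_transfers(raw: List[Dict[str, Any]]) -> List[Transfer]:
                  """Turn untyped input into Transfer objects, or raise before any write."""
                  parsed = []
                  for i, item in enumerate(raw):
                      account = item.get("account")
                      cents = item.get("cents")
                      if not isinstance(account, str) or not account:
                          raise ValueError(f"item {i}: account must be a non-empty string")
                      # bool is a subclass of int in Python, so exclude it explicitly.
                      if not isinstance(cents, int) or isinstance(cents, bool) or cents <= 0:
                          raise ValueError(f"item {i}: cents must be a positive integer")
                      parsed.append(Transfer(account, cents))
                  return parsed


              def apply_transfers(transfers: List[Transfer], ledger: Dict[str, int]) -> None:
                  # Everything was parsed up front, so no write is preceded by a check
                  # that could fail halfway through the batch.
                  for t in transfers:
                      ledger[t.account] = ledger.get(t.account, 0) + t.cents
              ```

              If the sixth item is malformed, `parse_transfers` raises before `apply_transfers` has touched the ledger, which is the "no half-complete operations" benefit from the list above.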

              1. 1

                I don’t have any experience with doing that in Python. You can technically subclass str, but failing in a constructor is not so nice, and adding a separate method to parse a string still leaves the door open to constructing the object with an invalid value.
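
                One possible shape for the str-subclass route, as a sketch only. `NonEmptyStr` and its `parse` method are hypothetical names; the check lives in `__new__` so the ordinary constructor cannot skip it, and `parse` is just a non-raising wrapper around that same path.

                ```python
                from typing import Optional


                class NonEmptyStr(str):
                    """Hypothetical str subclass that rejects the empty string."""

                    def __new__(cls, value: str) -> "NonEmptyStr":
                        # Runs for every NonEmptyStr(...) call, including the one in parse().
                        if not value:
                            raise ValueError("expected a non-empty string")
                        return super().__new__(cls, value)

                    @classmethod
                    def parse(cls, value: str) -> Optional["NonEmptyStr"]:
                        """Return None instead of raising, for callers that prefer a result."""
                        try:
                            return cls(value)
                        except ValueError:
                            return None
                ```

                This does not close every door: `str.__new__(NonEmptyStr, "")` still bypasses the check, and str methods such as `strip()` return a plain `str` again.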

        2. 1

          I thought this was going to be about a different, but vaguely related concept.

          When accepting untrusted input, you should not merely validate it and then accept or reject it as-is. You should instead parse and reserialize the input. This ensures that you will only have data that you wrote yourself. It is also a natural point to throw away unknown fields and simplify any odd formatting.
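
          A sketch of what I mean, using Python’s json module. The `Comment` shape and the `reserialize` helper are assumptions made up for the example.

          ```python
          import json
          from dataclasses import asdict, dataclass


          @dataclass(frozen=True)
          class Comment:
              """Hypothetical shape of the accepted input."""
              author: str
              body: str


          def parse_comment(text: str) -> Comment:
              data = json.loads(text)
              if not isinstance(data, dict):
                  raise ValueError("expected a JSON object")
              author, body = data.get("author"), data.get("body")
              if not isinstance(author, str) or not isinstance(body, str):
                  raise ValueError("author and body must be strings")
              # Only the known fields are copied; anything else in the input is dropped.
              return Comment(author.strip(), body.strip())


          def reserialize(text: str) -> str:
              # The output comes from our own serializer, not from the caller's bytes.
              return json.dumps(asdict(parse_comment(text)), sort_keys=True)
          ```

          Whatever spacing, key order or extra fields the sender used, what gets stored is the output of `json.dumps` over the two known fields.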
