1. 78

  2. 32

    Agree. The types vs. tests dichotomy is quite harmful. You have to write tests when you use a statically-typed language. However, you have to write fewer of them and the consequences of errors or omissions in the test-writing process (we’re only human, after all) are generally lower.

    None of us who prefer static languages are saying “We don’t have to write tests because our tools are perfect.” Our tools are far from perfect. On the other hand, if you can have the compiler take on a significant chunk of the testing load, I fail to see how that’s not a good thing.

    1. 10

      The way I’ve started thinking about it is that in a dynamically-typed language you have to write more tests.

      It’s not that static typing reduces errors, it’s that dynamic typing makes even baseline things like ‘is this variable even in scope’ unverifiable statically.

      I have written many python tests that are simply sanity checkers. Make sure this function doesn’t crash on every run. Pylint catches some of these issues, but it still feels a bit silly.

      1. 4

        It’s not that static typing reduces errors, it’s that dynamic typing makes even baseline things like ‘is this variable even in scope’ unverifiable statically.

        Many dynamic languages provide the ability to modify the scope dynamically, but it’s not a property intrinsic to dynamic languages. ES6 JavaScript with the new module system is an example of a dynamic language where you have perfect static information about scoping.

      2. 8

        None of us who prefer static languages are saying…

        I do make the statement that once you’re using dependent types you don’t have to write “tests” for correctness reasons. You can just use values as types:

        example : 1 + 1 = 2
        example = Refl

        I think tests can still be useful, even if you’re using dependent types, but if someone is using dependent types and chooses not to write tests, I’d probably not claim they’re missing out on anything.

        1. [Comment removed by author]

          1. 6

            I don’t work with total languages with dependent-types because the cost is currently too high for the industries I work in. I know people who work in different environments (e.g. on seL4) using total languages and I understand some of their methods.

            What I do day to day is something called Fast and Loose Reasoning where I program in a total subset of the tool I’m using (usually Haskell, Scala or PureScript) and get benefits from using that total subset. I do try and write proofs instead of tests but sadly the tools I use often aren’t able to easily express the types needed to make proofs, so I do fall back to writing tests.

            So: I do avoid tests but currently I write lots of tests, because of cost. I expect this to change in the near future, once we figure out how to solve a few productivity problems (no pun intended) with total languages.

      3. 16

        I have a reference write-up on what was proven to improve software quality over a period of decades. Each of these is mandated in high-assurance security if aiming for perfection. Subsets can be used in real-world software.


        1. 3

          Wow, this is incredible. Thank you. I want to buy you a drink now.

        2. 13

          I think both the original article as well as this followup miss the point that type systems aren’t, entirely, about preventing errors- at least not in the same way that tests are about finding them. Tests should be about checking assumptions, making sure that what you wrote matches up with what you intended to write, and provide a mechanism to check that your assumptions still hold under changes to other parts of the system.

          In my opinion the value of tests is that they provide an abstraction over the type of data you can be working with. Sure, the type checker will provide errors when you make a mistake, but a large part of the value is simply in allowing or forcing you to think about your data algebraically or categorically. The benefit to code quality is that people make fewer mistakes when they have more power and flexibility for how they are expressing the code. When people are complaining about the difficulty of working with strong statically typed languages it seems that it’s often because they haven’t learned to exercise that style of thinking to the point where they see it’s benefits to expressiveness.

          1. 4

            I’m interested in this, but I’m not sure I entirely follow. I get:

            1. Type systems aren’t (entirely) about preventing errors.
            2. Tests are about finding errors, checking assumptions, maintaining invariants.
            3. Part of the value of types is allowing you to think algebraically/categorically.

            I think that (3) is a follow-on from (1). Can you elaborate? I’d love to hear more.

            1. 6

              I’m actually in the early stages of writing a book about this; I’ll try to distill my thoughts down to something makes a little more sense here. For a bit of perspective, the languages I’ve worked with most in my career are: C, Haskell, and Ruby. Much of my thinking has evolved out of comparing the general approaches that I’ve used in those three languages, and trying to generalize them to the wider practice of programming.

              To start at a very high level, I think of the Curry-Howard Isomorphism, which basically says that a type signature for a program is a formula that is proved by it’s implementation. In that sense, when we’re working with a strongly typed language, our entire program is a composition of many tiny proofs that we develop with the help of our theorem prover (in our case, the type checker). If take a category theoretical approach to program analysis we can see that each of these proofs can have several different implementations that may or may not be equivalent for our purposes. For example (*) and (+) both prove (Int -> Int -> Int) but we can’t use them interchangeably in our application.

              When we start looking at developing applications this way, it leads us to consider an approach where we start with some set of formulae that we want to implement as programs, which are themselves pure functions. Our tests then are acting much more like function-level acceptance tests than unit tests, and our focus is not on the proof of our formula but rather on demonstrating that we have selected the correct morphism.

              Moving back into the “Not about preventing errors” part of this though, taking a category-theoretical approach to software design also means that, beyond the area of proofs for our applications, we have an entirely new toolbox to use when we’re thinking about how we design our applications. I typically use the language of algebra instead of category theory because it’s often sufficient and less intimidating. If we start to look at our application as a type level formula then the process of proving it with our program becomes one of defining an algebraic structure that’s appropriate for our data domain, and then building up the set of morphisms we have over the objects in that structure, and finally using the algebra we’ve defined to prove our formula.

              I believe that the benefits of this approach extend beyond making our code less error prone, and improve the overall quality of our code under refactoring, the addition of new features, or changes to requirements. While traditional approaches to unit testing will give you the confidence to make changes without introducing regressions, the type-driving algebraic approach gives you a coherent language to work with and I think makes it much easier to write applications that are structured such that they facilitate modularity and refactoring. When you’re building an algebra it’s difficult NOT to build small orthogonal components that can be composed into different programs as your business requirements grow and change.

              1. 2

                Thank you very much!

                I can’t claim to fully appreciate your insight into types… I’ve tackled the Curry-Howard Isomorphism and Category Theory many times now, and every time my brain just fuzzes and I only get the most general gist. So I’m still waiting for the breakthrough that will really allow me to apply that whole side of things to programming. Maybe your book will be the thing!

          2. 11

            I read both blog posts, and I think there’s something that’s floating around the periphery, that I want to make explicit: bugs are a result of being unable to reason about your code. If you have a block of code that’s very simple, it’s easy to reason about- you can hold an entire model of that block of code in your head all at once, you can think about the different ways it gets invoked and what the results will be.

            Tests codify our reasoning. They’re a tool to help us reason, for when we have a complicated code block that we can’t reason effectively about, we can write tests that interact with it.

            Types aid our reasoning. They’re a way of making behavior explicit- a variable of type string will always and forever hold a string. I don’t need to think about the other things it might hold- it holds a string.

            Functions, modules, classes, structs, etc. are all tools that we use to make our code easier to reason about. These abstractions allow us to avoid some of the details, and it makes it easier for us to hold a whole model of our code in our heads at once. Haskell’s monads for hiding side effects, languages that favor immutable objects, or composability, and all of these other abstractions, they’re all ways to make understanding your code easier.

            1. 4

              Bugs are often because of problems with reasoning about interactions of different pieces of code, or of code and its environment.

              In this terms, tests work at defined interface points, they can be made more illustrative but they do not cover everything; types can work throughout code and cover all instances of some specific rule, but they have to be expressed in lower-level terms than tests most of the time.

              1. 3

                bugs are a result of being unable to reason about your code

                Depends on what you mean by “bugs”, no? Even if I can fully reason about my code, it may still be buggy if I (or my team or my prod mgr or my customer) misunderstand the desired behavior. For example:

                if invoiceAgeInDays > 30 then overdue := true

                I can fully reason about this, there are no side-effects, but it’s a bug: it’s supposed to be >= 30. A test that used an appropriate example would have demonstrated that it was wrong.

                However, even if the code was correct as demonstrated by my test, what if the state regulations were misinterpreted and it actually should have been >= 28 days (or, more likely 1 month – date math is hard)? That’s still a mistake (bug, error, whatever). No amount of types (as much as I prefer them), and tests will help when we misunderstand things.