1. 16
    1. 5

      Much can be said on this topic. First, an excerpt from Test-Driven Development By Example:

      My first experience of automated tests was having a set of long-running, overnight, GUI-based tests (you know, record the keystrokes and mouse events and play them back) for a debugger I was working on. (Hi Jothy, hi John!) Every morning when I came in, there would be a neat stack of paper on my chair describing last night’s test runs. (Hi Al!) On good days there would be a single sheet summarizing that nothing broke. On bad days there would be many, many sheets, one for each broken test. I began to dread days when I saw a pile of paper on my chair. I took two lessons from this experience. First, make the tests so fast to run that I can run them myself, and run them often. That way I can catch errors before anyone else sees them, and I don’t have to dread coming in in the morning. Second, I noticed after a while that a huge stack of paper didn’t usually mean a huge list of problems. More often it meant that one test had broken early, leaving the system in an unpredictable state for the next test. We tried to get around this problem by starting and stopping the system between each test, but it took too long, which taught me another lesson about seeking tests at a smaller scale than the whole application. But the main lesson I took was that tests should be able to ignore one another completely. If I had one test broken, I wanted one problem. If I had two tests broken, I wanted two problems. One convenient implication of isolated tests is that the tests are order independent. If I want to grab a subset of tests and run them, then I can do so without worrying that a test will break now because a prerequisite test is gone.

      The tl;dr is that this anecdote supports the idea presented in this post: the main point of unit testing is that tests should be self-contained and isolated from one another, and there’s nothing in it about isolating code units from one another. There’s a later post from him which doubles down on this, where he says tests should be structure-insensitive:

      Structure-insensitive — This can be a challenge for unit tests. Too much mocking, especially strict mocking, is a structure sensitivity nightmare.

      It’s worth noting that in TDD By Example, there is absolutely no mocking of any kind, and mocking as a practice came later. But, Kent Beck also provided this advice in a post about Smalltalk testing patterns:

      I recommend that developers write their own unit tests, one per class.

      Which is pretty hard to do in practice without some kind of test double. It also begs the question, aren’t there some classes which are private / implementation details? Do those need their own tests too?

      Anyway, this is another entry in a long saga of words being complicated, people ignoring history, and history ultimately being irrelevant because the colloquial definition of “unit test” has solidified. Ask almost any working developer what it means, and they will say that the unit is a function or class.

      I agree very much with the original definition for what it’s worth, and especially the recommendation in the post here, which is to focus on user behavior and allow the implementation details to be abstracted in the test. This naturally leads towards integration testing, since real user behavior always touches all parts of the stack.

      As a thought experiment, picture moving your application from Rails to Node.js, or from a server-rendered app to a single-page app. Or picture adding a complicated data-fetching and caching layer to the frontend where one didn’t exist before. How much of your test suite would have to change to get there? If you have to rewrite the test suite during each of these changes, what’s the point of it?

      1. 3

        As a thought experiment, picture moving your application from Rails to Node.js, or a server-rendered app to a single-page app.

        I like the universal “neural network” metaphor here: suppose your entire app is replaced with a neural network which just conjures an answer out of a web of inscrutable matrices. Can you re-use your test suite to check that the application still behaves correctly?
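
        A rough sketch of what “re-using the test suite” could look like, assuming the app can be squeezed behind one black-box interface. The `APP` signature, the `handle` function, and both implementations are made up for illustration; the point is only that the suite is written once and never looks inside.

        ```ocaml
        (* Hypothetical black-box interface: this is all the tests may see. *)
        module type APP = sig
          val handle : string -> string
        end

        (* One implementation: straightforward pattern matching. *)
        module Direct : APP = struct
          let handle = function
            | "ping" -> "pong"
            | "hello" -> "world"
            | _ -> "unknown"
        end

        (* A structurally different implementation: a lookup table. *)
        module Table : APP = struct
          let table = [ ("ping", "pong"); ("hello", "world") ]

          let handle req =
            match List.assoc_opt req table with
            | Some resp -> resp
            | None -> "unknown"
        end

        (* The suite is written once, against the interface only. *)
        module Suite (A : APP) = struct
          let run () =
            assert (A.handle "ping" = "pong");
            assert (A.handle "hello" = "world");
            assert (A.handle "nope" = "unknown")
        end

        module Direct_suite = Suite (Direct)
        module Table_suite = Suite (Table)

        (* The same assertions run against both implementations. *)
        let () =
          Direct_suite.run ();
          Table_suite.run ()
        ```

        Swap in any third implementation of `APP`, inscrutable matrices included, and `Suite` applies unchanged.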

        1. 2

          Yes, exactly (I’ve always been a big fan of your How to Test post :D).

          The way to do this is to have an abstraction over your entire application. You see this a lot in verification, where one of the most popular representations of a program is a state transition system, something like:

          type ('e, 's) program = 's -> 'e -> 's
          
          let exec p s es = List.fold_left p s es
          

          i.e., your system is a state type and an event type, and its progress is defined as a function of the current state and the incoming event that produces the next state. Go into any more detail than this, and you start to commit to implementation details.

          (Btw, this definition is almost the exact one that seL4 uses in their verification effort; theirs is slightly more complicated to allow for non-deterministic execution. And it’s also pretty much exactly the Elm Architecture / Redux.)
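
          To make that concrete, here’s a minimal sketch of a test written against that shape. The counter system, its `event` type, and `step` are invented for illustration; the test only ever mentions an initial state, a list of events, and the resulting state.

          ```ocaml
          (* A hypothetical system: a counter that never goes below zero. *)
          type event = Incr | Decr | Reset

          (* The transition function: current state -> event -> next state. *)
          let step (s : int) (e : event) : int =
            match e with
            | Incr -> s + 1
            | Decr -> max 0 (s - 1)
            | Reset -> 0

          (* Same shape as exec above: fold the events over the initial state. *)
          let exec p s es = List.fold_left p s es

          (* The assertions talk about events and observable state only,
             never about how [step] is implemented. *)
          let () =
            assert (exec step 0 [ Incr; Incr; Decr ] = 1);
            assert (exec step 0 [ Decr; Decr ] = 0);
            assert (exec step 5 [ Incr; Reset; Incr ] = 1)
          ```

          Rewriting `step` however you like leaves these assertions untouched, as long as the behaviour is the same.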

      2. 1

        It also begs the question, aren’t there some classes which are private / implementation details? Do those need their own tests too?

        Pet-peeve: languages and frameworks that make it hard to test stuff because it’s “private”. I don’t know about you, but I can’t write non-trivial code correctly unless I test it.

        1. 1

          I agree, this is one thing I think Go gets right. You can test from either within or outside of a package, and when testing within a package you can directly reference private symbols.

          I meant more that if you want to create an interface for your application, but also adhere to the “one test suite per class” rule, then you can’t create classes outside of the external interface without testing them. Basically, you should be able to test whatever you want, sure, but you shouldn’t have to test something that you don’t want to.

          1. 1

            I’m not sure what you’re saying. Are you saying that if you’re writing a test suite for a class X you can’t instantiate objects of another class except through the public interface of X?

            That seems like a very strict reading of the advice. I’d interpret the advice as saying that you shouldn’t try to test class Y in the test suite for class X, but you can use it if you want.

            1. 1

              The advice says:

              I recommend that developers write their own unit tests, one per class.

              So I took that to mean that if you have X and Y, you must test both. And there are cases where I wouldn’t want to test Y, let’s say.

    2. 1

      I’ve been looking for something to quote about what the unit really is, thanks. The video linked within the article starts off with the quote.

    3. 1

      Doesn’t this concept entirely depend on context, though?

      I write/maintain two libraries in two different languages (with ideas/rumblings of a third, in a third language).

      In this case, the “units” that have behaviour to be tested are the public methods and functions, by nature of how a library works.

      I think in general the common understanding of what unit test means still “works”, just at a different level.

      Yes, in the hypothetical scenario where you make some massive refactoring change which is supposed to have zero change on the behaviour or functionality, you’ll need to work on new/changed tests too.

      But that’s definitely not the only kind of change that happens, and I’d argue it’s the least common these days.

      If you have unit tests for every public method of a class, and you refactor the internal logic of every one or several or all the public methods of a class, but you don’t change their signatures and don’t add/remove any methods, your tests are fine.

      The most common change I’d imagine though, is changing/adding features/behaviour - at which point you need to work on the test anyway.

      1. 2

        I like the term “change detector tests”. It sounds bad to a lot of people, but there are some contexts where you do indeed want to detect changes. Library code is one. Also, in very mature applications it’s often desirable to add friction to existing components.

    4. 1

      Heh I always thought the “unit” must originally refer to physical units when you sanity check your equations. For example velocity should be m/s. But maybe that explanation is too cute for its own good.