1. 11
  1.  

  2. 5

    Yeah, people invent terms to describe types of testing with the hope of having the terms used across projects and companies. The trouble is that different “thought-leaders” invent their own terms and every project or company follows a different thought-leader, which means that different bubbles of programmers use different definitions for the same term.

    In a situation like this, it is reasonable to just drop the term and describe the type of testing using a hyphenated phrase.

    I appreciate you sharing this relatively uncommon thought!

    1. 4

      Any time I hear this sort of debate I’m inclined to show hwayne’s excellent test terminology post (with Lobste.rs discussion no less).

      This doesn’t counter the author’s conclusion - we need to talk about what the test is supposed to accomplish, not which boxes it checks.

      1. 1

        Hmm. Not really happy about this.

        Let’s take a step back.

        I often liken testing to building a ship in a bottle.

        You have to explore and manipulate the entire contents of the bottle, only through the interface, the mouth of the bottle.

        The larger and more complex the inside of the bottle, the harder your task is.

        The more constrained your interface (narrower the mouth), the harder your task is.

        Yet to verify your code will work for all valid uses, you have to explore the entire inside of the bottle.

        Now if your function calls another function….. it’s like building a ship in a bottle, that is inside another bottle.

        Orders of magnitude harder problem.

        If you have a really simple bottle with a wide mouth…. it’s all easy.

        But for anything even slightly more complex, the only way to verify the correctness of the code is to go for the smallest possible unit.

        What is the smallest possible unit?

        The smallest thing you can throw at your toolchain and produce a runnable executable test.

        “Why are you running a whole web server?”

        No. I won’t chide you for…

        “not doing real unit testing, for failing to use the correct magic formula”

        I will do worse.

        I will merely ask you how did you explore the whole state space of your code? Are you sure you have verified your code will work when it is invoked with any parameters and any internal state that it is contracted to work for?

        If you answer Yes, and you can explain how… I’m happy.

        If you mumble and look at your feet and mutter about the number of atoms in the known universe, I’m less happy.

        If your tests take longer than the average interval between commits to the repo, so you cannot cleanly decide whose commit broke the tests….. I’m getting quite disgruntled.

        If your tests break sporadically so it seems as if my commit broke the build…. I’m actually getting quite pissed off.
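To make "explain how" concrete, here is a minimal sketch of exhaustively exploring the state space of a very small unit using only its public interface. `SaturatingCounter` and `explore` are hypothetical names invented for illustration; the point is only that a tiny unit with a tiny state space can be walked completely, which stops being feasible as the bottle grows.

```python
from collections import deque


class SaturatingCounter:
    """Hypothetical unit under test: a counter clamped to the range 0..limit."""

    def __init__(self, limit=3):
        self.limit = limit
        self.value = 0

    def increment(self):
        self.value = min(self.value + 1, self.limit)

    def decrement(self):
        self.value = max(self.value - 1, 0)


def explore(limit=3):
    """Breadth-first walk of every state reachable through the public methods.

    Returns a dict mapping each reachable state to the shortest sequence of
    calls that reaches it, and asserts the invariant in every state visited.
    """
    ops = ("increment", "decrement")
    paths = {0: ()}  # state -> shortest call sequence from a fresh object
    queue = deque([0])
    while queue:
        state = queue.popleft()
        for op in ops:
            counter = SaturatingCounter(limit)
            # Replay the path from the initial state: only the interface is used.
            for step in paths[state] + (op,):
                getattr(counter, step)()
            assert 0 <= counter.value <= limit, (state, op)
            if counter.value not in paths:
                paths[counter.value] = paths[state] + (op,)
                queue.append(counter.value)
    return paths
```

With `limit=3` there are only four states, so the walk finishes immediately; the same approach applied to a unit holding megabytes of state would never finish, which is the argument for keeping units small.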

        1. 1

          “Now if your function calls another function….. it’s like building a ship in a bottle, that is inside another bottle.

          Orders of magnitude harder problem.”

          I disagree: calling other functions makes our life easier, since we can delegate responsibilities and compartmentalise the problem. The ‘interface’ of our functions (parameters, calling convention, side-effects, error conditions, etc.) should be all we need to use them effectively. Checking that the interface is adhered to isn’t the caller’s responsibility; that’s the responsibility of that function’s own tests (calling a function in a test does check some of its implementation, but that’s just a bonus).

          In other words, don’t build ships-in-bottles-in-bottles: build ships-in-bottles, then push one inside another (and take it back out for changes/fixes).
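A small sketch of that idea, with made-up function names (`parse_cents` and `order_total` are illustrative assumptions, not from any real codebase): the inner function gets its own tests, and the outer function's tests lean only on the inner function's documented interface.

```python
def parse_cents(text):
    """Inner bottle: convert a non-negative price string like '12.50' to integer cents."""
    dollars, _, cents = text.strip().partition(".")
    # Pad or truncate the fractional part to exactly two digits.
    return int(dollars) * 100 + int((cents + "00")[:2])


def order_total(prices):
    """Outer bottle: relies only on parse_cents' interface, not on how it parses."""
    return sum(parse_cents(p) for p in prices)


def test_parse_cents():
    # The inner function's own tests carry the burden of checking its edge cases.
    assert parse_cents("12.50") == 1250
    assert parse_cents("3") == 300
    assert parse_cents(" 0.5 ") == 50


def test_order_total():
    # The outer tests only need to cover the outer logic: aggregation.
    assert order_total([]) == 0
    assert order_total(["1.00", "2.25"]) == 325
```

This works cleanly here because neither function holds hidden state; the interface really is the whole story.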

          1. 2

            Certainly the path to building defect free systems is to assemble them out of defect free components.

            However, to verify a function you need to be able to explore its state space, i.e. its global, static and instance variables, and you can only do so via the parameters of the methods that manipulate that state.

            This is why functional programming folks hate hidden mutable state.

            Your program needs to work for every value of internal state that the user can reach from the initial state.

            But verifying that is hard.

            As an extreme example, I have a gadget on my desk with a couple of buttons and an RF interface.

            Its internal state runs to multiple megabytes, i.e. no matter how much or how long you tested via those couple of buttons… you would still have explored only an absolutely negligible fraction of the states that the user may legitimately access.

            Or to go back to the bottle analogy….

            It all depends on the shape of the bottle.

            If you have a wide-mouthed, shallow bottle (i.e. the interface is as broad as or broader than the state space inside the function), yes, you are right.

            If you have a narrow-mouthed, deep bottle, and your inner bottle is also narrow-mouthed and deep… a substantial part of the state of your outer bottle resides inside the inner.

            Thus to verify your outer bottle, you have to also explore all parts of the inner bottle that may be reached via the mouth of the outer.

            And that is very hard.

            An alternative approach is to use contract tests to verify that the outer and inner functions agree on their interface. I typically use this approach for services like I/O, timers, threading primitives etc.
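As a rough illustration of a contract test for a timer-like service, the sketch below runs the same checks against a real clock and against the fake used in the outer function's tests. All names (`SystemClock`, `FakeClock`, `check_clock_contract`, `elapsed`) are hypothetical, and the contract chosen here (`now()` returns a float that never goes backwards) is an assumption made for the example.

```python
import time


class SystemClock:
    """Real service: wraps the standard library's monotonic clock."""

    def now(self):
        return time.monotonic()


class FakeClock:
    """Test double that stands in for the real clock in the outer function's tests."""

    def __init__(self):
        self._now = 0.0

    def now(self):
        return self._now

    def advance(self, seconds):
        self._now += seconds


def check_clock_contract(clock):
    """Contract every clock must honour: now() is a float and never goes backwards."""
    first = clock.now()
    second = clock.now()
    assert isinstance(first, float)
    assert second >= first


def elapsed(clock, start):
    """Outer function: depends only on the contract above, not on a concrete clock."""
    return clock.now() - start


def test_contract_holds_for_both_clocks():
    # If the fake ever drifts from the real interface, this is where it shows up.
    check_clock_contract(SystemClock())
    check_clock_contract(FakeClock())
```

Because both implementations pass the same contract, tests of `elapsed` can drive a `FakeClock` deterministically without exploring the real clock's internals.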

            Of course, the best approach is to minimize the amount of mutable hidden state….

        2. 1

          I’ve run across similar issues too, which I wrote a blog post about.

          To me, the main “camps” seem to be domain-based vs. implementation-based. For example, the implementation-based perspective maps testing terminology on to programming language features: a “unit” is a method, a “dependency” is a class, etc. The domain perspective is based on semantically meaningful divisions, e.g. a “unit” might be “logging in”, since we can’t “half log in” regardless of how much code is involved in the process.

          Cross-talk between people using different definitions ends up taking a practice that is reasonable under one definition and applying it where it’s less appropriate (e.g. that all “dependencies” should be mocked, which gets ridiculous if we take “dependency” to mean “any class”).