1. 17

  2. 28

    “Considering the factors mentioned above, we can reason that unit tests are only useful to verify pure business logic inside of a given function. Their scope does not extend to testing side-effects or other integrations because that belongs to the domain of integration testing.”

    Here’s the trick. It’s painful to unit test non-pure business logic or elaborate integrations, so pain in testing gives you a hint that your design could be better if you manage to avoid those. It’s worth checking to see if you can. In the example, the author has already settled on a design and discovered that it is hard to test, but the additional work of thinking about alternative designs that are more easily testable and have better cohesion and less coupling doesn’t appear to have been done. Sometimes it works, sometimes it doesn’t, but it’s great that testability gives us feedback about design.

    In this case, I’d look into whether this can be done with tell rather than ask semantics. With objects, ask calls tend to increase coupling. ( https://martinfowler.com/bliki/TellDontAsk.html )
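
    A minimal sketch of the difference, assuming a hypothetical account class (the names `AskAccount`, `TellAccount`, and `withdraw_ask` are illustrative and not from the article). In the ask version the rule lives in the caller; in the tell version it lives in the object, so it can be tested through one method.

```python
class AskAccount:
    """Ask style: callers read the state and make the decision themselves."""

    def __init__(self, balance: int) -> None:
        self.balance = balance


def withdraw_ask(account: AskAccount, amount: int) -> bool:
    # The rule lives in the caller, so every caller is coupled to `balance`.
    if account.balance >= amount:
        account.balance -= amount
        return True
    return False


class TellAccount:
    """Tell style: callers say what they want; the object applies the rule."""

    def __init__(self, balance: int) -> None:
        self._balance = balance

    def withdraw(self, amount: int) -> bool:
        # The rule lives next to the data it protects.
        if self._balance >= amount:
            self._balance -= amount
            return True
        return False
```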

    1. 9

      Per Kent Beck: “I get paid for code that works, not for tests, so my philosophy is to test as little as possible to reach a given level of confidence” (https://stackoverflow.com/questions/153234/how-deep-are-your-unit-tests/)

      If your unit tests aren’t adding value, don’t write them.

      1. 4

        Ten or twenty years from now we’ll likely have a more universal theory of which tests to write, which tests not to write, and how to tell the difference.

        I don’t see any progress made in the last 12 years. How do you determine the value of a unit test you are considering writing?

        1. 8

          There’s research on defect rates, what influences them, where to find them, but most software developers aren’t aware of it. A sample:

          • S. M. Olbrich, D. S. Cruzes, and D. I. K. Sjoberg, “Are all code smells harmful? A study of God Classes and Brain Classes in the evolution of three open source systems,” in 2010 IEEE International Conference on Software Maintenance, Timișoara, Romania, Sep. 2010, pp. 1–10, doi: 10.1109/ICSM.2010.5609564.
          • P. L. Li, M. Shaw, J. Herbsleb, B. Ray, and P. Santhanam, “Empirical evaluation of defect projection models for widely-deployed production software systems,” in Proceedings of the 12th ACM SIGSOFT twelfth international symposium on Foundations of software engineering - SIGSOFT ’04/FSE-12, Newport Beach, CA, USA, 2004, p. 263, doi: 10.1145/1029894.1029930.
          • Paul Luo Li, J. Herbsleb, and M. Shaw, “Forecasting Field Defect Rates Using a Combined Time-Based and Metrics-Based Approach: A Case Study of OpenBSD,” in 16th IEEE International Symposium on Software Reliability Engineering (ISSRE’05), Chicago, IL, USA, 2005, pp. 193–202, doi: 10.1109/ISSRE.2005.19.
        2. 4

          If your unit tests aren’t adding value, don’t write them.

          Or throw them away.

        3. 7

          TBH this whole article rests on the assumption that units have to be extremely small to count as unit tests. That obviously will lead to brittle and tautological tests, so don’t do that.

          1. 4

            There is no formal definition of what a unit is or how small it should be, but it’s mostly accepted that it corresponds to an individual function of a module (or method of an object).

            Testing individual methods makes little sense to me. Usually there is some protocol for using multiple methods of a class. I would not test the File class methods open, read, write, close individually for example.

            1. 2

              I would not test the File class methods open, read, write, close individually for example.

              To clarify, do you mean you wouldn’t test those methods in code that uses a file class library or that you wouldn’t test each one in the library itself that defines the file class?

              I wouldn’t (directly) test, say, Python’s pathlib library in calling code, but I would definitely test those methods individually if I were writing pathlib. (I should add that I’m an amateur, so if you have reasons not to test individually, I’d love to hear them. My question is not rhetorical.)

              1. 2

                It isn’t even possible to test close without doing open, so you must do a combined test.

                Methods often come with expectations in what order you use them. You can only read/write/close after open. A getter returns whatever a previous setter got.
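
                A minimal sketch of testing the protocol rather than each method, using Python’s built-in `open` on a temporary file. The helper name `roundtrip` is made up for illustration.

```python
import os
import tempfile
from typing import Tuple


def roundtrip(text: str) -> Tuple[str, bool]:
    """Exercise open -> write -> close -> open -> read as one protocol."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        f = open(path, "w", encoding="utf-8")
        f.write(text)
        f.close()
        closed_after_close = f.closed  # only observable because open() came first

        with open(path, encoding="utf-8") as g:
            read_back = g.read()  # read() returns whatever the earlier write() stored
        return read_back, closed_after_close
    finally:
        os.remove(path)
```

                One combined test such as `assert roundtrip("hello") == ("hello", True)` covers all four methods in the order they are meant to be used.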

                1. 1

                  Thanks, that makes sense. I didn’t focus enough on individually.

          2. 6

            I like to say, the real bottom layer of the test pyramid is type checks.

            1. 2

              And when your language has a strong enough type system, the unit test layer almost gets completely subsumed.

            2. 3

              Quite a reasonable article. With which I disagree in detail, but detail is detail.

              I think he’s right, unit testing is overrated, and I think I know why: It’s amazingly good for the things it handles well (“units”). And so people use it for purposes that other kinds of testing would handle better, and some of them notice that it doesn’t work very well, and draw the wrong conclusion.

              If you try really hard you can call anything a “unit”. See truths 3, 8 and 10.

              1. 3

                The problem in this scenario isn’t unit testing, it’s the units themselves. Your business logic code shouldn’t be calling services, it should be making decisions about values. Calling services to get the values beforehand is glue code.
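
                A rough sketch of that split, assuming a hypothetical discount rule (`Order`, `discount_cents`, and `apply_discount` are invented names, not from the article). The decision is a pure function over values; the glue fetches the values and passes the result on.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Order:
    total_cents: int
    customer_is_member: bool


def discount_cents(order: Order) -> int:
    """Pure decision: values in, value out, nothing to mock."""
    if order.customer_is_member and order.total_cents >= 10_000:
        return order.total_cents // 10
    return 0


def apply_discount(
    order_id: str,
    fetch_order: Callable[[str], Order],
    save_discount: Callable[[str, int], None],
) -> None:
    """Glue: fetch the values, call the decision, hand the result on."""
    order = fetch_order(order_id)
    save_discount(order_id, discount_cents(order))
```

                Unit tests then target `discount_cents` with plain values, and `apply_discount` is thin enough to leave to an integration test.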

                1. 2

                  It doesn’t even matter if a few layers deep there’s a bug in some part of the system, as long as it never surfaces to the user and doesn’t affect the provided functionality.

                  Am I misunderstanding or does this ignore the fact that bugs “a few layers deep” can be the cause of security holes or exploit vectors?

                  1. 6

                    I think the point is that as long as you can’t trigger the bug through the interaction surface of the program (rather than plugging a mock into raw flesh), it doesn’t matter.

                    1. 1

                      There is something to be said for protecting that invariant with a precondition on these functionalities. When people start to actually use it, they will encounter the precondition, verify the implementation, and then fix it or simply remove the precondition.

                      Slightly off-topic, but that is also why I am never entirely comfortable with generics in current languages (especially C++ templates). The ways to express invariants seem more limited and error-prone (also, because the code is generic, you cannot really “unit test” it extensively).

                  2. 2

                    I find that I can get around 80% of the value of tests with around 20% of the number of tests (vs. what would be considered full coverage). Every project is different of course, but generally for me I do fairly detailed testing on sections of code that are complicated, or would have really unfortunate side effects if they had a bug. Everything else I cover with broad integration-style testing that catches regressions and stuff like that. My goal is to have “most” of the code executed in some way when the tests run, but it doesn’t have to be exhaustive to be worthwhile. Does that catch every bug? No, but it’s probably an order of magnitude less work than when I’ve attempted full coverage.

                    Disclaimer: My code rarely works with money and stuff like that, and is usually part of a startup or early product that still needs to be proven worthwhile; in those cases it’s usually the right call to trade some reliability for product iteration speed.

                    1. 1

                      Unit tests rely on implementation details

                      The unfortunate implication of mock-based unit testing is that any test written with this approach is inherently implementation-aware. By mocking a specific dependency, your test becomes reliant on how the code under test consumes that dependency, which is not regulated by the public interface.

                      This is absolutely false.

                      If your unit tests rely on implementation details, you’ve made a mistake. More specifically:

                      By mocking a specific dependency, your test becomes reliant on how the code under test consumes that dependency, which is not regulated by the public interface.

                      Your code can only consume that dependency via its public interface. This is the whole point of DI in this sense. You are explicitly stating what your code depends on, and that is constrained by the public interface of the objects you inject.

                      To work around this, you would normally inject spies, which is a type of mocked behavior that records when a function is called, helping you ensure that the unit uses its dependencies correctly. Of course, when you not only rely on a specific function being called, but also on how many times it happened or which arguments were passed, the test becomes even more coupled to the implementation.

                      Asserting that a collaborator is invoked is not the same as coupling to implementation. If I’m testing an object responsible for coordinating the steps of a user signup, say, and one of those steps is to send an email, testing that the EmailService.send is called with the expected arguments is part of the contract of that object.

                      Again, the whole point of the DI is to limit and specify the only way you are allowed to send an email. A test that the email service object is called doesn’t care how the application code accomplishes this (that would be coupling to impl detail) but of course it needs to assert that it was called. The big benefit here is that we have conceptually reduced “sending an email” to “invoking a method on an object”. This is a feature, not a bug.

                      Now you can separately have an integration test of your email service object to ensure that it actually sends email.
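
                      A minimal sketch of such a test using Python’s `unittest.mock`. The `SignupCoordinator` class, its `sign_up` method, and the `send(to=..., subject=...)` signature are assumptions made for illustration; only the idea of asserting on `EmailService.send` comes from the comment above.

```python
from unittest.mock import Mock


class SignupCoordinator:
    """Hypothetical coordinator; the email service is injected."""

    def __init__(self, email_service) -> None:
        self._email_service = email_service

    def sign_up(self, address: str) -> None:
        # Other signup steps are omitted in this sketch.
        self._email_service.send(to=address, subject="Welcome")


def test_sign_up_sends_welcome_email() -> None:
    # spec=["send"] restricts the mock to the one method the contract names.
    email_service = Mock(spec=["send"])
    SignupCoordinator(email_service).sign_up("ada@example.com")
    email_service.send.assert_called_once_with(
        to="ada@example.com", subject="Welcome"
    )
```

                      The assertion pins down the call made on the injected collaborator, not how `sign_up` gets there internally.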

                      1. 1

                        Every damn blog post whinging about how unit testing and TDD don’t work… reminds me of reviews by people who don’t follow a recipe and then complain that it sucks.

                        I’ve been doing TDD and Unit Testing for years.

                        It works and works well.

                        But I have been learning.

                        Look at my early tests… They suck. They hurt.

                        So I changed what I did to remove the pain.

                        I read books and articles on how to do it better.

                        It stopped hurting and my code (not just my tests) improved.

                        1. 1

                          @pushcx commenting in the text area