1. 27

  2. 12

    Yet what I found even more troubling was that in order to write effective tests, a programmer had to know all of the ways that a piece of software could fail in order to write tests for those cases. If she forgot the square root of -1 was undefined, she’d never write a test for it. But, obviously, if you knew where your bugs were, you’d just fix them. Errors typically hide in the places we forget to look.

    I used to think like this, but then I realised that tests are not about catching every kind of issue that could occur. The greatest value of tests is that they ensure that stuff that once worked continues to work. If someone makes a change and it breaks something that was working before, the tests will catch that. If you find a new bug that wasn’t covered by the tests, write a new test that fails on that bug. Now that bug won’t go unnoticed again.
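
    A minimal sketch of that workflow in Python, in case it helps. Everything here is an assumption for illustration: `mean` is a hypothetical function, and the "bug" is an imagined earlier version that crashed with `ZeroDivisionError` on an empty list.

    ```python
    import unittest


    def mean(values):
        """Hypothetical function under test (not from the article).

        An imagined earlier version divided by len(values) unconditionally
        and crashed with ZeroDivisionError on an empty list; this fixed
        version rejects that input explicitly.
        """
        if not values:
            raise ValueError("mean() needs at least one value")
        return sum(values) / len(values)


    class MeanRegressionTest(unittest.TestCase):
        def test_known_good_case_still_works(self):
            # Guards behaviour that already worked before the bug was found.
            self.assertEqual(mean([2, 4]), 3.0)

        def test_empty_input_bug_stays_fixed(self):
            # Written when the bug was reported. Against the old code this
            # errors out with ZeroDivisionError; it passes once the fix lands.
            with self.assertRaises(ValueError):
                mean([])
    ```

    Running it with `python -m unittest` exercises both cases. The second test is the one that keeps the bug from quietly coming back.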

    1. 7

      Absolutely. The other thing? It is much easier to debug into a unit test than into application code that will only be called under very specific circumstances. So the “write a new test that fails on that bug” is the core of my debugging strategy.

      1. 1

        So regression tests, essentially.

      2. 5

        This is an interesting perspective. Especially the part where some people think writing some tests slows things down.

        IMO tests are a safety harness: they don’t always work, but they allow you to move faster without the PTSD that the author describes in their post. Tests are not 100% bulletproof, but they aren’t exactly useless either.

        1. 5

          I once stood in astonishment when I asked an engineer that moved within Google to Android how he liked his new team: “It’s awesome! We don’t have to write our own tests [like engineers on regular Google products do], so you write code so much faster.” When I immediately told him about my bootlooping Nexus 5x, his only response was a concerned “oh, it shouldn’t do that.”

          1. 3

            Wasn’t the 5x a hardware failure though?

            1. 1

              Most of the issues around the 5X were due to issues with heat dissipation, yup. Overheating would cause things to fail pretty badly; most of the time it meant the phone was CPU-throttled to keep heat down, but sometimes it would heat up so quickly on boot it would trigger a watchdog which would power down the phone.

            2. 3

              I used to work on Android, and I definitely had to write unit tests for some of the stuff I wrote. The tests we didn’t write were integration and UI tests; instead, we would write specs for QA people to run through on a daily or weekly basis. I would agree with your friend, though: not writing UI tests but still getting the benefits of having the UI tested was 100% a delight.

          2. 1

            It sounds like they could have benefited from Erlang’s “let it crash” philosophy. This scenario might have been avoided if the existing code hadn’t ignored exceptions the way it did. Had they not caught the exception and done nothing, the bug would not have remained in the code undetected. It would have been caught during development or QA and addressed sooner rather than later. And even if the bug had made it to production, it would have been much quicker to find and fix had the exception not been caught and ignored.
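
            A rough Python sketch of the difference, assuming a hypothetical `parse_port` helper (none of these names come from the article). It only shows the fail-fast half of the idea; Erlang pairs it with supervisors that restart the crashed process, which is not modelled here.

            ```python
            def parse_port(raw: str) -> int:
                """Hypothetical helper: convert a config value to a port number."""
                return int(raw)


            def load_port_swallowing(raw: str) -> int:
                # Anti-pattern: the failure is caught and ignored, so a bad value
                # silently turns into a default and the bug stays invisible.
                try:
                    return parse_port(raw)
                except Exception:
                    return 0


            def load_port_failing_fast(raw: str) -> int:
                # No handler: a bad value raises ValueError right here, where
                # development or QA is likely to notice it.
                return parse_port(raw)
            ```

            With the input `"80a"`, the first version quietly returns 0 while the second raises `ValueError` and points straight at the bad value.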

            1. 0

              Incredible! So, uh, why did it catch on fire? Maybe I missed it in the article. I know why the servers would run hot. Now we gotta get from hot servers to fire.

              1. 4

                I lean towards hyperbole. Also, the photo seemed to be a stock photo of a fire fighting exercise: the thing burning looked like a mock airplane.

                1. 1

                  Makes sense. I was hoping for it to be literal so I have another case study on the THERAC list of disasters. Oh well.