1. 15
  1.  

    1. 18

      This post brought to mind a particularly egregious case of developers avoiding writing tests due to time pressure, which was the first team I joined at AWS. I had worked at a few startups and smaller companies beforehand, and I was excited to see how things were done at a “real company.” The team I joined was just about to launch a product, and was excited to have a startup person on board. The product was not particularly challenging technically, but it had to be very flexible, and thus had a lot of customization options, which led to a ton of permutations in the frontend.

      And… not a single frontend integration test, or really test of any kind in the frontend. I joined this team to help them launch, because they were so bogged down with bugs and issues that they needed anyone (including experienced backend devs who have no business writing frontend code, like myself) on the problem to help them out. The same issues kept cropping up. I fixed the same bug three weeks in a row because at the end of each sprint it would be broken again, because someone else had fixed a bug in a different component. Of course, I was horrified and, having worked closely with frontend developers previously, knew that even though integration tests were a massive pain (as compared to backend integ tests) they were absolutely critical – especially for complicated interactions with a lot of permutations.

      But the deadline was there. We just needed to make the deadline, then we could add tests. Management understood; they understood that fixing the same bug three times in a row was taking a significant amount of time. We hired 6 more engineers. Now we had to train the new engineers, which was adding 1/3 to the size of the team which was already having a hard time coordinating who was fixing what. So we never added tests, even as the first “Must Hit Deadline” flew by and the second “For Sure Last Deadline” was coming up, with the only thing holding back our launch being a buggy-ass frontend.

      The end of the story is that the team tried to launch a super buggy product; it was pulled down, a new, new deadline was decided upon, and I left the team for greener pastures elsewhere in the company. And that is when I learned that our entire industry seems to be built on this idea that there just simply isn’t enough time to build a good product.

      1. 25

        We hired 6 more engineers. Now we had to train the new engineers, which was adding 1/3 to the size of the team which was already having a hard time coordinating who was fixing what.

        If you write this down three times, I believe Fred Brooks will rise from the grave just to give a shameful head shake at the whole situation.

      2. 4

        that is when I learned that our entire industry seems to be built on this idea that there just simply isn’t enough time to build a good product.

        This connected some dots for me: it’s why some devs seem to have given up on the idea of shipping quality code.

        1. 4

          It is unfortunate that our industry breaks so many developers in that way. I also think it is a misunderstanding of the tradeoffs involved in engineering. Quality and time are not necessarily inversely correlated. For instance, having quality infrastructure can make developing new products faster than if that infrastructure had been hastily developed. Additionally, high quality code is quicker to work with and to extend with new features. I don’t think anybody would argue that a PoC should not be built rapidly, but once the concept has been proven I have never seen it rebuilt properly from the lessons learned, even though at that point quality determines the future speed of development.

          Personally, I think the real issue is that writing high quality code requires training, and training requires an investment in employees. With the average tenure of a software developer being something like 3-4 years, this investment is rarely seen as worth it. But I think the short tenure of so many software developers is not due to some inherent aspect of software engineers, but simply because so few jobs allow employees to properly grow and thrive; they are inundated with poor quality infrastructure and products (that they may have even built, because they weren’t properly trained).

          However, I don’t think it’s all doom and gloom. It’s a new industry and is going through growing pains, and I think once the market is saturated with crappy code then quality code will be rewarded and it should correct itself. I am hopeful we are reaching the inflection point, but it’s unfortunate how many good developers we will have lost who have fallen to the idea that you can’t ship quality code.

          1. 2

            Lovely comment, thank you for following up.

            I think my generation of developers (I’m 41) has done a great disservice to the younger generation of developers by allowing them to become so jaded and disconnected from both the beauty of the craft as well as the value they bring. I’m not sure how to rectify this, but I hear too many devs that simply do not believe that it is possible to satisfy business needs with decent (let alone very good) code.

            I like to think I’ve made a career out of that very thing.

      3. 2

        Your story seems the opposite of the hypothesis in the article: not writing tests prevented launch. The article describes a scenario where developers are able to launch code with bugs into production.

        I assume you had dedicated QA resources who were catching the bugs and preventing release?

        1. 9

          I think both are just about the systemic issues facing software development, where engineers are not incentivized to write tests. The author talks about missing sprint deliverables as looking bad for the individual developer, and my example is taking that same concept and extrapolating it up a level (missing the project deadline is going to look bad for the team). In both cases, tests are avoided because they are perceived as slowing down development. When in fact the exact opposite is the case: good tests speed up development by allowing safe, incremental refactoring and catching bugs ahead-of-time (which is always easier and faster to debug).

          1. 3

            good tests speed up development by allowing safe, incremental refactoring

            This is a just-so story we like to tell ourselves but I don’t know that it’s true.

            catching bugs ahead-of-time (which is always easier and faster to debug).

            Eh only if you write the correct tests AND you don’t have other systematic types of testing.

            1. 9

              The key word here, which is doing a lot of heavy lifting, is “good.” Bad tests will slow you down like nothing else; especially ones which mock the world, and thus any tiny change to the world requires fixing. A good test does one thing, and it does one thing only: it enforces a constraint that you are unable to enforce in the type system. Pure functions are nice because they make it really easy to test exactly what you want without having to mock out a bunch of shit.
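              A minimal sketch of what I mean (Python; the pricing function is a made-up example, not from any real codebase): a pure function needs no mocks, and the test pins down exactly one value-level constraint that the type system cannot express.

              ```python
              # A pure function: all inputs are parameters, the only effect is the return value.
              def discounted_price(price_cents: int, discount_percent: int) -> int:
                  """Apply a percentage discount, rounding down to whole cents."""
                  if not 0 <= discount_percent <= 100:
                      raise ValueError("discount must be between 0 and 100")
                  return price_cents * (100 - discount_percent) // 100

              # The test enforces a constraint the types can't: the arithmetic itself.
              def test_discounted_price():
                  assert discounted_price(1000, 25) == 750
                  assert discounted_price(999, 0) == 999
                  assert discounted_price(999, 100) == 0
              ```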

              If you have good tests, then any refactoring becomes changing what you want, and then seeing what constraints you have violated via the tests. Constraints that are no longer valid, delete the tests. The aim is to cover all of the solution space in a minimal number of tests, which compose together nicely. The closer you get to perfect coverage, the more bugs are caught in the net before they can make it into a commit message or into production.

              Unfortunately, good tests are hard work and not usually rewarded. I’ve been on teams with a requirement for testing, but never a team which required quality testing (and have fixed my fair share of bad tests by simply deleting them). Many people seem to think high line-coverage means good testing, but if it’s all garbage tests then it is actively harmful.

              1. 3

                Absolutely agreed. I’d add that the constraint being enforced must actually be one that is wanted in reality and not reflect incorrect assumptions about the world. I’m sure other people will add other criteria (as you say, “good” does a lot of work).

                Unfortunately, I don’t think you can mandate good tests, and I don’t think you can have an incentive structure in place specifically for good tests. I do think you can easily mandate or incentivize bad tests. Which is why I say that the approach suggested at the end of the article is very Taylorist.

                All I think you can do is have teams that agree to a certain level of cohesion and discipline and ask them to adopt policies and practices that work for them. (As you may guess I’m a big fan of “original” agile and not cargo cult agile)

                1. 3

                  Unfortunately, I don’t think you can mandate good tests

                  We can’t, but we may be able to mandate good software. Specifically, punish developers more for bugs than delivering late. Of course deadlines are still a thing at the end of the day, but it’s a problem when I’m being scolded for slowly producing (nearly) flawless code, while my colleagues are praised for quickly fixing their crap.

                  (Totally not sour grapes…)

                2. 3

                  Unfortunately, I don’t think you can mandate good tests, and I don’t think you can have an incentive structure in place specifically for good tests.

                  The incentive structure for good tests is simply the same incentive structure for high quality software. You can’t write good software if you aren’t verifying your constraints, and you’re not going to get very far doing that manually. Unfortunately… I don’t think many companies incentivize writing high quality code. Bad code written quickly sells.

                  I do think we will reach an inflection point where the market is saturated with poorly written products, and quality will dominate, but until then quality will only be found in the niches.

            2. 4

              This is a just-so story we like to tell ourselves but I don’t know that it’s true.

              I’ve worked in codebases with no test coverage, bad test coverage, and good test coverage. Absolutely no contest which one I’d rather work in.

              Eh only if you write the correct tests AND you don’t have other systematic types of testing.

              You can’t just do this robotically. You have to use testing as a tool to improve your code, not as a box to check or a coverage percentage to hit. And pushing this off on QA only lengthens the feedback loop, which makes your software worse.

            3. 3

              good tests speed up development by allowing safe, incremental refactoring

              This is a just-so story we like to tell ourselves but I don’t know that it’s true.

              I do. My sample size is tiny, but when I have a comprehensive test suite that I can run after every tiny little change, I can quickly and confidently release many tweaks and changes, while without that test suite I would just be incapable of changing anything without a significant risk of breaking something.

          2. 1

            You have to find a way to make it easy and intrinsic to the process to get full adoption. If writing tests is the easiest path to deploying, then all your code will have tests. I agree with your conclusions.

            I say this as a very much hypothetical answer, as I don’t have concrete procedures to enforce this… but it would be way cooler if I did.

        2. 3

          dedicated QA resources

          This is the worst thing, both the literal phrase and the QA handover.

    2. 8

      Comments here remind me of a quote about testing under time pressure from Jamie Zawinski in Peter Seibel’s Coders at Work:

      Seibel: In retrospect, do you think you suffered at all because of that? Would development have been easier or faster if you guys had been more disciplined about testing?

      Zawinski: I don’t think so. I think it would have just slowed us down. There’s a lot to be said for just getting it right the first time. In the early days we were so focused on speed. We had to ship the thing even if it wasn’t perfect. We can ship it later and it would be higher quality but someone else might have eaten our lunch by then.

      There’s bound to be stuff where this would have gone faster if we’d had unit tests or smaller modules or whatever. That all sounds great in principle. Given a leisurely development pace, that’s certainly the way to go. But when you’re looking at, “We’ve got to go from zero to done in six weeks,” well, I can’t do that unless I cut something out. And what I’m going to cut out is the stuff that’s not absolutely critical. And unit tests are not critical. If there’s no unit test the customer isn’t going to complain about that. That’s an upstream issue.

      I hope I don’t sound like I’m saying, “Testing is for chumps.” It’s not. It’s a matter of priorities. Are you trying to write good software or are you trying to be done by next week? You can’t do both. One of the jokes we made at Netscape a lot was, “We’re absolutely 100 percent committed to quality. We’re going to ship the highest-quality product we can on March 31st.”

      1. 15

        Anyone who ever used Netscape can confirm that its quality was exactly what you would expect from this quote.

        1. 2

          And by using Netscape over its quality assured competitors, you probably proved his point :) At least we live in a time period where products evolve at a great speed, making looking back fun and insightful.

      2. 4

        If there is any situation where I really, really want tests, it is that kind of frantic situation where everyone tries to pile up as many features as they possibly can in a short amount of time. Not for every feature, but definitely for the core stuff. Just to avoid the situation where a user could not open the app anymore because someone changed the version number in the About dialog. And everyone was too busy to check it before shipping.
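        Something as small as this sketch would already catch that (Python; create_app and the version string are hypothetical stand-ins, not any real app):

        ```python
        VERSION = "2.3.1"  # hypothetical: imagine this string feeds the About dialog

        def create_app() -> dict:
            # stand-in for real app construction / startup wiring
            return {"name": "example", "version": VERSION}

        def test_app_starts_and_shows_version():
            app = create_app()
            assert app is not None
            # guards the "someone edited the version string and now startup breaks" case
            assert app["version"].count(".") == 2
        ```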

    3. 7

      With tests, I can refactor my code. Without tests, your code will get worse and worse because you don’t feel confident enough to refactor

      I wonder to what extent this is true. Poor quality tests that test that the implementation is what it is also inhibit refactoring.

      How do we know that codebases without tests are less refactorable? Not being refactored is evidence, but weak evidence - it might be a sign that refactoring is not needed.

      How do you get out of this? Tests need to be non-negotiable. Tests are part of being a professional engineer.

      How very Taylorist. The highly paid people who don’t do the work will tell the plebs exactly how they should do the work.

      1. 10

        How do we know that codebases without tests are less refactorable?

        Experience. Every large codebase I’ve seen has had a quagmire somewhere or other with the following properties:

        • It’s bad. In the sense of much more complex than it needs to be, and filled with bugs, and being poorly tested, and probably also being extremely difficult to test even if you wanted to (e.g. from mixing state and IO with logic).
        • Everyone knows it’s bad.
        • Everyone is afraid to touch it, because if you touch it stuff will break. The normal way you feel comfortable modifying code is either from tests, or from understanding it, but the quagmire supports neither.
        • Many people know some ways to make it less bad, but doing so would (i) require a ton of time, and (ii) likely break stuff in the process.

        Thus no one makes changes to the quagmire except for really local patches, which over time makes it even worse.

        1. 1

          Compare Postgres vs. SQLite’s respective willingness to undertake invasive refactorings, and then compare each project’s test coverage. Just one data point but I think it’s telling.

        2. 1

          Ok but that doesn’t tell us that codebases without tests have this property of being a quagmire. That tells us that many quagmires have no tests.

          In my experience useless tests can make this even worse.

          1. 3

            Yeah, there are two opposite failure modes: (i) not testing tricky things, and (ii) over-testing simple things, and tying tests to the implementation. Both are bad.

            EDIT: I’m having trouble telling if you’re arguing against having any tests at all. If you are, have you worked on a codebase with >10,000 LOC? At a certain point, tests become the only reliable way to make sure complex parts of the codebase work correctly. (It remains important not to test trivial things, and not to tie the tests too closely to the implementation.)

            1. 1

              I’ve got files with several thousand lines of code. Believe me when I say you have to test things, but automated tests are not necessarily valuable for all codebases.

        3. 1

          The problem is that you’re expecting the same developers who wrote the awful codebase to write the testing suite. Whatever the reasons for the terrible code (time pressure, misaligned incentives, good ol’ ineptitude), they still apply to the development of the tests.

          Meaning if the code is really bad, the tests are going to be really bad, and bad tests are worse than no tests.

      2. 3

        How do we know that codebases without tests are less refactorable?

        In my case, 20 years experience, 11 of which have involved decent-to-good test coverage (I don’t mean just line coverage, but also different types of tests to verify different aspects, such as load tests). In a well-tested code base, some tests will simply break, ideally within milliseconds, if you do anything wrong during a refactor. In an untested code base an unknown number of interactions will have to be manually tested to make sure every impacted piece of functionality still works as intended. And I do mean “unknown”, because most realistically sized code bases involve at least some form of dynamic dispatch, configuration, and other logic far removed from the refactored code, making even the job of working out which parts are impacted intractable.

        Having automated tests is such a no-brainer by now, I very much agree with the author. But the author isn’t telling developers they need to write tests, they are telling managers they need to allow developers time to write tests.

        1. 5

          Perhaps for a better A:B example:

          I have some things in LLVM that have good test coverage, and some that don’t, and some that do but only in a downstream fork. People routinely refactor the things with good test coverage without my involvement. They routinely break things in the second and third category, and the third is usually fairly easy for me to fix when I merge from upstream.

      3. 3

        How do we know that codebases without tests are less refactorable? Not being refactored is evidence, but weak evidence - it might be a sign that refactoring is not needed.

        I once had to upgrade an existing site (built by someone else) to newer versions of CakePHP because of PHP support (their hosting provider stopped supporting old PHP versions, and that version of CakePHP would break). The newer versions had completely overhauled the ORM. The code contained very tricky business logic for calculating prices (it was an automated system for pricing offers based on selected features). None of it was tested. In the end, I had to throw in the towel - too much shitty code and not enough familiarity with what was supposed to be the correct behaviour of that code.

        The company had to rewrite their site from scratch. This would not have been necessary if there had been enough tests to at least verify correct behaviour.

        In another example, I had to improve performance of a shitty piece of code that an ex-colleague of mine had written. Again, no tests and lots of complex calculations. It made things unnecessarily difficult and the customer had picked out a few bugs that crept in because of missing tests. I think I managed to “test” it by checking the output against a verified correct output that was produced earlier. But there were many cases where code just looked obviously wrong, with no good idea on what the correct behaviour would’ve been.

        1. 1

          In the first case it sounds like the core problem was business logic intertwined with ORM code, and secondarily a lack of indication of intent. Tests would actually have helped with both.

          And in fairness the second one also sounds like tests would be the solution.

          1. 3

            Indeed. Now to be fair, both were shitty codebases to begin with, and the same lack of discipline that produced such code resulted in the code not having tests in the first place. And if they had written tests, they’d likely be so tied to the implementation as to be kind of useless.

            But tests are what make outside contributions more possible by providing some guardrails. If someone doesn’t fully grok your code, at least they’d break the test suite when making well-intentioned changes. Asserts would also help with this, but at a different level.

      4. 3

        With tests, I can refactor my code. Without tests, your code will get worse and worse because you don’t feel confident enough to refactor

        I wonder to what extent this is true.

        True enough that my job the past 2 months was about porting a legacy component that meshes very poorly with the current system (written in C instead of C++, uses a different messaging system…). Having re-implemented a tiny portion of it I can attest that we could re-implement the functionality we actually use in 1/5th to 1/10th of the original code. It was decided however that we could not do that. Reason being, it is reputed to work in production (with the old system of course), and it basically has no tests. To avoid the cost of re-testing, they chose to port the old thing into the new system instead of doing it right.

        I’ve heard some important design decisions behind the Boeing 737 MAX had a similar rationale.

      5. 1

        Poor quality tests that test that the implementation is what it is also inhibit refactoring.

        Honestly it’s not that hard to make tests that test behaviors and contracts. I’d even say it’s harder to write tests that are so tied to the implementation that it makes refactoring hard.
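        A small illustration of the difference (Python; the cart is a made-up example): the first test pins the contract and survives refactoring, the second pins the implementation and breaks the moment the internal representation changes.

        ```python
        class Cart:
            def __init__(self):
                self._items = []  # internal detail; could become a dict without notice

            def add(self, name: str, price_cents: int) -> None:
                self._items.append((name, price_cents))

            def total_cents(self) -> int:
                return sum(price for _, price in self._items)

        # Behavior/contract test: any refactor that keeps totals correct keeps it green.
        def test_total_reflects_added_items():
            cart = Cart()
            cart.add("book", 1500)
            cart.add("pen", 200)
            assert cart.total_cents() == 1700

        # Implementation-coupled test: the kind that makes refactoring hard.
        def test_items_is_a_list_of_tuples():
            cart = Cart()
            cart.add("book", 1500)
            assert cart._items == [("book", 1500)]
        ```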

    4. 6

      Not writing tests is bad for developers, because it creates time pressure.

      More specifically:

      Tests can only tell developers they made a mistake. There is no gain at that moment. For developers under pressure, it’s better for bugs to be found in production than during development when business pressure is high.

      This is exactly the reason I found that test-after is bad. Test-after is subjectively and emotionally bad for exactly this reason. You’ve already written the code. You feel good. The test can only make you feel bad.

      1. I wrote some code. Yay!

      2a. I wrote a test and it passed. What a waste of time. Boo.

      2b. I wrote a test and it failed. I suck, my code sucks, testing sucks. Boo.

      There simply is no upside in test-after. And a lot of anti-test sentiment is anti-test-after.

      To avoid this, you absolutely need to write the test first. Then you only get good vibes.

      1. Hey, I wrote a test, and it’s red. Yay!
      2. Hey, I made the test green. Yay!
      3. Hey, I refactored the code, made it better and all the tests are still green. Yay!

      Does this mean I always write the test first? No. But most of the time I do, and not only is it better for me (the way a bitter medicine might be), but it also makes me feel so much better, and makes coding fun. Even under time pressure. No, scratch that: especially when you’re under time pressure.
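      As a minimal sketch of that loop (Python; the slugify function is just an invented example): the test is written first and starts out red, the implementation turns it green, and refactoring then happens with the test as a safety net.

      ```python
      import re

      # Step 1: written before slugify existed, so it started out red.
      def test_slugify():
          assert slugify("Hello, World!") == "hello-world"
          assert slugify("  spaces   everywhere ") == "spaces-everywhere"

      # Step 2: the simplest implementation that turns the test green.
      # Step 3: refactor freely; the test stays green as long as behavior is preserved.
      def slugify(title: str) -> str:
          words = re.findall(r"[a-z0-9]+", title.lower())
          return "-".join(words)
      ```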

      1. 6

        I feel differently.

        Turns out that writing Monocypher, my cryptographic library, was the first time I really really had to get it right, and I knew I couldn’t just wing it. So I wrote a nice automated test suite. And then bug reports started coming in, and I learned that my test suite was just full of holes. I quickly developed a process for dealing with bugs:

        1. Reproduce the bug, add a relevant test case.
        2. Fix the bug.
        3. Assess what kind of bug I got.
        4. Assess what kind of hole my test suite had.
        5. Scour the entire test suite for similar holes elsewhere, and plug them.

        I didn’t have that many bugs, but I’ve had enough (and enough close calls), that this project basically taught me how to make a proper test suite. Stuff like testing for every possible length, including zero, property based testing tricks, sanitisers… I’m still not a fan of TDD (which by the way wouldn’t have been nearly enough for Monocypher), and I still don’t write my tests first.
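        For a flavour of the “every possible length, including zero” habit, here is a toy sketch (Python with the hypothesis library; the PKCS7-style padding function is a stand-in, not Monocypher code):

        ```python
        from hypothesis import given, strategies as st

        BLOCK = 16

        def pad(msg: bytes) -> bytes:
            """Toy PKCS7-style padding: always pad, up to a whole number of blocks."""
            fill = BLOCK - (len(msg) % BLOCK)
            return msg + bytes([fill]) * fill

        def unpad(padded: bytes) -> bytes:
            return padded[:-padded[-1]]

        # Exercise every length from 0 upward, not just the "typical" ones.
        @given(st.binary(min_size=0, max_size=4 * BLOCK))
        def test_pad_roundtrip_and_alignment(msg):
            padded = pad(msg)
            assert len(padded) % BLOCK == 0
            assert unpad(padded) == msg
        ```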

        But I know deep in my guts that I cannot reasonably ship code I haven’t written an automated test suite for. So here’s what I actually feel:

        1. I wrote some code. Oh god I hope it works.
        2a. I wrote a test and it passed. Phew, this part works.
        2b. I wrote a test and it failed. Phew, glad this didn’t make it to production.
        3. I wrote an entire test suite and fixed all the bugs. Done at last.

        1. 4

          Getting hung up on test-first vs test-after is a waste of time in anything but the medium-to-long term.

          Testing is a dialectical relationship between test and implementation, and as long as you are constantly shifting your perspective from one side to the other, it doesn’t matter where you started.

          1. 2

            If test-after works for you: more power to you.

            However, it seems to rarely work well, and there are actually good reasons for that. In fact, they are the exact reasons given in TFA for why testing in general doesn’t work (well, a subset). It turns out that those reasons are not, in fact, reasons why testing doesn’t work in general, but why test-after doesn’t work. And pretty much invariably, when I interact with people who say that TDD doesn’t work, it turns out that they’re doing test-after and falling into the traps.

            1. 4

              The tendency of developers to view testing vs implementation as rigid antitheses is what I mean to critique here. The conversations about TDD tend to sound to me like microdosing waterfall.

              Which part you begin with is unimportant so long as in practice they are interleaved together during the development process and mutually influence each other. I don’t consider tests and implementation as distinct from one another at all.

              I think we broadly agree on the fact that development has not concluded, nor even really gotten underway, if no tests have been produced.

              1. 2

                I have to admit I am struggling a bit with what you are trying to say.

                The interleaving of test/code/test/code is a fundamental feature of TDD and test-first development. However, that does not mean it doesn’t matter where you start, for the various reasons I have described.

                Do you mean writing a whole bunch of tests and then commencing to code? Yeah, that’s a bad idea. In fact, my unit test framework usually places the tests in the class under test (on the class side).

                MPWTest: Reducing Test Friction by Going Beyond the xUnit Model

                MPWTest Only Tests Frameworks

                1. 2

                  Perhaps we are talking about different things, so I will clarify: every time TDD comes up, I see time after time people saying that you must adhere to a specific timeline of

                  1. write test for thing that doesn’t exist
                  2. see test failure
                  3. create thing
                  4. test pass
                  5. write test for interface that doesn’t exist
                  6. see test failure etc

                  There are likely nuances between statically compiled languages and dynamic, but this sort of rigidity in process I find to be unnecessary and even counterproductive. I frequently do quick prototyping of interfaces outside the test framework before moving to that when I’m not clear on what I want yet, and this is a perfectly valid approach to writing testable code.

                  Looking at your actual process:

                  1. add name of test to +testSelectors,
                  2. hit build to ensure tests are red,
                  3. while Xcode builds, add empty test method,
                  4. hit build again to ensure tests are now green,
                  5. either add an actual EXPECT() for the test,
                  6. or an EXPECTTRUE(false,@“implemented”) as placeholder

                  In a language that lacks a REPL you are going to need to get feedback from different places. My process is much more LISP/Smalltalk alike, but that is not always possible.

                  I think our disagreement probably comes from language features more than anything, so it’s important to bear in mind whether you’re coming to TDD from a statically-compiled or dynamic perspective.

                  1. 1

                    OK, I understand much better now. I think you’re viewing the TDD process a bit rigidly, and, somewhat ironically, with a much harsher distinction between testing and tinkering than there needs to be.

                    I am actually a Smalltalker; for example, you can still see a good chunk of my code in Squeak. My Objective-S language leans heavily on Smalltalk. It was in fact initially called Objective-Smalltalk and the domain still is objective.st. So explorative programming is very near and dear to my heart.

                    Over time, I’ve found that I can just as readily do the tinkering I used to do in a Workspace or a REPL in a test. With the benefit that the information I gather during tinkering is not lost, but is captured and preserved in that test. Of course I can still discard that test later if I feel it is no longer useful, but initially it is there and all that information accumulates.

                    In fact, I found a technique I call “test only development”. In TOD, the “write the production code” step of normal TDD is removed. Instead, you write a test and it just includes the production code. Then you factor out the production code from your test. Since extracting a method is a behavior-preserving and mostly mechanical transformation, it makes the whole coding process even easier. And it works particularly well for integration tests that run against some sort of external API.
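                    Roughly like this sketch, if I understand the idea correctly (Python; the version parser is an invented example): the first version inlines the production code in the test, and extracting it is the mechanical step.

                    ```python
                    # Step 1: the "test" contains the production code inline.
                    def test_parse_version_inline():
                        raw = "v2.13.4"
                        major, minor, patch = (int(p) for p in raw.lstrip("v").split("."))
                        assert (major, minor, patch) == (2, 13, 4)

                    # Step 2: extract the production code out of the test (a mostly mechanical,
                    # behavior-preserving refactor); the test now just calls it.
                    def parse_version(raw: str) -> tuple:
                        major, minor, patch = (int(p) for p in raw.lstrip("v").split("."))
                        return (major, minor, patch)

                    def test_parse_version():
                        assert parse_version("v2.13.4") == (2, 13, 4)
                    ```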

                    In the context of Objective-S I am looking into tooling support to make that process even more fluid.

                    1. 1

                      I think you’re viewing the TDD process a bit rigidly, and, somewhat ironically, with a much harsher distinction between testing and tinkering than there needs to be.

                      Well, I disagree with that because I have been arguing all along against rigidity in when and how tests are written, but I think it’s likely we’re just miscommunicating here.

                      Your comments on test-first vs test-after bring to mind a common line of argument on TDD that if you write any production code before the test assertion you’re not doing TDD. That is the viewpoint I mean to push back on, and I believe this was a misunderstanding. You clearly are not making that case.

                      Over time, I’ve found that I can just as readily do the tinkering I used to do in a Workspace or a REPL in a test. With the benefit that the information I gather during tinkering is not lost, but is captured and preserved in that test. Of course I can still discard that test later if I feel it is no longer useful, but initially it is there and all that information accumulates.

                      This is actually very similar to what I have in mind. I left certain details out for clarity, but a lot of my prototyping blurs the boundaries between tests and REPL by leveraging a REPL inside my test environment. I see them both as complementary and overlapping processes.

        2. 1

          Hmm…not sure what you feel differently about. For example, the process you describe for dealing with bugs is clearly test-first: in step 1, you add the test case, in step 2 you write the code to fix the bug.

          Fixing holes in your test suite is an additional step.

          I’m still not a fan of TDD (which by the way wouldn’t have been nearly enough for Monocypher), and I still don’t write my tests first.

          Well, you do for regressions, so I am not sure what your claim is here, and why you feel that a process of adding the test first is somehow insufficient, whereas adding the test after somehow makes it sufficient.

          But I know deep in my guts that I cannot reasonably ship code I haven’t written an automated test suite for.

          I feel the same. And I also know not just in my gut, but also from the experience of countless examples, that when I write the code first, I fall into all sorts of traps, and coding also becomes a lot less fun. Test after is tedious, error-prone and anxiety-inducing.

          Note how the 4 stages you describe are: “1. Oh my god, 2a Phew. 2b. Phew. 3, Done”. Mostly not very happy places, and I know exactly how that feels. My stages are “Yay, yay, yay and yay”. And that is how that feels.

          Some of the traps are:

          1. Focusing on the solution before you’ve fully understood the problem
          2. Therefore mixing up problem-finding and solution-finding
          3. Making both quite a bit harder (and usually trying to do it all in your head)
          4. Creating a solution that is too general
          5. And thus often has behavior that is outside the range of the tests
          6. Creating tests that test the solution you came up with, rather than describing the problem to be solved
          7. Therefore often replicating flaws in your thinking and the solution code in the testing code …

          As I wrote elsewhere: Any testing is better than no testing, and if that’s how you personally get there, more power to you. But you are making life much harder and much less pleasant on yourself.

          1. 1

            Okay, I think we understand each other pretty well. Seems we disagree only about details.

            Note how the 4 stages you describe are: “1. Oh my god, 2a Phew. 2b. Phew. 3, Done”.

            Yeah, that was an exaggeration. It’s how I would feel if I were forced to ship code at step 1, and could only squeeze some tests in. If I know they won’t ship my code until I say it’s done, then each and every step feels like progress, which feels pretty good.

            I understand the advantages of using tests as a design tool, but it doesn’t apply perfectly, and sometimes not at all. My last bout of work for instance was implementing a legacy API. There was nothing to design, so tests couldn’t help me there. More generally, test code is not user code, so the best possible API for users is likely to be slightly suboptimal for tests — and vice versa. Plus, I always keep user code in mind when I design an API, sometimes even write it first.

            The only time this personal preference of mine actually did make my life harder was when I was working with a dynamically typed language (Lua, definitely not the worst of them). Putting stuff together only to be told I was adding functions or calling numbers, without being able to spot the source of the error at a glance, was a nightmare. That’s when I understood the power of TDD: it’s a way to be disciplined when the compiler isn’t. I have an alternative that works even better for me though: fuck dynamic typing.

            1. 1

              I think we understand each other pretty well.

              Hmm…doesn’t really look like it to me. Particularly since you not just cut out the bulk of what I wrote, but also ignored it in your reply.

              Yeah, that was an exaggeration.

              The depth of emotion doesn’t really matter that much, what matters is the “sign”. Anxiousness and, at best, relief vs joy, joy.

              I understand the advantages of using tests as a design tool, but it doesn’t apply perfectly, and sometimes not at all.

              While tests certainly are helpful as a design tool as well, though obviously not as the sole design tool or design driver, those weren’t the reasons I gave, so not sure why you bring this up as a reply to what I wrote.

              Having implemented legacy APIs and clients for legacy APIs at various points, I have found TDD to be extremely helpful there. In fact, for clients I have used “test only” development, as you can write the code you want by refactoring some integration/characterization tests. Very cool and safe.

              dynamically typed language

              Yes, it is true that good testing practices, and in particular TDD, completely eliminate most of the purported advantages of static type checking. As testing tests values, it implicitly tests types as well, and since most type errors are fairly shallow, that just takes care of them. The converse is not true, as types do not generally check values. As a trivial example, add( int, int ) -> int and sub( int, int ) -> int have the same type signatures, but ever so slightly different semantics as to what is supposed to happen to the values.
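              To spell out the add/sub point with a tiny sketch (Python with type hints, as a neutral illustration): the annotations are identical, and only a value-level check tells the two apart.

              ```python
              def add(a: int, b: int) -> int:
                  return a + b

              def sub(a: int, b: int) -> int:
                  return a - b

              # Both bodies satisfy the same (int, int) -> int signature; a type checker
              # would happily accept either body under either name. Only checking values
              # distinguishes them.
              def test_add_and_sub_values():
                  assert add(2, 3) == 5
                  assert sub(2, 3) == -1
              ```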

              My practical experience matches this. In fact, my first greenfield TDD project was in a statically typed language, the static types were not really of much help and certainly did not make the TDD approach superfluous. In fact, in one large refactor, the type-system was happy after 1 day of work, the tests after 3 days.

              TDD: it’s a way to be disciplined when the compiler isn’t.

              If you don’t know better, you can probably abuse it that way.

              fuck dynamic typing.

              Hmm…let me quote myself:

              What I find interesting is not so much whether one or the other is right/wronger/better/whatever, but rather the disparity between the vehemence of the rhetoric, at least on one side of the debate (“revoke degrees!”, “can’t go wrong!”) and both the complete lack of empirical evidence for (there is some against) and the lack of magnitude of the effect.

              The Safyness of Static Typing

              1. 2

                I think we understand each other pretty well.

                Hmm…doesn’t really look like it to me. Particularly since you not just cut out the bulk of what I wrote, but also ignored it in your reply.

                I generally ignore the parts I agree with. In particular when you noted that I do test first for regressions, and your 7 traps (I’m especially aware of the 7th).

                Yes, it is true that good testing practices, and in particular TDD, completely eliminate most of the purported advantages of static type checking.

                I wouldn’t put it this way: dynamic typing requires more tests in practice than static typing, and the feedback we get from failed tests comes later than the feedback we get from types. This difference even trumps the presence or absence of a REPL. That later feedback in particular is critical: with a static type system, type errors are not only spotted immediately, they’re reported right next to the cause of the error. This isn’t always true of tests with dynamic typing, unless they’re very fine-grained.

                In my experience, dynamic typing requires more testing effort from me, and its edit/run/debug cycle is often longer than the edit/compile/fix cycle of (good) static typing, thus slowing down even my exploratory programming! On the other hand, I maybe twice in my career needed to write something even a good type system couldn’t quite express.

                You’d understand, given such a one-sided win for static typing, why I chose to make up my mind and discount dynamic typing for the rest of my life. I’ve spent enough years comparing the two to reach a reliable conclusion: dynamic typing is not for me, and never will be.

                As a trivial examples. add( int, int ) -> int and sub( int, int ) -> int have the same type signatures, but ever so slightly different semantics as to what is supposed to happen to the values.

                That’s true of any signature that allows for more than one underlying implementation. Many signatures though are much more restrictive than this. Take map<a, b>(list<a>, (a)->b) -> list<b> for instance: the type system basically guarantees that map(l, f) is only made up of elements j=f(i), where i is an element of l. It doesn’t guarantee order, that no element is missing, or that no element is duplicated. But it does guarantee a lot, and you don’t need nearly as much testing to make sure it works as you would have needed in a dynamically typed setting. Thinking about it for about 5 seconds, I believe it would be enough in practice to just test that map(l, identity) = l for all lists of ordered integers of size 0 to 10.
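                Something like this sketch of that single property (Python with the hypothesis library, standing in for the statically typed setting):

                ```python
                from hypothesis import given, strategies as st

                @given(st.lists(st.integers(), min_size=0, max_size=10))
                def test_map_identity(l):
                    # With a parametric signature, this one property covers much of what the
                    # types don't already guarantee: order preserved, nothing dropped or duplicated.
                    assert list(map(lambda x: x, l)) == l
                ```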

                My practical experience matches this. In fact, my first greenfield TDD project was in a statically typed language, the static types were not really of much help and certainly did not make the TDD approach superfluous. In fact, in one large refactor, the type-system was happy after 1 day of work, the tests after 3 days.

                This would match my experience as well, though I would draw different conclusions. I bet the reason your type system didn’t help that much is because you wrote enough redundant tests the type system didn’t need in the first place. I would have skipped those tests.

                The Safyness of Static Typing

                Funnily enough, safety is way behind speed of development when I compare the two typing disciplines. Would OCaml be safer than Lua for instance? I suspect not by much: they’re both memory safe, and most, if not all, type errors can be caught with a reasonable test suite. Speed wise though, nothing beats the crazy tight feedback loop of a type system — and I’ve ignored runtime speed so far.


                Dynamic typing does have one clear advantage over static typing though: it’s easier to implement.

                1. 1

                  First things first:

                  I bet the reason your type system didn’t help that much is because you wrote enough redundant tests the type system didn’t need in the first place. I would have skipped those tests.

                  You’d lose that bet every time, because your analysis of that example is contradicted by the evidence of the example itself. Maybe you skipped over that part, or you didn’t grasp its significance?

                  If, as you seem to believe, a large part of the tests were redundant because they weren’t needed, then what would have happened is not what did happen. If what you think is true, the tests would have been green very soon after the type-checker let the code compile. But that was exactly what did not happen. The type checker was happy after a fairly short time (1 day, as I wrote, is more of an upper bound). The tests only turned green after two more days, so much longer than the types.

                  Therefore the tests, very obviously, tested a LOT of things that were not covered by the type-checker. They were not redundant. Because otherwise the tests would have been green very soon after the type-checker let the code compile. They were not.

                  And as I wrote before, the opposite of what you believe is true: when you have tests, most of the type-checks are superfluous, because you absolutely need to check values, and values implicitly and without any extra effort also check types. Not the other way around.

                  If you have checked that the value is 3, it is also clearly not of type PinkElephant. And just because you’ve checked the type does not mean that the value is correct.

                  In my experience, dynamic typing requires more testing effort from me

                  Not my experience, for the reason I just gave. Of course, some people try to duplicate the type-checker-style guarantees in the tests, because they want to feel saf-y. This turns out to not be necessary.

                  and [dynamic typing’s] edit/run/debug cycle is often longer than the edit/compile/fix cycle of (good) static typing,

                  That experience is very unique to you. It is contradicted not just by my personal experience, but also by common sense and every study that ever looked at this. Compile times for languages with sophisticated type systems have gone through the roof, Scala is horrendous, Rust troublesome, Swift ridiculous.

                  So color me skeptical about this particular anecdotal report, particularly considering the, er, even-handedness you have revealed here so far [expletive deleted]. Although I can probably think of ways of getting to a result that is so far outside the norm, I’d really, really have to work at it.

                  1. 2

                    If what you think is true, the tests would have been green very soon after the type-checker let the code compile.

                    The tests that took 3 days to satisfy were definitely not redundant. But how many tests did you have that miraculously turned green right after the type checker was satisfied? And even assuming almost all tests took longer than a day to satisfy, would it have still taken you no more than 3 days to satisfy them all if you didn’t have the type checker to help you?

                    and [dynamic typing’s] edit/run/debug cycle is often longer than the edit/compile/fix cycle of (good) static typing,

                    That experience is very unique to you.

                    It’s not. There’s a bunch of us out there, and some have confirmed that their exploratory programming too is sped up by static typing. Not because compiling is faster than executing a test suite, but because compile-time errors are often much easier to fix than runtime errors, and static typing gives you more compile-time errors.

                    It is contradicted not just by my personal experience,

                    Obviously.

                    but also by common sense

                    Not sure what “common sense” you’re referring to. I only know we don’t have the same.

                    and every study that ever looked at this.

                    That is one strong claim. Could you cite 3? Or a relevant survey/meta-analysis?

                    Compile times for languages with sophisticated type systems have gone through the roof, Scala is horrendous, Rust troublesome, Swift ridiculous.

                    I never used any of those. My reference would be the OCaml bytecode compiler, which on the small projects I worked with was basically instantaneous. And yes, anything slower than that is a problem. Sometimes I’m bothered even by how much time it takes to compile 2K lines of C.


                    It seems to me the divide between static and dynamic advocates is strong enough to hinder communication. Personally I can’t even conceive of a way of thinking that would lead to dynamic typing having significant advantages over static typing, or even being on par. I know many other people prefer dynamic typing, I just don’t know why. I’m also pretty sure the incomprehension is mutual.

                    It would be nice to resolve this, but I have given up hope at this point.

    5. 4

      This is a post about time pressure, not testing.

    6. 3

      Unfortunate title. The “bad for” in this title refers to the fact that the incentives to create tests sometimes get broken by management expecting too-early delivery and rewarding people for fixing bugs in production which could’ve been entirely prevented by those tests. A more accurate but less eye-catching title would’ve been something like “Rewarding production fixes rather than testing thoroughly creates perverse incentives.”

    7. 3

      Not strictly test related but this reminded me of speaking to a chap at Red Hat, about 20 years ago, who said he struggled to meet feature freeze deadlines. His workaround was to ship known buggy code and immediately raise release-blocking issues against it.

    8. 3

      Tests validate design choices. If you are testing a feature and find that you have to do a lot of boilerplate for each case, this is telling you that there is some abstraction of your business logic you can extract from your code to simplify the system overall. The article completely ignores this aspect of tests, which IMO is as important as catching regressions.

      [Tests] can only detect bugs. Tests can only tell developers they made a mistake. There is no gain at that moment.

      Hard disagree. This is extremely valuable. There is a cost, measured in other people’s time and the overall quality of the product, to shipping bugs. But that cost needs to be communicated to any product people advocating for moving fast and breaking things.

      But when found with a test, the bug goes against the sprint commitment.

      This is the other side of the coin. It has nothing to do with tests per se, because you could say this about any good practice (whatever you think that means). A hopefully uncontroversial example: doing software design. Let’s say this means taking an hour or two at the start of a sprint (or whatever) to think through what you’re going to write before you write it. If there is a lot of time pressure to deliver the feature, it’s plausible, even likely, you will be asked to justify that time. That’s not because design work is somehow bad for developers. It’s because it’s your job to communicate the value of a robust process.

    9. 3

      They [tests] can only detect bugs. Tests can only tell developers they made a mistake. There is no gain at that moment.

      This is false. If you write tests first, they can tell you what your code is missing and, assuming you are testing in the right way, when you write code, the test tells you that your code does what you expect. Tests are a way to run your code in your dev environment and know that it works. If you don’t have tests you are either clicking around to check (this is slower than tests) or you just aren’t checking that your code works (which is somewhat negligent).

      1. 3

        Yeah, red-green-refactor TDD is a great way to get value from tests immediately. Once the test goes green the bulk of the work is done. Like, actually done, not the “done” where you end up going back to the same code eight more times in the product’s lifetime.

        1. 5

          Using TDD makes it no more likely that you’re not revisiting code. It merely ensures that the code that you write is covered by a test - nothing more. It’s a great tool to have available, particularly for things like bug fixes or when certainty is already high when implementing functionality, but let’s not pretend that TDD somehow solves every problem.

          1. 2

            Using TDD makes it no more likely that you’re not revisiting code.

            My experience says the exact opposite. If you disagree, then I don’t have any evidence other than my own experience, which understandably isn’t going to convince anyone at face value. It’s pointless to continue this discussion.

            let’s not pretend that TDD somehow solves every problem

            That’s a straw man, and you know it. I never said that, I wouldn’t say that, but TDD has solved a class of problems which I used to think were simply how development was done, and which I almost never have to deal with anymore.

            1. 2

              You’re right, I shouldn’t have said every problem - you didn’t imply that. Mea culpa. Guess that’s what I get for writing a comment at 2am when I’m up with a newborn.

              1. 2

                No worries, and genuine thanks for your measured response - too often in online discussions people escalate and double down.

                1. 3

                  Couldn’t agree with you more. Have to be the change that we want to see!

    10. 2

      When a bug is found in production, unless there are too many, it’s often blamed on QA

      Another good reason not to have a separate QA step.

      They are blamed for missed deadlines (Newspeak “Sprint and Goals”).

      I think this is the most toxic part of the development cycle at the moment but it also comes from developers doing a shit job at communicating what it is they are doing and why it’s necessary. In teams where there’s strong alignment between leadership and team members, I’ve never seen this to be an issue. People can talk with each other and come up with what is the best approach.

    11. 2

      you have to test

      you don’t have to write unit tests or automatic integration tests, but you do have to test. whether that means you’re running things manually and inspecting the output or teaching the computer how to do that for you. now, me, personally? I’m lazy and I’d rather teach the computer how to do my work for me. without fail every single time I have thought to myself “oh this isn’t worth writing the tests for, it’s just running this one command and then checking the output,” my velocity has gone up considerably when I finally bite the bullet, step back for a few minutes and write some tests