1. 15
    1. 11

      I’m a little curious how this strikes others who work at this scale, or who use the ~contract/acceptance patterns the post describes?

      When I was reading it, I felt a little bit like E2E was getting unfairly tied to more general problems like “slow test suites create friction”, and “intrinsically flaky tests waste time and create uncertainty”?

      I clicked over to the Pact docs to see if that helped but I only ended up even more skeptical because the introduction promptly attacked what seems like a ~strawman to me:

      Do you set your house on fire to test your smoke alarm? No, you test the contract it holds with your ears by using the testing button. Pact provides that testing button for your code, allowing you to safely confirm that your applications will work together without having to deploy the world first.

      Like, huh? I test my fire alarm to make sure it works after I replace the battery. I guess it’s literally a kind of smoke test. I might do the same if I had been gone for a really long time, or if I lightly burned some pizza and noticed it didn’t go off. If the button test made one of my least-favorite noises and then still didn’t go off a few weeks later when something spills in the oven and smokes the kitchen up, you can bet I will indeed go light something on fire to test it out. Not the house, of course.

      Further, there’s a world of difference between how I might test my smoke alarm, and how someone who makes smoke alarms should test them. I do, in fact, want them to at least closely approximate lighting something on fire to make sure it works as they intend.

      1. 3

        My takeaway from this article is that E2E testing will take increasingly longer as your application grows (no surprise there), and that some people are not aware of, or have forgotten about, Design by Contract (Bertrand Meyer, Eiffel’s author). The approach outlined in the article seems quite similar: express constraints on the inputs, on the outputs, and on the state, which together define the operational contract for the service.
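
        For anyone who hasn't seen Design by Contract, here is a rough sketch of the idea in Python, using plain `assert` statements in place of Eiffel's built-in `require`/`ensure`/`invariant` clauses. The `Account` class and its methods are made up purely for illustration.

```python
class Account:
    """Hypothetical example class; not taken from the article."""

    def __init__(self, balance: int = 0) -> None:
        self.balance = balance
        self._check_invariant()

    def _check_invariant(self) -> None:
        # Constraint on the state: must hold before and after every public call.
        assert self.balance >= 0, "invariant: balance is never negative"

    def withdraw(self, amount: int) -> int:
        # Preconditions: constraints on the input, the caller's obligation.
        assert amount > 0, "precondition: amount must be positive"
        assert amount <= self.balance, "precondition: insufficient funds"

        old_balance = self.balance
        self.balance -= amount

        # Postcondition: constraint on the output, the method's obligation.
        assert self.balance == old_balance - amount, "postcondition: balance reduced by amount"
        self._check_invariant()
        return self.balance
```

        Note that Python drops `assert` statements when run with `-O`, so this only approximates what a language with native contracts gives you.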

        1. 8

          Design by Contract is a different kind of contract from the one meant here. Contract testing is more of a technique for breaking integration tests into distinct unit tests and keeping them in sync with each other.

      2. 2

        It’s also worth noting that if you get a smoke detector professionally tested then the person comes along with a little cone that goes over the smoke detector and sprays it with a very small amount of smoke. It literally tests the end-to-end operation of the smoke detector.

    2. 5

      I think the thing which end-to-end tests cover is whether the assembly of correctly functioning components gives a correctly functioning system. That is, it is integration testing, not unit testing. It is testing that the assembly works, not just the components.

    3. 3

      I agree with the article that e2e tests can be a huge drag on productivity to the point that the time and effort wasted due to them can easily cause more bugs to be pushed to production. I still leave a warm spot for them, because a flexible e2e test setup can be a very valuable thing to have not only for testing but also for experimentation, but I think the main lesson here is to avoid e2e tests unless you’re willing to put the continuous effort to properly maintain them to be fast, reliable and easy to diagnose.

      BTW I’m willing to believe that at the 1k engineer scale it may very well be impossible to maintain those properties, never seen e2e at that scale myself. At some point one probably needs to see the overall system as a collection of interacting but independent systems, so maybe e2e might still be maintainable for some of those subsystems.

    4. 3

      I found that having loads of E2E tests often doesn’t add all that much; usually they’re all doing the exact same thing test after test after test, and since these parts tend to be fairly isolated there isn’t all that much that can go wrong in only that specific test. Either it works for everything, or it fails for everything.

      The way I’ve always viewed E2E tests is as “testing everything at the top layers” such as middleware, HTTP handling, and whatnot, which you can usually do with just a few (or sometimes even one) test. Other less high-level integration tests can test all the rest, and they tend to run much faster as they avoid a lot of overhead, and are a lot easier to write and reason about, especially if tests fail since you actually have some access to the system. Plus, it’s much easier to get code coverage (which I find useful to inspect which code is being run, not because I care about the metric).

      I once rewrote an E2E test suite to use integration tests which gave a massive speed-up, and because the tests were a lot easier to work with people actually started writing them. I added a few E2E tests (IIRC logging in, viewing the dashboard, logging out) and that was enough really.

      1. 1

        Agreed - you want a very few E2E tests that verify the basic core functionality works at all. “You can view pages and log in.” Even logging out seems like too much. Authoring content? Meh, an outage will be bad but bearable.

        1. 2

          Logging out seems useful because it’s a very “special” action, same as logging in. In general if I’d add E2E tests at all then I’d test every type of request, including authoring content by the way, or something like uploading files if that’s something your app has, as all of them do actually test something useful and have different code paths.

          If you’re going to do E2E testing at all then it doesn’t really matter if you have 2, 6, or 10 of them; all are manageable numbers (as opposed to hundreds or thousands of them).

          1. 2

            Uploading files is a good one.

            The point of E2E tests is to make sure you didn’t do something to the framework that breaks all functionality. As an example bug, a Rails middleware that conditionally hijacks requests has its condition inverted. Or the JS minifier decided to output a file containing “error. Blah blah” instead of actual JavaScript.

            File upload paths are finicky enough that they’d be easy to break in a way only an E2E test could reasonably catch.

            The number of assertions in the test doesn’t really matter; what matters is runtime. And every request, every back and forth with the server, adds runtime to this commit-approval-blocking test.

    5. 3

      (I wonder how many good companies that Accelerate book is going to kill before engineering managers move on to the next shiny object.)

      More on-topic … this article seems to set up a false dichotomy between E2E tests and unit (or component) tests. Integration tests which exercise the whole system can be fast and not flaky if you replace external service dependencies like queues, HTTP transport, etc. with synchronous, in-process components.
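
      To make that concrete, here is a minimal sketch of what I mean, assuming a system where two services talk over a message queue. All the names here (`InProcessQueue`, `OrderService`, `BillingService`, the `orders.placed` topic) are invented for the example; the in-process queue would stand in for whatever broker you actually run in production.

```python
from typing import Callable, Dict, List


class InProcessQueue:
    """Hypothetical stand-in for an external message broker."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[dict], None]]] = {}

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._handlers.setdefault(topic, []).append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # Deliver synchronously in the caller's thread: no network, no polling,
        # no sleeps, so the test is deterministic.
        for handler in self._handlers.get(topic, []):
            handler(message)


class OrderService:
    def __init__(self, queue: InProcessQueue) -> None:
        self._queue = queue

    def place_order(self, order_id: str, amount: int) -> None:
        self._queue.publish("orders.placed", {"order_id": order_id, "amount": amount})


class BillingService:
    def __init__(self, queue: InProcessQueue) -> None:
        self.invoices: Dict[str, int] = {}
        queue.subscribe("orders.placed", self._on_order_placed)

    def _on_order_placed(self, message: dict) -> None:
        self.invoices[message["order_id"]] = message["amount"]


def run_scenario() -> Dict[str, int]:
    # Wire the whole (toy) system together exactly as production would,
    # except for the transport.
    queue = InProcessQueue()
    billing = BillingService(queue)
    orders = OrderService(queue)
    orders.place_order("o-1", 250)
    return billing.invoices
```

      The test exercises both services and their wiring, but by the time `place_order` returns the invoice already exists, so there is nothing to wait on and nothing to flake. What it doesn't cover is the real transport, which is where a handful of true E2E tests would still earn their keep.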

    6. 2

      In our analysis, we figured out that the most frequent category of bugs caught by End-to-End tests was schema violations

      Is a schema violation something more subtle than “this was supposed to have a foo field but it doesn’t”? Or, “it was supposed to be a string but it’s null”?

      How was that the common bug caught by end to end testing? Did they completely not have unit tests before? This reads like a straw man dreamed up by some kind of static typing zealot.

      (I say this as a static typing zealot.)

      1. 3

        How was that the common bug caught by end to end testing? Did they completely not have unit tests before?

        I think here it’s because the data is coming from a different service, so the E2E test fails if one service changes its data format and the client doesn’t. A unit test (or static typing) wouldn’t cover that because it’s different codebases.
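
        A toy version of that failure, and of the kind of check a contract test adds, might look like the sketch below. This is not Pact's API, just an illustration of the idea; `CLIENT_EXPECTS`, `provider_v1`, `provider_v2` and `contract_violations` are all hypothetical names.

```python
from typing import Any, Dict, List

# Hypothetical expectations published by the client team:
# field name -> type the client relies on.
CLIENT_EXPECTS: Dict[str, type] = {"id": int, "email": str}


def provider_v1() -> Dict[str, Any]:
    # Response shape the client was originally written against.
    return {"id": 7, "email": "a@example.com", "name": "A"}


def provider_v2() -> Dict[str, Any]:
    # The provider team renames a field. Their own unit tests are updated
    # alongside the change, so they still pass.
    return {"id": 7, "email_address": "a@example.com", "name": "A"}


def contract_violations(expected: Dict[str, type], response: Dict[str, Any]) -> List[str]:
    """Return the ways a provider response breaks the client's expectations."""
    problems: List[str] = []
    for field, expected_type in expected.items():
        if field not in response:
            problems.append(f"missing field: {field}")
        elif not isinstance(response[field], expected_type):
            problems.append(f"wrong type for field: {field}")
    # Extra fields the client does not use are deliberately ignored.
    return problems
```

        Run in the provider's build, `contract_violations(CLIENT_EXPECTS, provider_v2())` reports the missing `email` field without either side having to deploy anything.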

        1. 1

          Is it common practice to change your end points in place?

          What I’ve always seen done is that each end point either never changes or only changes in backwards compatible ways. Then the server and client can independently use unit tests or types to check that they satisfy or understand the schema of that end point.

          Actually for monoglot code bases I’ve mostly seen services provide a client-side library that handles the versioning internally and exposes an unversioned API to the client linking it.

      2. 1

        It might be something like “it’s supposed to be a valid NANP phone number, but it’s not” (experienced that one at work).

    7. 1

      The problem I often see with contract-style tests is that they’ll capture that you’re outputting something like a date range, but they won’t always think to test if, for example, that range is inclusive/exclusive. It doesn’t always capture semantics like an end-to-end test would. Someone has to think to test them deliberately.
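
      A small sketch of the date range case, with made-up names throughout: the shape check is the sort of thing a contract test typically asserts, and it passes no matter which interpretation of `end` each side has in mind.

```python
from datetime import date
from typing import Any, Dict


def has_date_range_shape(payload: Dict[str, Any]) -> bool:
    # Structural check only: both fields exist and are dates.
    return isinstance(payload.get("start"), date) and isinstance(payload.get("end"), date)


def days_if_end_inclusive(payload: Dict[str, Any]) -> int:
    # One side assumes the end date is part of the range.
    return (payload["end"] - payload["start"]).days + 1


def days_if_end_exclusive(payload: Dict[str, Any]) -> int:
    # The other side assumes the range stops just before the end date.
    return (payload["end"] - payload["start"]).days


# Hypothetical payload exchanged between two services.
booking = {"start": date(2024, 3, 1), "end": date(2024, 3, 3)}
```

      Here `has_date_range_shape(booking)` is true, yet the two readings disagree on whether the range covers 3 days or 2. Unless someone writes an example that pins down the boundary, the contract stays green while the behaviour is off by one.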

      I think the way to get good quality end-to-end tests is that you don’t build your system and then blindly add end-to-end tests, you (try to) design the system so that it can be reliably and quickly end-to-end tested from the start.

      This probably excludes most micro-service designs (sorry… pet peeve!).