1. 50
  1. 11

    The YouTube 32-bit overflow wasn’t a cold path. There was no separate code path for numbers that didn’t fit in 32 bits that the Gangnam Style video triggered for the first time. There were no edge cases or fallbacks that could have been avoided, and there was no third party to which ‘test capacity’ could have been offloaded.
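
    A minimal sketch of that point, assuming a hypothetical signed 32-bit counter (the names are made up and this is not YouTube’s actual code; Python integers don’t overflow, so the wrap is emulated):

    ```python
    INT32_MAX = 2**31 - 1


    def to_int32(value: int) -> int:
        """Wrap an unbounded Python int to a signed 32-bit value (two's complement)."""
        value &= 0xFFFFFFFF
        return value - 2**32 if value >= 2**31 else value


    def add_view(count: int) -> int:
        """Hypothetical view counter: one code path, taken on every single view."""
        return to_int32(count + 1)


    # The increment that wraps runs exactly the same code as all the earlier ones.
    wrapped = add_view(INT32_MAX)
    ```

    Every view, including the one that wraps, goes through the same `add_view` path, so there is no rarely executed branch to point at.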

    The same goes for the TLS example: no untested code path was involved. At most, a third party (for example, Let’s Encrypt) could be used to automatically renew or replace the certificate when the time comes.

    1. 4

      Good points, I’ll remove them. I had other plans for the direction of the post, but I had to pare it down to keep the focus and I missed this. There are plenty of other examples to choose from.

    2. 10

      Some additional thoughts. Feel free to paste these into your article if you find them worthy :)

      Code coverage analysis is a good way to identify cold paths during testing, which in turn tells you what cases you aren’t testing and need to write tests for. As an extreme example: SQLite claims to have nearly 100% branch coverage in its test suite.
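
      A rough sketch of how coverage exposes a cold path, using only the standard library’s `sys.settrace`. The function `parse_port` and the helper `lines_executed` are hypothetical; a real project would more likely use a dedicated tool such as coverage.py.

      ```python
      import sys


      def parse_port(text):
          """Hypothetical function under test, with one rarely taken branch."""
          port = int(text)
          if port > 65535:
              raise ValueError("port out of range")
          return port


      def lines_executed(func, *args):
          """Run func(*args); return executed line offsets relative to its def line."""
          code = func.__code__
          seen = set()

          def tracer(frame, event, arg):
              if frame.f_code is not code:
                  return None  # do not trace any other function
              if event == "line":
                  seen.add(frame.f_lineno - code.co_firstlineno)
              return tracer

          sys.settrace(tracer)
          try:
              func(*args)
          finally:
              sys.settrace(None)
          return seen


      # Offsets of the executable statements in parse_port (offset 4 is the raise).
      BODY_LINES = {2, 3, 4, 5}

      # A "happy path only" test run never reaches the raise statement.
      covered = lines_executed(parse_port, "8080")
      never_ran = BODY_LINES - covered
      ```

      Whatever ends up in `never_ran` is a cold path that the tests have not exercised yet.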

      Error handling is a common source of cold paths, since errors happen infrequently and can be difficult to trigger during tests. Worse, errors tend to cascade, so an error in a low-level component can trigger multiple cold paths in succession. (This is one reason I prefer exceptions or Swift-style error results over manual error checking: the more of the error handling that’s automatic, the less there is to screw up.)
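
      A small sketch of the difference, with hypothetical function names. In the manual style the caller forgets one check and the failure leaks out silently; in the exception style the failure propagates with no caller-side code at all.

      ```python
      def read_pair_manual(raw):
          """Manual style: returns (value, error); every caller must check error."""
          if "=" not in raw:
              return None, "missing '='"
          key, value = raw.split("=", 1)
          return (key, value), None


      def load_manual(raw):
          pair, err = read_pair_manual(raw)
          # Bug on the cold path: err is never checked, so a failure returns None.
          return pair


      def read_pair_exc(raw):
          """Exception style: the failure propagates automatically."""
          if "=" not in raw:
              raise ValueError("missing '='")
          key, value = raw.split("=", 1)
          return key, value


      def load_exc(raw):
          return read_pair_exc(raw)  # no explicit error handling to forget
      ```

      Both loaders behave the same on valid input such as `"a=1"`; they only differ on the rarely exercised error path.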

      The really difficult thing, I’ve found, is paths that are cold only in combination. Path A might be common, and path B elsewhere might be common, but it’s rare to get A followed by B and that’s what triggers the bug. This gets really bad in concurrent or async code, where the order things happen in is nondeterministic. So maybe B followed by A works, and happens in tests, but one time in 10,000 A comes first…
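
      One way to hunt for such combinations, sketched under the assumption that the operations can be replayed on a fresh object: try every ordering instead of only the usual one. `Buffer` and `failing_orders` are hypothetical, and exhaustive enumeration only scales to a handful of steps.

      ```python
      from itertools import permutations


      class Buffer:
          """Hypothetical component whose operations are each common on their own."""

          def __init__(self):
              self.items = ["x"]
              self.closed = False

          def flush(self):
              if self.closed:
                  raise RuntimeError("flush after close")
              self.items.clear()

          def close(self):
              self.closed = True


      def failing_orders(op_names):
          """Replay every ordering of op_names on a fresh Buffer; collect failures."""
          failures = []
          for order in permutations(op_names):
              buf = Buffer()
              try:
                  for name in order:
                      getattr(buf, name)()
              except RuntimeError:
                  failures.append(order)
          return failures


      # flush-then-close is the usual order; only the reverse order fails.
      bad = failing_orders(["flush", "close"])
      ```

      For truly concurrent code the orderings are not this easy to enumerate, so this only illustrates the idea.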

      1. 2

        I like these kinds of blog posts. As an engineer (and a new one in the backend world), it can be difficult to weigh one solution to a problem up against another. This is another, very concrete, tool to do that evaluation that I will add to my toolbox. Thank you.

        Also, welcome back to blogging Mr Kellogh :)

        1. 3

          Thanks! Yeah, crazy how all the posts stopped flowing when my first kid was born. I used to write up any idea that popped into my head; now I have to stay interested for an entire week or it won’t get completed.

        2. 2

          > For a garbage collector, this means things like offering fewer options, or having a simpler model to avoid cold paths around promoting objects between generations.

          Don’t you need those options because different applications have different use-cases, which necessitate options for performance tuning?

          > If system 2 is more reliable than system 1, then why don’t we always choose system 2?

          Isn’t this because, in general, system 1 (that is, the less general case) was chosen for a specific reason, like performance? As far as I know, it’s a common pattern to “optimize for the common case”, including making the common case fast. Then, if the common case doesn’t apply, you drop down to the slower (but more general and robust) fallback.
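
          A minimal sketch of that pattern, with hypothetical function names. That the ASCII branch is actually faster is an assumption here, not something measured.

          ```python
          def equal_ignore_case(a: str, b: str) -> bool:
              """Case-insensitive comparison with a fast path for the common case."""
              if a.isascii() and b.isascii():
                  # Fast path: simple lowercasing, taken by most inputs.
                  return a.lower() == b.lower()
              # General fallback: full Unicode case folding. It runs rarely,
              # which is exactly what makes it the cold path.
              return a.casefold() == b.casefold()


          def equal_ignore_case_simple(a: str, b: str) -> bool:
              """Same result with a single path: nothing cold, but no fast path."""
              return a.casefold() == b.casefold()
          ```

          The second function is the “always use the general system” choice: every call exercises the same code, at the cost of whatever the fast path was saving.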

          > I’ve found that centering conversations around “avoiding cold paths” gives more clarity on how to proceed.

          I believe that “cold paths” equate to “infrequently used code”, which means features that only a small set of your users (but at least some of them) use. This seems to line up with what simplicity enthusiasts advocate for: reducing “complexity” in your code by (1) cutting out features or (2) making the program slower (by removing optimizations for special cases).

          I think that this also means that there’s no free lunch: given a fixed testing budget and a fixed level of developer discipline and good coding practices, you have to trade off efficiency and features against the amount of complexity, that is, the surface area on which bugs can manifest. (It’s important, though, to differentiate between accidental and essential complexity/cold paths: if you can eliminate the former, you do indeed get a free lunch.)