1. 13

  2. 4

    As an SRE-type-person, I resonate very much with this post. I’m very used to treating software as a system I have to observe and experiment with rather than understanding it in detail.

    A large part of this is the complexity discussed in the post, but another part is just time. While I strongly prefer to understand systems in detail, there’s so much software I only interact with in the context of an outage. Sometimes an outage is the first time I find out a particular software package exists. If the config options and metrics are fairly obvious, it’s sometimes faster to treat software as an experimental system, and save reading the source for the post-mortem follow-ups.

    Admittedly, I trained as an experimental physicist, so in certain ways this type of reasoning is very familiar to me. ;-) Reading the source to troubleshoot a problem with code I didn’t write felt a bit like cheating the first time I did it.

    1. 2

      This is a place formal methods can be crucial. Writing a formal spec dramatically raises the ceiling on how much of the high-level system you can understand and error-check. And “the data pipeline still works even if two nodes go down” is bog-standard stuff to specify.

      1. 3

        Formal methods seem useful enough for the distributed system. For the machine-learning projects, they seem useless, because half the time the whole reason you employ machine learning is because nobody has a formal mathematical model for “this picture is a duck.” They also seem useless for “big balls of mud,” since they are defined by being too-late to make good overarching decisions like that, and browser JavaScript, where software failing to match the supposed “specification” (either because it’s Internet Explorer 11 or because it has an unforeseen browser extension) is the only reason it’s so hard.

        1. 1

          Which place? From your example, I’m guessing you mean distributed systems, but the post covers a lot of ground. Formally specifying machine learning tasks is a horse of another color. Formal specs of web browser behavior? Good luck!

          Either way I think it’s important to distinguish between techniques that are useful at design time vs early implementation vs maintaining a deployed and running system, especially a big hairy one that’s grown organically over many years and many contributors. These can be radically different scenarios.

          1. 1

            Formal specs of web browser behavior? Good luck!

            Some people are proving things as complex as C compilers, so… why not?

        2. 2

          Two claims are being made here:

          1. That many of our systems – distributed cloud systems, big balls of mud, client side JS in the real world – represent wicked problems
          2. That you cannot deal with wicked problems using methods that require complete system understanding.

          Claim 2 is undoubtedly true. Claim 1 seems highly suspect to me: In many cases these seemingly wicked problems are the result of the incompetence of the design. That is, we create complex, impossible-to-understand software solutions that could have been solved more simply, so that the resulting system was comprehensible.

          You may say, “Well, fine, but the complex system exists now, and we need to use these other methods to understand it,” and that’s fair in a way. But it also runs the risk of lending tacit approval to the root mistake that was made, or even suggesting that no mistake was made, that the situation of overwhelming complexity was “inevitable,” in cases where this was certainly not the case.

          1. 1

            As we scale up, I believe it becomes more important to have a system where we can ask targeted questions and have them answered, rather than understand or mentally emulate the system. Ideas that seem nice in the small - readable code that you “reason about” - don’t really scale up. A lot of complexity emerges as smaller systems compose and we’re way past human reasoning ability and our typical methods of dealing with it.