1. 34
  1. 17

    The pipe is one-character glue: “|”.

    I think that this is mistaken. Pipe makes it easy to connect the output of one program to the input of another, but that is not the same as “gluing” them - you have to do text processing to actually extract the fields of data out of the output from one command and convert it into the right format for input to another.

    This is why you see tr, cut, sed, awk, perl, grep, and more throughout any non-trivial shell script (and even in many trivial ones) - because very, very few programs actually speak “plain text” (that is, fully unstructured text), and instead speak an ad-hoc, poorly-documented, implementation-specific, brittle semi-structured-text language which is usually different than the other programs you want them to talk to. Those text-processing programs are the Unix glue code.

    The explicit naming of “glue code” is brilliant and important - but the Unix model, if anything, increases the amount of glue code, not decreases it. (And not only because of the foolish design decision to make everything “plain” text - the “do one thing” philosophy means you have to use more commands/programs (the equivalent of functions in a program) to implement a given set of features, which increases the amount of glue code you need - it gives you a larger n, which is really undesirable if your glue code scales as n^2.)
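    A minimal sketch of that glue (the data and field names are invented, but the shape is typical): neither “program” speaks the other’s format, so two cut invocations are needed just to hand a list of names from one to the other.

```shell
# Simulated output of one program: an ad-hoc "key=value" text format
printf 'name=foo ver=1.2\nname=bar ver=3.4\n' \
  | cut -d' ' -f1 \
  | cut -d= -f2 \
  | xargs echo
# prints: foo bar
# The two cut calls are pure glue: they exist only to translate one
# program's ad-hoc format into what the next program expects.
```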

    1. 2

      I am not sure whether this is a valid criticism of your comment, but I think there is a difference between the initial idea of program composition held by people like Doug McIlroy, the creator of Unix pipes, and the way the auxiliary tools you mention were actually realized. So part of the accidental complexity brought by those ad-hoc formats is not very different from the ordinary glue described in the original post.

      So I might be missing something but the power is not so much in the single |—it is only an elegant token—but in | p_1 | … | p_n |. It can be converted to a mnemonic | q | and be reused, say, in p_0 | q | p_n+1. But I will come back to it later.

      This is something that could not be done in traditional languages used for programming-in-the-large until very recently. OOP tried to address this in a way but ultimately failed. The very same logic applies: .f_1().f_2(). … .f_n(). can be converted to a mnemonic .g(). and reused in f_0().g().f_n+1().

      Trying to come back to the original subject, I think one of the reasons OOP failed is that, in order to make code reusable in the real world, programmers would have to account for all the relevant permutations of the intermediate parameters, which is impractical. It is easier (in the short term at least) to write glue code in the usual form. Maybe f_i()’s third parameter cannot be a FooController if f_j()’s first parameter is a BarView, or you should not call f_k() in between the two if you have BazView and BazController instead, because it has some side effects you want to avoid. So you go ahead and write exactly the code you need. (Edit: or you write even more code, and create beautiful class diagrams to leverage this compositional approach. That will most likely amount to the same net quantity of code as the “glue” approach, to say nothing of the extra work.)

      Now, this is not to say that this doesn’t happen in the wonderful Unix universe where everything is a file - it does - but in Unix, program composition is first-class, maybe due to evolutionary pressure: after all, the usual scenario on a Unix system has always been that you need to get two separate programs, with not much shared logic beyond OS primitives, to talk to each other. That same kind of pressure did not apply to mainstream OO languages/environments. They were born in the middle of the personal computer revolution, and some even predate public access to the Internet. Monoliths were the norm, and in that case shared logic is a mere problem of code encapsulation.

      Well, I still need to sleep over this :)

      1. 2

        So I might be missing something but the power is not so much in the single |—it is only an elegant token—but in | p_1 | … | p_n |. It can be converted to a mnemonic | q | and be reused, say, in p_0 | q | p_n+1.

        Back in my university days, one of the class projects was a Unix shell, and for extra credit one could add conditionals and loops [1]. I already had a language I had written (for fun) and it took just a few hours to add Unix commands as a first class type to the language. So one could do:

        p1 | p2 | p3

        and not only execute it, but save the entire command in a variable and compose it:

        complex1 = p1 | p2 | p3
        complex2 = p4 | p5 > "filename"
        c = complex1 | complex2 
        exec (c)

        (That’s not the exact syntax [2] but it gets the point across—it also avoided the batshit crazy syntax modern shells use to redirect stderr to a pipe, but I digress.) The issue is that I found it to be of limited use overall. Yes, it was enough to get me an “A+, and stop with the overkill” on the project, but that was about it. Once a pipeline is composed, then what? If it does a bit too much, it’s hard to shave off what’s not needed. If you need to do something in the middle, it’s hard to modify.
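        For what it’s worth, plain shell functions can approximate the pipeline-as-value idea (a sketch, not the commenter’s actual language; tr, sort, and uniq stand in for p1 through p5):

```shell
# Wrapping a pipeline in a function gives it a reusable name that composes.
complex1() { tr 'a-z' 'A-Z' | sort; }   # like: complex1 = p1 | p2
complex2() { uniq -c; }                 # like: complex2 = p4 | p5
c() { complex1 | complex2; }            # c = complex1 | complex2
printf 'b\na\nb\n' | c                  # counts each distinct upper-cased line
```

        The same limitation applies: once c exists, trimming a stage or splicing a new one into the middle means editing the definitions, not composing further.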

        Another example—I use LPEG [3] and it composes beautifully, but there are downsides. I deal with URIs at work, and yes, I have plenty of LPEG code to parse the various ones we deal with (sip: and tel:, which have their own structure). I was recently given a new URI type to parse, partly based on one we already deal with. But I couldn’t reuse that particular LPEG grammar because the draft specification we have to follow lied [4], so I couldn’t use composition, only “cut-and-paste”. Grrr.

        I also try to apply this stuff at work. Yes, the project I’m involved with has a bunch of incidental complexity, but that’s because creating a thread to handle a transaction was deemed “too expensive,” so there’s all this crap involving select() [5] and a baroque state machine abstraction in C++ to avoid “expensive threads,” because we have to call multiple different databases (at the same time, because we have really tight deadlines) to deal with the request, never mind the increasing insanity of the “business logic.” What it does is easy to describe—given a phone number, look up a name and a reputation (requiring two different servers) based on the phone number. How it works is not easy to describe. Is the entire app glue? It could be argued either way, really.

        [1] A simple “execute this command with redirection or pipes” was an individual project; the conditionals and loops bumped it up to a group project—I was the only one who did the “group project.”

        [2] My language was inspired by Forth.

        [3] http://www.inf.puc-rio.br/~roberto/lpeg/ It stands for “Parsing Expression Grammars for Lua”

        [4] It specified a particular rule from RFC-3966, but that rule included additional data that’s not part of the draft specification. It’s like someone was only partially aware of RFC-3966.

        [5] The concept, not the actual system call.

        1. 1

          Once a pipeline is composed, then what? If it does a bit too much, it’s hard to shave off what’s not needed. If you need to do something in the middle, it’s hard to modify.

          This is similar to my comparison with OOP. The only real difference, I think, is that mismatches between object interfaces usually introduce higher “impedance”. Again, it is not that the problem does not exist in Unix; it is only exacerbated in OO environments. And this is probably because of the evolutionary pressure I described in my original reply, etc., etc.

      2. 2

        It’s true that you sometimes have to massage the output of one program to match the input of another. So the glue isn’t “free”.

        However, I’d claim that the glue is often less laborious than you’d see in other non-Unix/non-shell contexts. A little bit of sed or awk can go a long way.

        In Unix, the glue seems linear; in some OOP codebases, the glue seems quadratic. I can’t find it now, but Steve Yegge has a memorable phrase that Java is like Legos where every piece is a different shape … while in Unix the Legos actually do fit.

        It does have disadvantages (which can be mitigated), but O(n) glue is a big difference from O(n^2).

      3. 15

        Didn’t Amazon do this with The Decree:

        1. All teams will henceforth expose their data and functionality through service interfaces.
        2. Teams must communicate with each other through these interfaces.
        3. There will be no other form of interprocess communication allowed: no direct linking, no direct reads of another team’s data store, no shared-memory model, no back-doors whatsoever. The only communication allowed is via service interface calls over the network.
        4. It doesn’t matter what technology they use. HTTP, Corba, Pubsub, custom protocols — doesn’t matter.
        5. All service interfaces, without exception, must be designed from the ground up to be externalizable. That is to say, the team must plan and design to be able to expose the interface to developers in the outside world. No exceptions.
        6. Anyone who doesn’t do this will be fired.
        7. Thank you; have a nice day!
        1. 6

          Rereading this now, it’s not clear to me why a read of a data store is not a service interface. How do you draw the line?

          1. 3

            It’s a lot easier to version your API than it is to be stuck with a schema that’s now incorrect due to changing business logic.

            1. 4

              Only if the business logic uses the schema of the data store directly. But I’ve seen use cases where the business logic separates the internal schema from an external “materialized view” schema.

              And the quote above says, “it doesn’t matter what technology they use.” If you can use HTTP, well, reading from an S3 bucket is HTTP. So this is one of those slippery cases where somebody can provide a “versioned API” that is utter crap and still not solve any of the problems the decree seems to have been trying to solve.

              I guess what I’m saying is: never underestimate the ability of humans to follow the letter of the law while violating its spirit. The key here is some way to be deliberate about externally visible changes. Everything else is window dressing.

            2. 1

              The same way reading a field isn’t a method call.

              1. 2

                a) I think you mean “reading a field isn’t part of the interface”? Many languages support public fields, which are considered part of the interface.

                b) Wrapping a field in a getter method and calling the public interface done is exactly the scenario I was thinking about when I wrote my comment. What does insisting on methods accomplish?

                c) Reading a value in an S3 bucket requires an HTTP call. Is it more like reading a field or a method call?

                1. 2

                  Maybe it is assumed that AWS/Amazon employees at the time were capable of understanding the nuances once provided with the vision. It is not too much of a stretch to rely on a mostly homogeneous engineering culture when making such a plan.

            3. 4

              How so?

              They just decreed a certain architectural style; they said nothing about actually backing it (and other styles!) with first-class support.

              1. 4

                Thank you; have a nice day!

                OR ELSE

              2. 5

                I think I have an example of the difficulties of glue, in my own area of expertise. Most GUI toolkits that have ever been written are completely inaccessible to people who need assistive technologies, e.g. blind people with screen readers. To be accessible, one has to implement the accessibility APIs provided by various platforms, e.g. UI Automation for Windows, NSAccessibility for Cocoa, UIAccessibility for Cocoa Touch, and Android’s Java-based accessibility API. These accessibility APIs are glue – exposing the state of a UI in a platform-defined way so a screen reader or other tool can consume it. There are n cross-platform toolkits out there, and m platforms to support. So the work to make these toolkits fully accessible is n times m. Maybe not quite quadratic in the way you meant, but multiplicative anyway.

                The platform accessibility APIs aren’t necessarily straightforward to work with, particularly in a cross-platform project. UI Automation for Windows uses COM, the Apple APIs use Objective-C, and the Android one uses Java. So a developer wanting to implement all of these in a cross-platform toolkit or application likely has to deal with cross-language glue, to an extent that they otherwise wouldn’t have to. Sure, cross-platform toolkit developers have to do some work with each platform’s native language already, to open a window and handle keyboard and mouse events. But the data being exposed through an accessibility API is much richer – a tree of UI elements not unlike the HTML DOM. Marshaling that kind of data through an FFI is tedious and can also be inefficient.

                So I want to provide some kind of cross-platform abstraction for accessibility, to make it easier to implement accessibility in the long tail of GUI toolkits that don’t have big corporate backing. At first I thought I wanted to implement a cross-platform library, to be the SDL or GLFW of accessibility. But what language would I use? Working with a tree of objects in C is unpleasant. And for toolkits (or applications that roll their own UI) that aren’t written in C or C++ to begin with, I wouldn’t be helping them get away from the problem of language glue.

                The answer I’ve settled on is to define a schema for an accessibility tree, using a well-supported serialization format like Protocol Buffers (or more likely Cap’n Proto). Then I can implement libraries that consume this schema and implement the platform accessibility APIs: a C++ library for Windows, an Objective-C one for Mac, a Java one for Android, etc. On the other side, the cross-platform UI toolkits can produce the serialized data using their languages’ implementations of the serialization format. A little cross-language glue would still be needed, but only enough to call a handful of functions that take a byte buffer as their argument. So now the work to make n toolkits accessible on m platforms isn’t n times m, but much closer to n plus m.
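                A toy sketch of that shape (a line-based format standing in for the real schema; every name here is invented): each toolkit needs one producer for the shared format, and each platform one consumer, giving n + m adapters instead of n times m.

```shell
toolkit_produce() {   # one adapter per toolkit: emit the shared tree format
  printf 'window Main\nbutton OK\nbutton Cancel\n'
}
platform_consume() {  # one adapter per platform: map the shared format
  awk '$1 == "button" { print $2 }'   # to that platform's native API calls
}
toolkit_produce | platform_consume
# prints: OK
#         Cancel
```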

                Still, the libraries for implementing the platform accessibility APIs will be pure glue, and it will be interesting to see how I can minimize the volume of that code. I think the article makes a good point about imperative programming languages not being a good fit for directly expressing glue.

                1. 4

                  I think we need to make glue first class so we can actually write down the glue itself, and not the algorithms that implement the glue.

                  What would that mean, in practice? More use of interface definition languages and formal methods to ensure that we can automatically ensure components are passing data in the correct format?

                  1. 2

                    Structured data means less work validating the input. Going the route PowerShell has taken, where you can pass objects around and use their properties as parameters, helps a lot - especially as the receiving cmdlet can have aliases for the properties, or even parameter sets to fit various input methods. There’s probably always going to be a lot of glue code, even with something like PowerShell, but at least you don’t have to write complicated parsing rules to grab the 5th column and deal with erroneous whitespace somewhere in the input data.
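                    To illustrate the contrast with invented sample data: grabbing a field by position versus looking it up by name (still text here, since this is a shell sketch, but the named lookup is closer to what passing objects buys you).

```shell
# Positional glue: brittle, breaks if the column order or count changes.
row='alice   1001  /home/alice  bash  2048'
size=$(echo "$row" | awk '{ print $5 }')

# Named fields: the lookup survives reordering and added fields.
record='user=alice shell=bash size=2048'
size2=$(echo "$record" | tr ' ' '\n' | awk -F= '$1 == "size" { print $2 }')

echo "$size $size2"
# prints: 2048 2048
```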

                  2. 3

                    Has anyone done a classification of glue code in the wild? Seems like that would be a good first step to making it explicit.

                    1. 2

                      Glue is quadratic. If you have N features that interact with each other, you have O(N²) pieces of glue to get them to talk to each other.

                      For him, the key difference is that Unix had the pipe, and I would agree.

                      Yes! That’s what I was getting at here with the “text as a narrow waist” comment:


                      There I was saying M formats and N operations, so it’s O(M*N).

                      And thanks for bringing up the “Unix and Google” video again. I will have to watch it again to see if there is a difference between the O(N^2) argument and the O(M*N) argument. There was a concrete example I gave in a Plan 9 thread about files, pipes and sockets, so that might help think about it.

                      BTW I have been trying to write a blog post about this architectural principle for about 6 months. I think the way I want to frame it is as the Perlis-Thompson Principle – e.g. with famous quotes from Alan Perlis and Ken Thompson which hint at this, but don’t fully explain it.

                      I will definitely refer to this post, as well as https://lukeplant.me.uk/blog/posts/everything-is-an-x-pattern/ (which appeared on lobsters), and a few others.

                      BTW I linked the “text as narrow waist” comment at the very bottom of my last blog post: Notes on the HotOS Unix Shell panel

                      1. 1

                        BTW I watched the linked Unix and Google video again:


                        It’s explicitly making the (M data types * N operations) Multics vs. Unix comparison (very similar to what I was saying in the link above). So I think the better way to describe it is O(M * N). I can sort of see where O(N^2) comes in, but it’s hard to make concrete, whereas O(M * N) is very concrete.

                        I would say the glue is O(M+N), i.e. linear, which is where the savings come in. For every structured data type, you “project” it onto text (or plain old data in Rich Hickey’s world, not typed objects). For every operation, you write a text version of it.
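                        The arithmetic for a concrete (made-up) M and N:

```shell
M=10 N=20
echo "pairwise glue:  $((M * N))"   # every type wired to every operation
echo "via one format: $((M + N))"   # one projection per type, one text op each
# prints: pairwise glue:  200
#         via one format: 30
```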

                        And I also agree that despite using Unix so much, the overall architecture of Google’s services (from both internal and external viewpoints) began to ignore the “Perlis-Thompson Principle”, and became too bloated for even 60K or 100K engineers to handle.

                        Just like in the video: even the biggest companies in the world, AT&T and IBM with Multics, couldn’t get around this fundamental math. Neither can Google.