1. 25

  2. 8

    No “generic” library or framework I’ve seen has ever been able to deliver 100% re-usability. Even string libraries aren’t entirely reusable; for example, constant-time comparison is required in many security applications, but non-security applications tend to favour raw speed. Of course, you could add a flag to make it more generic and re-usable.
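
    To make the string-library example concrete, here is a Python sketch (function names are mine, not from any particular library). An ordinary comparison short-circuits at the first mismatch, which is exactly the speed non-security code wants, while the security-oriented version must not:

    ```python
    import hmac

    def fast_eq(a: bytes, b: bytes) -> bool:
        # Ordinary comparison: bails out at the first differing byte,
        # which is what non-security code wants (raw speed).
        return a == b

    def secure_eq(a: bytes, b: bytes) -> bool:
        # Constant-time comparison: the runtime does not depend on
        # *where* the inputs differ, only on their length.
        return hmac.compare_digest(a, b)
    ```

    The two functions return the same answers; they differ only in timing behaviour, which is precisely the sub-behaviour a "flag" would have to select.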

    If you keep adding flags like this, then for components that are large enough you end up with so many flags for each different kind of sub-behaviour (or so many variants of the same component) that the result becomes unwieldy to use and maintain, and performance will suffer too.

    That’s why “use the right tool for the job” is still great advice, and so is Fred Brooks’ old advice to “build one to throw away, you will anyway” when building something new.

    1. 4

      I’ve always worked toward the “guideline” that an abstraction should shoot to cover 80% of the problem, but should be very easy to “punch through” or “escape” for that last 20%

      If possible, I won’t “add a flag” to support a feature, but will instead try to write the library in a way that allows it to be disabled or skipped when needed. Suddenly the worry of the “perfect abstraction” goes away, and you are left with a library that handles most cases perfectly, and allows another lib or custom code to take over when needed.
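
      A minimal Python sketch of that idea (the function and parameter names are hypothetical, just to illustrate the shape): instead of a flag per behaviour, expose the step as a pluggable callable with a sensible default that callers can override or effectively disable:

      ```python
      def process(items, validate=None):
          """Process items; `validate` is the escape hatch.

          The built-in check covers the common 80% case; callers with
          unusual needs pass their own callable (or `lambda item: True`
          to skip validation entirely) instead of the library growing
          a flag for every variant.
          """
          if validate is None:
              validate = lambda item: item is not None  # default behaviour
          return [item for item in items if validate(item)]
      ```

      Usage: `process([1, None, 2])` applies the default and drops `None`, while `process([1, None], validate=lambda _: True)` punches through and keeps everything.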

      1. 3

        That’s a very good approach. I also like the opposite approach, which is the “100% solution” to a narrowly (but clearly!) defined problem. The Scheme SRE notation is an example of this, as is the BPF packet filter virtual machine.

        This allows you to make a tradeoff to choose whether a tool fits your needs.

        1. 1

          I’ve always worked toward the “guideline” that an abstraction should shoot to cover 80% of the problem, but should be very easy to “punch through” or “escape” for that last 20%

          I always liked Python’s convention of exposing everything (almost; I’m not sure if the environment of a closure is easily exposed), and using underscores to indicate when something should be considered “private”.

          I emulate this in Haskell by writing everything in a Foo.Internal module, then having the actual Foo module only export the “public API”.
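
          For what it’s worth, the Python side of that convention can be sketched like this (the module contents are purely illustrative):

          ```python
          # A module's "public API" is advertised via __all__; anything
          # with a leading underscore is internal by convention only --
          # nothing stops a caller from reaching in, just like Foo.Internal.
          __all__ = ["greet"]

          def _helper(name):
              # Leading underscore signals "internal, may change without notice".
              return name.strip().title()

          def greet(name):
              return f"Hello, {_helper(name)}!"
          ```

          `from foo import *` would pick up only `greet`, but `foo._helper` remains reachable for anyone who needs to punch through.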

        2. 1

          This seems like something that should be solved outside the library that deals with string manipulation. For example, in Clojure I’d write a macro that ensures its body takes a constant amount of time to evaluate. A naive example might look like:

          (defmacro constant-time [name interval args & body]
            `(defn ~name ~args
               (let [t# (.getTime (java.util.Date.))
                     result# (do ~@body)]
                 ;; sleep out the rest of the interval (clamped so a slow body
                 ;; doesn't pass a negative value to Thread/sleep), then return
                 ;; the body's result
                 (Thread/sleep (max 0 (- ~interval (- (.getTime (java.util.Date.)) t#))))
                 result#)))

          With that I could define a function that evaluates its body and then sleeps for the remainder of the interval:

          (constant-time compare 1000 [& args]
             (apply = args))

          I think that decoupling concerns and creating composable building blocks is key to having reusable code. You end up with lots of Lego blocks that you can put together in different ways to solve problems.

          1. 6

            To me that smells like a brittle hack. You might end up overestimating the time it will take, making it slower than necessary, or underestimating it, which means you’d still have the vulnerability.

            Also, if the process or system load can be observed at a high enough granularity, it might be easy to distinguish between the time it spends actually comparing and sleeping.

            1. 1

              I specifically noted that this is a naive example. That’s my whole point, though: you don’t know what the specific requirements might be for a particular situation. A library that deals with string manipulation should not be making any assumptions about timing. It’s much better to have a separate library that provides constant timing, and to wrap the string-manipulation code with it.

              1. 6

                Except in this case the constant-time requirement is much more restrictive than constant wall-clock time. It’s actually important to touch the same number of bits and cache lines – you truly can’t do that by just adding another layer on top; it needs to be integral.
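
                To make that concrete, here is a Python sketch of why the property has to live inside the comparison itself: the loop below touches every byte regardless of where the first mismatch is, which no outer wrapper around a short-circuiting `==` can guarantee:

                ```python
                def ct_eq(a: bytes, b: bytes) -> bool:
                    # Accumulate differences over *all* bytes; no early exit,
                    # so the same number of bytes (and cache lines) is touched
                    # for equal and unequal inputs of the same length.
                    if len(a) != len(b):
                        return False
                    diff = 0
                    for x, y in zip(a, b):
                        diff |= x ^ y
                    return diff == 0
                ```

                The data-access pattern is identical for matching and non-matching inputs, which is exactly what an external sleep cannot provide.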

                1. 1

                  In an extreme case like this I have to agree. However, I don’t think it’s representative of the general case. The majority of the time it is possible to split up concerns, and you should do so when you’re able.

                  1. 5

                    But that’s the thing. The temptation of having full generality “just around the corner” is exactly the kind of lure that draws people in (“just one more flag, we’re really almost there!”) and causes them to end up with a total mess on their hands. And this was just using a trivial text-book example you could give any freshman!

                    I have a hunch that this is also the same thing that makes ORMs so alluring. Everybody thinks they can beat the impedance mismatch, but in truth nobody can.

                    I guess the only way to truly drive this home is to implement some frameworks yourself and hit your head against the wall a few times when you need to stretch the limits of the framework you wrote.

                    1. 2

                      My whole argument is that you shouldn’t build things in a monolithic fashion, though. Instead of adding one more flag, separate concerns where possible and create composable components.

                      Incidentally, that’s pretty much how the entire Clojure ecosystem works. Everything is based around small, focused libraries that solve a specific problem. I also happen to maintain a micro-framework for Clojure. The approach I take there is to make wiring explicit and let users manage it the way that makes sense for their project.

                      1. 3

                        Monolithic or not, code re-use is certainly a factor in the “software bloat” that everyone complains about. Software is getting larger (in bytes) and slower all around – I claim a huge portion of this is the power of abstraction and re-using components. It just isn’t possible to take the one tiny piece you care about: pull a thread long enough and almost everything comes with it.

                        Note that I’m not really making a value judgement here, just saying there are high costs to writing everything as generically as possible.

                        1. 1

                          You definitely have a point here. On my first job I was tasked with implementing a feature in a legacy WinAPI app. It involved downloading some data via HTTP (IIRC it downloaded information on available updates for the program). Anyway, I was young and inexperienced, especially on the Windows platform. The software was mainly pure C, but a few extensions had been coded in C++.

                          So when I wrote my download code, I just used STL iostreams for the convenience of the stream operators. Thing is, mine was the first C++ code in the code base to use a template library; all the other C++ code was template-free, C-with-classes style. The size of the binary doubled for a tiny feature.

                          I rewrote the piece in C, and the results were as expected: no significant change in size for the EXE. Looking back, the gap between what I was tasked to implement and what I actually implemented makes me shudder. However, I’m also not happy with the slimmed-down version of my code.

                          Nowadays the STL just isn’t a big culprit anymore, when you look at deployment strategies that ship statically linked Go microservices inside fat Docker images onto some host.

            2. 1

              That constant-time comparison doesn’t work, because you can still measure throughput: send enough requests that the server is CPU-bound, and you can see how far above the sleep time the average goes.

          2. 8

            Reuse works, but when it does it remains invisible and non-obvious. In the big: do you always write an OS, a compiler, and I/O routines such as graphics or networking from scratch? No? These kinds of software are very suitable for reuse. In the small: do you always write a custom data structure, plus manipulation routines such as parsing, sorting, and printing, for each different kind of object? No? Tiny objects that are sufficiently encapsulated can be treated generically, thus enabling reuse.

            1. 1

              I began programming in the ’90s, when the OO hype was at its highest, so I definitely feel the apparent failure of reuse strongly even now.

              Part of this is a “glass half-empty/glass half-full” effect: code reuse certainly happens, but the failure of reuse also happens. The key problem, I think, is describing the ways in which that failure happens.

              I’d divide our apparent failures into two parts.

              A. The failure of “effortless” reuse. OO originally talked about objects being created during the ordinary process of coding, as if the problems of engineering libraries could be ignored. Thankfully, this fantasy is mostly dead. However, it’s also the less fundamental part of the failure of the idea of reuse, since there’s a solution: just make or use object libraries, or libraries generally (whether OO, DSL, or procedural approaches work better here is a secondary question, IMO).

              B. The less-than-complete success of any encapsulation effort: the failure of OO, procedural, or other libraries to fully hide their implementation details when they are used heavily in a large-ish application. This isn’t to say that libraries, operating systems, graphics drivers, and high-level languages are useless. The problem is that all abstractions wind up “leaky” on some level, so a programmer using, say, ten abstraction layers is going to be forced to consider all ten layers at one point or another (though naturally some will be worse than others, and some only in terms of performance, but that’s still consideration, IMO). The left-pad event that broke the web a while back is one extreme example of this sort of problem.

              So “B” is the bigger problem. It seems to limit how many layers of abstraction can be stacked together. I don’t know of any studies that directly tackle the question in these terms, and I’d be interested if anyone has links here.

              1. 1

                I sometimes think we are just really bad at memorizing advice and passing it on accurately. While OOP was advocated for with the “reuse” argument, there was also a “use before reuse” principle. But these subtleties seem to get lost when people start writing up syllabuses and introductory material.

                I learned programming mostly from tutorials and books written in the ’90s, when the OOP craze was in full bloom (and that material spent a lot of time explaining “OOP principles”, much more, I think, than a modern book on Java/C++/Python does). Anyway, I often did not find OOP very helpful for structuring my code; finding good OO models was hard and hardly seemed worth it.

                Fast-forward to this month. I borrowed a book on DDD patterns and started tinkering with the patterns outlined there, and I must say that for the first time in my life I feel I have a reasonable strategy in front of me for mapping business logic to classes/objects. And it differs a lot from the naïve examples I saw just recently in a mandatory corporate training.

                Who knows when functional programming will reach this point where the original motivations are already buried so deep that they cannot be seen anymore.

                1. 3

                  Having invested a lot of time into DDD over the last eight years, I personally think a lot of it isn’t as valuable as it initially seems. The various materials ultimately introduce new names for things you probably already have names for, but that the author wants you to call something else.

                  While neat to read, I’d caution against going through your codebases renaming everything, which anecdotally is what new readers do first. That is often a big time sink with no payoff other than new names for old concepts.

                  However, reading about CQRS (an often referenced outgrowth of DDD) is a big deal, and would cause you to actually structure everything in a different way that can add potential benefit (immutability, fast read operations, etc). I highly recommend at least reading up on that.

              2. 1

                Currently, microservices are the paradigm du jour, and – of course again – re-use is the business case promise that accompanies it regularly.

                Is it? I don’t think I’ve heard a re-use argument for microservices. APIs often get used by multiple clients, but that’s equally likely to be the case in a monolith, so I’m not sure why someone would argue that moving to microservices would make things more re-usable. Unless it involves changing how the API works to make it suitable for use by a wider range of clients, but that’s orthogonal to microservices.

                Some arguments that I have heard for microservices are: small focused teams / ownership, separation of concerns, fault tolerance, allowing for diversity in language/db choice, reduced resource use / better capacity allocation, and rapid deployment. I don’t recall re-use, but maybe I’m wrong.

                1. 1

                  Nice article. It’s almost as if things like reuse only make sense if you have augmented development, where the codebase sits entirely within its scoped repositories. Then the object abstractions can be unified across this scope, with commonality dropping back down to the base objects and the higher layers becoming populated with uniquely used code hoisted into less-frequented objects.

                  As if such metaphors are beyond the reasonable care of mere mortal programmers, but only automated ones with the diligence to constantly adapt the object model as the codebase changes.

                  This might be seen as insulting to most programmers.

                  1. 1

                    Interesting that the author does not refer to Robert L. Glass’ ‘Facts and Fallacies of Software Engineering’, which expands on reuse in facts 16 through 20. The article would’ve been better for it, as he does seem to add an interesting observation to those facts: ‘premature design for reuse tends to achieve the opposite of its goals’.

                    1. 1

                      Ah, the good old broken record of “re-use doesn’t work because we write shitty code that nobody can re-use because the layering and the abstraction and the licensing and the ownership and the coupling is all wrong and I can’t be stuffed to ever clean it up”.