1. 21

  2. 15

    I found the example poorly motivated, even though I somewhat agree with the sentiment. I think the author misses the key problem with DRY: It’s not a goal, it’s a tool to achieve a goal. The goal is code that is easy to maintain. There are two major reasons that repetition is bad:

    • The code may contain a bug. If it does, I need to find all of the repeated places and fix it and I’m likely to miss one.
    • The requirements will change and I need to be able to modify the code to meet the new ones. This is much harder if I have to make the same (or, worse, very small variants of the same) modifications in a load of places.

    The example doesn’t reflect this though. Stepping back, both pizzas are a default pizza with some delta applied. If the store owner decides to rebrand and make deep pan the default, they need to change the same initialisation of crust in multiple places and may miss one. They can’t do a global replace because they don’t want to change places where the default is overridden by the customer; instead, they need to go through all of the make_{kind}_pizza functions and understand each of the places.

    I am not particularly familiar with Python, so I’m not sure what the idiomatic way of doing it there is. In a class-based language such as C++, I’d make the default constructor initialise all of these fields and then have the make_{kind}_pizza functions apply the deltas. In a prototype-based language such as JavaScript, I’d have a pizza prototype that was initialised with all of these.
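
    A minimal Python sketch of the default-plus-delta idea, assuming a dataclass is an acceptable stand-in for the class with a default constructor. The field values and the two make_* function names are hypothetical and are not taken from the article:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pizza:
    # Hypothetical defaults: the single place to edit if the store rebrands.
    crust: str = "thin"
    sauce: str = "tomato"
    cheese: str = "mozzarella"
    toppings: tuple = ()


def make_pepperoni_pizza() -> Pizza:
    # Only the delta from the default pizza is spelled out.
    return Pizza(toppings=("pepperoni",))


def make_deep_pan_veggie_pizza() -> Pizza:
    # An explicit crust override is visibly different from relying on the default.
    return Pizza(crust="deep pan", toppings=("peppers", "olives"))
```

    Changing the default crust would then be a one-line edit in Pizza, and the explicit override in make_deep_pan_veggie_pizza would be left alone.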

    That’s not my preferred solution though because it assumes that there is a global default. If I decide I want to cater to folks with dietary restrictions then I might want to allow people to specify gluten or lactose intolerance (or other preferences), so the default pizza might have a different kind of crust or different cheese based on some context. This suggests that the caller of the make functions should be passing in a default pizza and I’d have a single point where the default pizza was created. If I want to add vegan cheese later (for example) then I’d add a new constructor / factory method for the pizza object for creating a default vegan pizza. Similarly with a gluten-free base.
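
    One way this could look in Python, as a sketch only: the caller chooses the base pizza and the make function applies its delta to whatever it is given. All names and ingredient values below are hypothetical:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Pizza:
    crust: str
    sauce: str
    cheese: str
    toppings: tuple = ()


def default_pizza() -> Pizza:
    # The single point where the ordinary default pizza is created.
    return Pizza(crust="thin", sauce="tomato", cheese="mozzarella")


def default_vegan_pizza() -> Pizza:
    # Added later: same base, different cheese.
    return replace(default_pizza(), cheese="vegan cheese")


def default_gluten_free_pizza() -> Pizza:
    # Added later: same base, different crust.
    return replace(default_pizza(), crust="gluten-free")


def make_mushroom_pizza(base: Pizza) -> Pizza:
    # Applies only its own delta to the base supplied by the caller.
    return replace(base, toppings=base.toppings + ("mushroom",))
```

    With this shape, make_mushroom_pizza(default_vegan_pizza()) needs no change to the make function itself.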

    If I discover that crust isn’t actually a single value and need it to be a multidimensional thing (different sizes, stuffed, gluten free) with only a subset of points in this space actually permitted, then I have a single place to modify the default base and I can then update the code that lets the user select a non-default base in another single place.

    If none of these ever happen, then I still have less code, and so when I discover that I can’t type and made a typo in one of the default settings that I’ve copied and pasted all over the place, I have less code to modify to fix it and a much higher confidence that I’ve actually fixed it.

    1. 6

      The requirements will change and I need to be able to modify the code to meet the new ones. This is much harder if I have to make the same (or, worse, very small variants of the same) modifications in a load of places.

      Sure, and it’s even more work to split up code that’s been de-duplicated too soon, before an understanding of how requirements might change in the future is reached. My personal rule is to duplicate until I’ve had to make a similar repetitive change several times; those repetitions guide the refactor. I rarely stand a chance of knowing a priori what changes will need to be made in the future.

      Over and over I come back to the lessons of this post. “Repeat yourself to find abstractions.” https://programmingisterrible.com/post/176657481103/repeat-yourself-do-more-than-one-thing-and

      1. 3

        I like to drop a TODO comment in the code as soon as I notice duplication or near-duplication, but then stew on them for a bit before doing anything about it (unless it’s obvious from the get-go). It lets me ~capture the observation that there might be a nascent abstraction while the knowledge is fresh, but then leaves the ultimate call up to N future versions of myself. Sooner or later, one of them will know better than I.

      2. 5

        I agree that the example code is poor for the reasons you say.

        That said, I have seen exactly these problems in production codebases constantly. The tendency towards premature refactoring of incidental similarity is very real.

        Here’s a much better expression of the same idea: The Wrong Abstraction by Sandi Metz

        1. 1

          Yeah, I agree with all of this.

          At the end of the day, we just need to get work done in a way that doesn’t create too much more work in the future. DRY is a tool towards this end, but not the only tool, and not a tool that needs to be used every time.

        2. 7

          My favorite quote from Rob Pike is “A little copying is better than a little dependency.” https://go-proverbs.github.io/

          1. 4

            I’d also like to add that while I agree with the sentiment, I think the make_pizza example isn’t a particularly good one. DRYing this up would mean recognizing that make_pizza is about the structure of a pizza, i.e. the keys in your dictionary/hashmap/whatever you want to call it for the purpose of generalization. A pizza is a thing with

            • Crust
            • Sauce
            • Cheese
            • Toppings

            To make the DRY argument simply about toppings is a bit of a straw man argument. If one were to actually DRY it up, one might start by realizing make_pizza is responsible for defining the structure of a pie and nothing else.

            Your DRY pie making looks like

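            import requests

            # PIZZA_URL is assumed to be defined elsewhere.
            def make_pizza(crust, sauce, cheese, toppings):
                # Keys are string literals; bare names would use the argument values as keys.
                payload = {
                    "crust": crust,
                    "sauce": sauce,
                    "cheese": cheese,
                    "toppings": toppings,
                }
                requests.post(PIZZA_URL, payload)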

            Now it’s been DRYed up, and usefully.
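
            A hedged sketch of how callers might use a structure-only make_pizza. To keep it runnable offline, the HTTP call is replaced by an injectable `send` callable; `send`, build_pizza_payload, and make_pepperoni_pizza are hypothetical names introduced for illustration:

```python
def build_pizza_payload(crust, sauce, cheese, toppings):
    # Hypothetical helper: defines the shape of a pizza order and nothing else.
    return {
        "crust": crust,
        "sauce": sauce,
        "cheese": cheese,
        "toppings": list(toppings),
    }


def make_pizza(crust, sauce, cheese, toppings, send=print):
    # `send` stands in for the HTTP POST so the sketch needs no network.
    payload = build_pizza_payload(crust, sauce, cheese, toppings)
    send(payload)
    return payload


def make_pepperoni_pizza(send=print):
    # A named pizza is just one particular set of arguments.
    return make_pizza("thin", "tomato", "mozzarella", ["pepperoni"], send=send)
```

            Passing `send=some_list.append` collects the payloads instead of printing them, which is handy for checking what would have been posted.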

            1. 2

              My own rules of thumb for DRY are cross-module uses and pattern length, not necessarily in that order.

              You might have lots of repetition within a module’s implementation due to the fact that you are constantly interacting with the same objects and data structures. In such cases, context might be narrow enough such that no further levels of encapsulation are required.

              You might have an auxiliary function here and there, but I avoid jumping the gun in such cases. And this is where I start to consider pattern lengths. Roughly speaking, I consider repetition_density * number_of_statements. That means I convert those into auxiliary functions if the pattern is short but repeats very often, or if it is long or has high cyclomatic complexity but occurs only a few times.
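
              The rule of thumb above can be sketched roughly as follows. Two things here are assumptions rather than part of the comment: repetition_density is read as the number of occurrences within a module, and the threshold of 20 is an arbitrary illustrative value. Cyclomatic complexity is left out of the sketch:

```python
# Hypothetical cut-off; the comment gives no concrete number.
EXTRACT_THRESHOLD = 20


def extraction_score(repetition_density: float, number_of_statements: int) -> float:
    # The product named in the comment: how often times how long.
    return repetition_density * number_of_statements


def should_extract(repetition_density: float, number_of_statements: int,
                   threshold: float = EXTRACT_THRESHOLD) -> bool:
    # Extract when the pattern is short but frequent, or long but rare.
    return extraction_score(repetition_density, number_of_statements) >= threshold
```

              Under these assumptions, a 2-statement pattern repeated 15 times and a 12-statement pattern repeated twice both qualify, while a 3-statement pattern repeated twice does not.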

              1. 1

                I think there are even more facets to the points made.

                1. DRY is misused to eliminate coincidental repetition

                Yes, but I also blame (bad) static analyzers. It feels good to get perfect scores and it’s good that people think about improving their code quality, but I’ve seen way too many cases of people “fixing” their code according to what some kind of linter suggests, only to make it harder to read or maintain. Duplicate code detection is one such example.

                Another example I can think of is people ending up creating huge objects to tell a function what to do instead of making two functions, effectively leading to spaghetti code or inventing a pseudo-language when the object to pass in gets really complex. I’ve also seen people do that with REST-APIs (in the sense of JSON with HTTP) where an endpoint has dozens of parameters, instead of having two endpoints that have clearly defined purposes. But such things often happen when you tell five engineers to discuss some simple endpoint. Everyone ends up with something cool one could add and before long your search endpoint is Turing complete.
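
                A small, purely hypothetical Python illustration of that contrast: one function steered by an options object versus two functions with one purpose each. The User type and all function names are invented for the example:

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    active: bool


def find_users(users, options):
    # Merged version: behaviour depends on which keys the caller passes,
    # and every new use case adds another branch.
    result = list(users)
    if options.get("only_active"):
        result = [u for u in result if u.active]
    if "name_prefix" in options:
        result = [u for u in result if u.name.startswith(options["name_prefix"])]
    return result


def active_users(users):
    # Separate version: one clearly defined purpose.
    return [u for u in users if u.active]


def users_named_like(users, prefix):
    # Separate version: one clearly defined purpose.
    return [u for u in users if u.name.startswith(prefix)]
```

                With only two options the merged function is still readable; the trouble described above starts once the options begin to interact.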

                1. DRY creates a presumption of reusability

                That’s a general theme of many fallacies when programming. So much code was written to be generic only to either never be used again or to not be generic in the way needed, sometimes causing a rewrite, other times causing clear code to turn into an abomination because some hack gets attached to bend it to fit the new requirement. And if you keep doing that, it will often end up complex and unmaintainable, the opposite of why you want to have DRY code.

                1. 1

                  This reminds me of https://solid-is-not-solid.com/ and my own post long ago https://www.soulcutter.com/articles/local-variable-aversion-antipattern.html which is to say - some dogmas get taken to an extreme rather than being used judiciously. Why? It takes years to develop a sense of when techniques are useful and when they may be harmful. It’s easy to defend the dogma because it has been written and repeated for so long despite existing outside of the context of your code. It’s tiring to defend deviations because it involves predictions about the future and how things will evolve.

                  All in all, the wrong abstraction can be worse than repetition, and I can tolerate a fair amount of duplication in service of uncovering when consolidation may truly be useful.

                  These sorts of posts feel evergreen to me as developers mature and discover that these principles may not be as ironclad as they were taught coming up.

                  1. 6

                    When people want to become better writers, they often find books and guides that give advice such as “keep your sentences short”, “omit needless words”, “use the oxford comma”, “avoid the passive voice”, etc. This type of advice is text-based, i.e., it says what the text ought to look like. Prof Larry McEnerney from the University of Chicago vigorously rejects the idea that good writing is about following rules (https://youtu.be/aFwVf5a3pZM, https://youtu.be/vtIzMaLkCaM): he argues that good writing is about the reader, the function of the text, and making the text valuable to readers.

                    I think programmers have a similar problem. We are given a lot of advice about what the code ought to look like: “Use meaningful variable names”, “keep methods under 10 lines”, “don’t repeat yourself”, etc. But that doesn’t help make code better if the code is solving the wrong problem, if it’s attacking the problem with the wrong approach, if it’s code that should not even exist, etc. It’s just very easy to learn and memorize a bunch of rules about what the source code should look like and use only those criteria to judge the quality of a program.

                    1. 2

                      When people want to become better writers, they often find books and guides that give advice such as “keep your sentences short”, “omit needless words”, “use the oxford comma”, “avoid the passive voice”, etc.

                      Two books on writing that do better:

                      • Clear and Simple as the Truth by Francis-Noël Thomas and Mark Turner
                      • Style: Toward Clarity and Grace by Joseph M. Williams (Even though the original author has passed away, the current publisher updates the book frequently and artificially sets the price very high for textbook sales. You can still find many used copies from the 90s, even in hardcover, at good prices. E.g., on ABE Books. You can also find a PDF of an out-of-print version here, for example.)

                      Thanks for recommending the videos. I look forward to watching them.

                      1. 2

                        Thank you for your recommendations. I’m reading the PDF of the out-of-print version of Style: Toward Clarity and Grace and I think the content is fantastic. I looked up the author on Wikipedia and I realized that he created the writing program at the University of Chicago, along with Larry McEnerney of the videos I linked!

                        1. 1

                          I looked up the author on Wikipedia and I realized that he created the writing program at the University of Chicago, along with Larry McEnerney of the videos I linked!

                          Ah, very cool. I knew that Williams had a University of Chicago connection, but not that he helped to create their whole writing program. (With all that in mind, I cannot figure out why or how the University of Chicago Press lost the rights to his book. It’s a shame that Prentice Hall got it since they overcharge so much for current editions. They also seem to force new editions frequently to prevent students from buying used editions.)

                  2. -1

                    Sadly, the author mixed JavaScript object notation (the dictionary keys lack quotes) into otherwise syntactically correct Python code.