1. 54
  1.  

    1. 8

      Thanks for the article. Maybe I’m confused, but in the section near the end about how the two recommendations go together, why is the code this:

      if condition {
        for walrus in walruses {
          walrus.frobnicate()
        }
      } else {
        for walrus in walruses {
          walrus.transmogrify()
        }
      }
      

      and not this?

      if condition {
          frobnicate_batch(walruses)
      } else {
          transmogrify_batch(walruses)
      }
      
      1. 8

        The idea isn’t “name your functions xxx_batch and pass arrays around”, it’s rather that the syntactic heuristic about moving if conditions (branching forwards, skipping code) up and for loops (branching backwards, re-executing the code) down has a somewhat surprising connection to larger, more structural (and sometimes even architectural) aspects of code.

        In the example with nested ifs and fors the advice is applied literally, and it is observed that, in the small, it might get vectorization for free and, in the large, it can describe an architecture of a distributed database.

        1. 9

          That’s the part I instinctively disagreed with. In my experience, this:

          if condition {
            for walrus in walruses {
              walrus.frobnicate()
            }
          } else {
            for walrus in walruses {
              walrus.transmogrify()
            }
          }
          

          Will grow, spread, and now you’ll have this, everywhere:

          if condition {
            for walrus in walruses {
              <30 lines of code frobnicating walrus>
            }
          } else {
            for walrus in walruses {
              <40 lines of code transmogrifying walrus>
            }
          }
          

          And the two blocks are 90% the same, but the 10% difference is sprinkled all over the place.

          Also worth noting: depending on the language, the nature of the conditional, and the nature of the processing, the performance difference is beyond irrelevant.

          Maybe I’m too “DRY-brained” here, but my gut feeling would be to push the loop somewhere, kinda like a map, maybe. And I wouldn’t even consider pushing the if out of the for to begin with, unless the conditional is really impacting performance.

          1. 2

            Agreed with parent. A reason to push loops down is that they are more likely to correspond to sensible operations on the entire set. The perf benefits come from that.

            Closely related is Sean Parent’s “No Raw Loops” rule: https://www.youtube.com/watch?v=W2tWOdzgXHA

          2. 1

            Wouldn’t that be solved if you had functions to frobnicate and transmogrify walruses and called those from the loops? Then you can refactor as needed. Think of the fors as list iters/maps/folds with the behaviour passed in as a function.

            1. 1

              It’s not impossible to follow the principle and still have decent code.

              But I think I would have a harder time explaining how, particularly to less experienced developers. Which is why I’m not so sure it’s a very good principle/rule of thumb.

              1. 6
                action = condition ? Walrus::frobnicate : Walrus::transmogrify
                
                for walrus in walruses {
                  action(walrus)
                }
                
                1. 1

                  I like it, but you’d be surprised how big of a leap that can be to some people.

      2. 3

        i think the second one is often the right shape, but it’s the second step in the refactor:

        • step 1: put the for loops inside the if/else
        • step 2: put the for loops into functions

        you can combine the two steps into a single commit. but, step 1 can be an improvement on its own, and you can do step 2 as the loops get uglier, or as the nesting gets deeper. if you do step 2 without doing step 1, i think you can get code that’s harder to refactor/optimize
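A rough Python sketch of the two steps, for comparison. `Walrus`, its methods, and the `process_step1`/`process_step2` wrappers are hypothetical stand-ins for the names used in this thread, not code from the article.

```python
from dataclasses import dataclass, field


@dataclass
class Walrus:
    """Hypothetical stand-in for the thread's walrus type."""
    name: str
    log: list[str] = field(default_factory=list)

    def frobnicate(self) -> None:
        self.log.append("frobnicated")

    def transmogrify(self) -> None:
        self.log.append("transmogrified")


# Step 1: the branch is hoisted out, so each loop body is branch-free.
def process_step1(walruses: list[Walrus], condition: bool) -> None:
    if condition:
        for walrus in walruses:
            walrus.frobnicate()
    else:
        for walrus in walruses:
            walrus.transmogrify()


# Step 2: each loop moves into its own batch function.
def frobnicate_batch(walruses: list[Walrus]) -> None:
    for walrus in walruses:
        walrus.frobnicate()


def transmogrify_batch(walruses: list[Walrus]) -> None:
    for walrus in walruses:
        walrus.transmogrify()


def process_step2(walruses: list[Walrus], condition: bool) -> None:
    if condition:
        frobnicate_batch(walruses)
    else:
        transmogrify_batch(walruses)
```

Both versions do the same work; step 2 only gives each loop a name, which is what makes it cheap to defer until the loop bodies grow.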

    2. 6

      I’m not sure about the fors, because that’s the internal iteration vs external iteration tradeoff. If you push fors down, you can’t “zip” two iterations into one. This should probably be decided on a case-by-case basis, depending on whether there’s something special to do before/after the loop (e.g. taking a lock once rather than per iteration).
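Both halves of that tradeoff can be sketched in Python. All names here (`Counter`, `add`, `add_all`, `add_products`) are hypothetical, and the `acquisitions` field exists only to make the lock count visible.

```python
import threading


class Counter:
    """Hypothetical shared counter guarded by a lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.total = 0
        self.acquisitions = 0  # how many times the lock was taken

    def add(self, amount: int) -> None:
        # Scalar operation: the lock is taken on every call.
        with self._lock:
            self.acquisitions += 1
            self.total += amount

    def add_all(self, amounts: list[int]) -> None:
        # Loop pushed down: the lock is taken once for the whole batch.
        with self._lock:
            self.acquisitions += 1
            for amount in amounts:
                self.total += amount


# External iteration: the caller owns the loop, so it can zip two inputs,
# at the cost of one lock acquisition per element.
def add_products(counter: Counter, xs: list[int], ys: list[int]) -> None:
    for x, y in zip(xs, ys):
        counter.add(x * y)
```

With `add_all`, the caller has to build the combined list first; with `add_products`, the zip is free but the lock is taken per element.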

      With ifs, I’m also wondering about cases where the else case needs to be handled frequently, and in a consistent way. If you have a Walrus without a name, you wouldn’t want else "Anon", else "(null)", else "???" ad-hoc fallbacks all over the place.
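One way to keep that fallback consistent is to handle the missing name in a single place. A minimal Python sketch, where `display_name` and the `"Anon"` default are assumptions made for illustration:

```python
from dataclasses import dataclass
from typing import Optional

ANONYMOUS = "Anon"  # assumed fallback; the single place to change it


@dataclass
class Walrus:
    name: Optional[str] = None

    def display_name(self) -> str:
        # The "else" case is handled here once, not at every call site.
        return self.name if self.name is not None else ANONYMOUS
```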

      1. 2

        Automatic inlining can also nullify the point about reducing the number of calls to the child function. I’d consider it simpler and more composable to locate the loop outside the child function—in many modern languages, a collection function ought to do the looping for you. Consider pushing the loop down when profiling identifies a slow spot.

    3. 6

      I’m reminded of a favorite article with a similar, though orthogonal, directional metaphor: Type Safety Back and Forth (on Lobsters). It even uses the same example of a Maybe/Option type.

      1. 2

        And another most excellent link of vaguely Haskell color from HN:

        https://www.haskellforall.com/2020/07/the-golden-rule-of-software-quality.html

    4. 6

      I like both pieces of advice.

      It is the volume of entities that makes the path hot in the first place. So it often is prudent to introduce a concept of a “batch” of objects, and make operations on batches the base case, with a scalar version being a special case of a batched one.

      As a fan of APL-likes, it struck me that this principle is the bedrock of the array paradigm. So much so that it’s built into the syntax, rather than being an idiom or style: 1 + 8 9 10, etc. Everything works as a batch by default.
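The batch-as-base-case idea might look like this in Python. `add_batch` and `add` are hypothetical names, and the values mirror the `1 + 8 9 10` example.

```python
def add_batch(amount: int, values: list[int]) -> list[int]:
    # Base case: the operation is defined on a whole batch.
    return [amount + value for value in values]


def add(amount: int, value: int) -> int:
    # Scalar version: a batch of exactly one element.
    return add_batch(amount, [value])[0]


print(add_batch(1, [8, 9, 10]))  # [9, 10, 11]
```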

    5. 2

      In 90% of the cases, the following code isn’t “good” as in “natural”:

      // GOOD
      if condition {
        for walrus in walruses {
          walrus.frobnicate()
        }
      } else {
        for walrus in walruses {
          walrus.transmogrify()
        }
      }
      

      Most of the time the idea here is to choose some strategy or method based on some criteria. E.g. the user chooses a compression algorithm and then we use it to compress multiple files. In those cases, the code should instead be written like this:

      let transform = if condition { Walrus::frobnicate } else { Walrus::transmogrify }

      for walrus in walruses {
        transform(walrus)
      }
      

      Or even better:

      let transform = condition ? Walrus::frobnicate : Walrus::transmogrify

      walruses.foreach(transform)
      

      That adds a layer of indirection, but maps very well to how we think about the problem at hand. And for the same reason it is easy to extend when the logic of picking the strategy/transformation gets more complicated. Ultimately it is an example of separation of concerns.
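The compression case could be sketched in Python roughly as follows. The `COMPRESSORS` table and `compress_all` are hypothetical names; `zlib.compress` and `bz2.compress` are standard-library functions.

```python
import bz2
import zlib
from typing import Callable

# Strategy table; the keys are illustrative.
COMPRESSORS: dict[str, Callable[[bytes], bytes]] = {
    "zlib": zlib.compress,
    "bz2": bz2.compress,
}


def compress_all(files: list[bytes], algorithm: str) -> list[bytes]:
    compress = COMPRESSORS[algorithm]  # the choice is made once, outside the loop
    return [compress(data) for data in files]
```

Adding another algorithm means adding a table entry; the loop itself does not change.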