1. 12
  1.  

  2. 17

    I disagree with R6. Specifically, the advice of moving sub-blocks to separate function in the name of avoiding too much nesting. My problem here is that the nesting is still there. It’s just hidden through the function call.

    I’ll concede that in some cases this makes sense. Especially when the resulting sub-function ends up performing a well defined task and needs few arguments. But it’s not what we should go to first. What we should try instead is to flatten the scopes.

    In practice, the main sources of deeply nested scopes are loops and conditionals. (Like, duh.) And while nested loops can rarely be collapsed, I’ve seen many conditionals that could be flattened. A classic example is the following pyramid, that we often see when we applying the brainded “single exit point” fake quality rule:

    if (test1) {
        if (test2) {
            if (test3) {
                OK();
            } else { error3(); }
        } else { error2(); }
    } else { error1(); }
    

    It can generally be flattened to:

    if (!test1) { error1(); return; }
    if (!test2) { error2(); return; }
    if (!test3) { error2(); return; }
    OK()
    

    (When your language doesn’t have destructors or defer it can be a bit more complex. Worst case, you’ll need to use goto to handle the cleanup without duplicating code all over the place.)

    Only when everything is nice and flattened can we ask ourselves whether the nesting is still too deep.

    1. 6

      And while nested loops can rarely be collapsed

      But often extracted. An extraordinary number of the comments I make while reviewing other people’s code is pointing out to them where they can do a tiny amount of preprocessing to avoid unnecessary nested loops and ways they can chain functions that accept and return iterators to replace nested loops with what are in effect pipelines, while making their code simpler.

      1. 3

        It can potentially offer more readable code.

        hasPineapple *> isFresh *> isNotOnPizza
        

        This can’t happen if you inline the conditions and errors. Hiding the specifics is desirable if it’s behind a good function name. Contrast:

        do
          when (foo1 `bar1` baz1) $ Left "tastes dull"
          when (foo2 `bar2` baz2) $ Left "could be fresher"
          when (foo3 `bar3` baz3) $ Left "heathen"
        
        1. 5

          This is a good example of the tension between naming things and keeping the code local. I’d say it depends what the emphasis should be on. Your 3 function chain at the beginning is perfect for showing the big picture at a glance. The expanded version below however is better at showing the internals. The question is, which is more important in any given case?

          1. 3

            Big picture, but possibly only when the language gives you the tools to do so comfortably.

            A delightful feature of Haskell that’s not made it to any other language I use is where clauses, where you can have local definitions after the function body. It’s essentially an inverted let..in.

            f x = hasPineapple *> isFresh *> isNotOnPizza
              where hasPineapple = etc
                    isFresh = etc
                    etc
            

            This allows you to define these bindings at the scope of the outer function, and also hide them below the main function body. In use it acts as if to say “read the main body first, and you can drill down to any of the specifics if/when you care about them”.

            On the other hand in a language like TypeScript you need to define these bindings above the main body. (You could make an exception for hoisted functions but the scope is still unclear and not all bindings will be functions.)

          2. 3

            hasPineapple *> isFresh *> isNotOnPizza

            I can’t parse this. What is it meant to express?

            1. 1

              It’s Haskell’s applicative functor syntax. In fp-ts it’d be:

              pipe(hasPineapple, apSecond(isFresh), apSecond(isNotOnPizza))
              

              It’s similar to monadic bind (which is sort of found in other languages such as JavaScript’s Promise.then and Rust’s Result.and_then). The difference is that here, with the weaker applicative dependency, the code needn’t run sequentially, and we don’t care about the result on one side of the operator assuming it succeeds.

              If we imagine their types are Either e a, then this will either give us back isNotOnPizza‘s a (which we don’t really care about in this example), or the left-most e, representing failure. Here are some REPL-friendly examples:

              -- Right 'y'
              Right 'x' *> Right 'y'
              
              -- Left 'e'
              Left 'e' *> Right 'y'
              
              -- Left 'e'
              Right 'x' *> Left 'e'
              

              If the types were Validation, then the same code would result in us collecting all the errors in say a list instead of failing fast.

              Applicative syntax is also extremely pleasant for writing recursive descent parsers.

              1. 1

                I’m sorry, I still can’t parse the actual intent here. When I read that expression I see two boolean qualifiers (hasPineapple, isFresh) which make sense to apply to a given value (a pizza). But then there’s this parameterized qualifier (isNotOnPizza) without any apparent parameter. Is what not on pizza?

            2. 1

              but for chains like this you need languages that supports Result-like types well

              1. 2

                It’s becoming more common. Rust’s “enums” come to mind. I’m positive there are others in more mainstream, non-functional languages.

                1. 3

                  Wider use of algebraic data types can’t come soon enough.

            3. 1

              And while nested loops can rarely be collapsed, I’ve seen many conditionals that could be flattened.

              I tend to think of error handling as a special case, where the ideas of structured programming must be suspended. Errors are exceptional states that usually require execution to be aborted, which is why they are often handled using non-structured jumps, like exceptions or gotos.

              Outside of error handling, I am not sure if early returns are a good idea in general. Compare:

              void UpdateTheme()
              {
                  if (!IsThemeActive) return;
                  const bool bThemeActive = IsThemeActive();
                  g_dlv->UpdateTheme(bThemeActive);
                  g_elv->UpdateTheme(bThemeActive);
              }
              

              and

              void UpdateTheme()
              {
                  if (IsThemeActive) {
                      const bool bThemeActive = IsThemeActive();
                      g_dlv->UpdateTheme(bThemeActive);
                      g_elv->UpdateTheme(bThemeActive);
                  }
              }
              

              I recently changed the former to the latter, because I think the intent and logical structure are more transparent.

              1. 5

                In your specific example I think they the before and after are equivalently clear, but in the general case early returns for “exceptional” cases are more readable.

                Align the happy path to the left edge

                The nesting forces you keep more context, and more distant context, in your short term memory.

                While technically speaking you could say the same of early returns (all the previous early returns are context, after all), in practice this isn’t really case, because the early return style allows you take advantage of the knowledge “ok, there’s a bunch of crap that could go wrong, and we’re dealing with each of them, and now it’s all done and we can handle the main case” – which is an easier way to think about things.

                It’s also nice to have it highlighted that way in code with a consistent structure. In the nested version, your “main logic”, the happy path, is this deeply indented little branch in the middle of a bunch of other code or, even worse, split up across multiple such branches.

                1. 1

                  in the general case early returns for “exceptional” cases are more readable.

                  I agree. I suppose I should say that in my specific example, it isn’t really an error or an exceptional case that IsThemeActive is null. That it is non-null is simply a condition for the following three lines of code to be executed. The benefit of using a conditional statement rather than an early return is that it is harder for the reader to miss the condition, and it is easier to refactor the code, e.g. if one were to move the block into another function.

                  My general point is that early returns and gotos are great when they’re needed, but that I’m not sure whether it is a good idea to use them when a normal conditional statement would fit just as well. If the only reason an early return is used is to avoid another level of indentation, then perhaps it actually makes the code less clear and harder to understand. Indentation highlights the logical structure of the code visually in a way that non-structured control flow tends to obscure.

                  1. 1

                    The thing is that indentation is a pretty good proxy for state, and state is absolutely worth minimizing. It’s not an exceptional case that IsThemeActive is false, but if the majority of the logic of a block of code assumes that IsThemeActive is true, then it’s good if you can eliminate the counterfactual condition early-on, and therefore be able to drop that state going forward.

                    And I don’t think early returns and gotos are really comparable. Early returns work within the rules of the call stack, but gotos can do basically whatever.

                2. 5

                  Errors are exceptional states that usually require execution to be aborted, which is why they are often handled using non-structured jumps, like exceptions or gotos.

                  Is it exceptional if an HTTP GET request fails? Or a disk read? Or a JSON parse of untrusted data? Or an execution of a subprocess? Or a function call that takes user input? Assuming you’re programming with syscalls – errors are absolutely normal, equivalent to happy-path control flow, and should be returned to callers same as a successful result.

                  UpdateTheme

                  YMMV, but I wouldn’t approve a PR that changed your first version to the second. Early returns are God-sends! They’re one of the best ways to reduce the amount of state that human beings need to maintain as they read and model a block of code.

                  1. 2

                    errors are absolutely normal, equivalent to happy-path control flow

                    The difference is that errors usually require the entire execution to be aborted early, and they may arise in different parts of the logical structure of a function. Exceptions are so useful because they cater to this specific, but very common need. Early returns are another way of handling it.

                    Early returns are God-sends! They’re one of the best ways to reduce the amount of state that human beings need to maintain as they read and model a block of code.

                    On the one hand, yes. Outside of error handling, I find early returns to make the code more clear when the execution of an entire function depends on a variety of preconditions.

                    On the other hand, some problems with early returns are that

                    1. they’re easy to miss, and
                    2. they require the reader to carefully read all the code in order to understand the logical structure and control flow.

                    I think both points apply to the code sample I shared above. In the second example, it is clear visually that the execution of the indented lines depend on some condition. One doesn’t even really have to read the code. It is also easier to refactor.

                    There is a presentation about structured programming by Kevlin Henney, which influenced some of my thoughts on this.

                    1. 1

                      The difference is that errors usually require the entire execution to be aborted early, and they may arise in different parts of the logical structure of a function. Exceptions are so useful because they cater to this specific, but very common need.

                      I bet we work in pretty different programming contexts, and I’m sure that rule makes sense in yours. But in my world, there is essentially no situation where an e.g. network request error, or input validation error, or whatever else, should abort execution at any scope beyond the current callstack.

                      On the other hand, some problems with early returns are they’re easy to miss, and they require the reader to carefully read all the code in order to understand the logical structure and control flow.

                      Early returns model error handling as normal control flow. If errors are normal, and an error is no different than any other value, then readers will naturally need to do these things to understand some code. They have to do that anyway! And exceptions actually make understanding control flow harder rather than easier, I think. With early returns you don’t get any surprises — the function will return when it says return. But with exceptions, all bets are off — the function can return at any expression.

              2. 2

                may I add immutability (vulgo: const)?

                1. 2

                  A function with only a single caller is usually a code smell. If you have some complex-ish code that could be a function but is only used in one place, I find that inline lexical scoping is often a better approach.

                  // From this
                  a = foo(...)
                  error handling
                  b, c = bar(a, ...)
                  error handling
                  useful_value = baz(b, c, ...)
                  error handling
                  
                  // To this
                  var useful_value
                  {
                    a = foo(...)
                    error handling
                    b, c = bar(a, ...)
                    error handling
                    v = baz(b, c, ...)
                    error handling
                    useful_value = v
                  }
                  
                  1. 2

                    By encapsulating a bunch of lines of code inside a properly named function. You give the option to whomever is reading the code to not be bothered with details below the current abstraction level. Even if such function is only used there.

                    Often times the contents of such function are trivial to guess, if the code is well written. The original point in the code where the function is called will then consist of a descriptive function name and the reader can continue reading the core after a less than a second.

                    1. 2

                      IME it’s just as easy to skip over a lexically scoped block as it is a function call, but if you do care about the implementation, the block is much easier to grok. Straight-line reading of logic, whenever possible, is crucial to coherence.

                      edit: single-use functions also, IME, tend to require an enormous amount of input state, which normally has to be represented in the signature. Lexical scoping blocks let you get out of that pain, too!

                  2. 1

                    I follow these principles as they are, IMO, logical. Just simple well motivated principles with a well defined purpose. Nothing complicated or non trivial to grasp.

                    Sadly, every single time, comes some developer with half-assed rebuttals revolving around how he is smarter and knows better and will do even better. It always results in spaghetti code in a mater or months, sometimes weeks.