1. 22

My professor thinks that self-documenting code is more fundamentally important than comments themselves, since people tend to skip the comments and read the actual code instead. I’m asking because I think it’s worth getting a general audience’s opinion on this.


  2. 27

    Personally, I strive to write code in a way that makes the “what” obvious, then explain the “why” that can’t easily be encoded in the code itself.

    In my experience this approach is quite maintainable since the “why” is usually relatively static (the comments don’t need to change often and so don’t fall out of date), and the “what” should be obvious from the code itself, which changes frequently but is kept up to date by being clear in the first place.
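
    A minimal sketch of that split, in Python. The billing rule and every name here are invented for illustration: the code carries the “what”, and the single comment carries a “why” that the code alone can’t express.

```python
from datetime import date, timedelta

# Hypothetical business rule, invented for this example.
GRACE_PERIOD_DAYS = 3


def is_overdue(due: date, today: date) -> bool:
    # Why: the (made-up) billing policy gives customers a short grace
    # period, so an invoice only counts as overdue once that has passed.
    return today > due + timedelta(days=GRACE_PERIOD_DAYS)
```

    If the grace period changes, only the constant changes; the comment stays true because it explains the reason rather than the arithmetic.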

    1. 10

      In my experience this approach is quite maintainable

      Which speaks to the real issue – comment rot. After my first 5 years as a contractor, I stopped reading “what” comments entirely – it was pointless, they were bundles of lies making code comprehension exponentially more complicated. After a few more years, I had editor toggles to set comments to background color to make them disappear.

      // Adds 18 to the specialValue
      specialValue += 5; // Adds 9 to the specialValue
      
      1. 2

        I kind of like that feature of heavily commented code, because it makes a nice canary to notice when people are writing patches without paying even bare-minimum attention to context. If someone’s written a patch, but not looked at enough context to even bother to update the comment immediately on the line above the code they changed, then that’s a big red flag to me, suggesting a lot more is likely to be wrong.

      2. 6

        In my experience, this is the best way to do it. I had to heavily document the project I worked on for my internship this past summer (because…internship, and I wouldn’t be there to explain what/why afterward). I basically had 3 levels of documentation:

        • High-level why stored in some markdown files (pulled in by jsdoc) and also available on the team’s wiki
        • Source code comments that explained why a specific piece of code did what it did. This was especially important in some hairy layout code that needed both what and why to make any sense of it. (For example: it can be pretty trivial to see that a piece of code aligns the x-coordinates of two elements. Why? Because this prevents a certain kind of nastiness in the layout)
        • Low-level what as the actual source code. Your usual grab-bag of sensible naming, abstraction where helpful, and componentization where possible.
      3. 9

        If it’s implementing a published algorithm, give the citation, and give the mapping of variable names from the paper to the code.
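
        As a rough illustration of the citation-plus-mapping idea, here is Welford’s one-pass variance method in Python. The M_k / S_k symbols follow the common textbook presentation of the method, and the code-side names are my own choice, not anything prescribed by the paper.

```python
def running_mean_and_variance(samples):
    """Return (mean, sample variance) of `samples` in a single pass.

    Source: B. P. Welford, "Note on a method for calculating corrected
    sums of squares and products", Technometrics 4(3), 1962, pp. 419-420.

    Symbol in the usual write-up -> name in this code (mapping is mine):
        k    -> count
        x_k  -> sample
        M_k  -> mean
        S_k  -> sum_sq_dev
    """
    count = 0
    mean = 0.0
    sum_sq_dev = 0.0
    for sample in samples:
        count += 1
        delta = sample - mean                  # x_k - M_{k-1}
        mean += delta / count                  # M_k
        sum_sq_dev += delta * (sample - mean)  # S_k
    if count < 2:
        raise ValueError("need at least two samples")
    return mean, sum_sq_dev / (count - 1)
```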

        Function and variable names should be in the 10 to 30 character range, as many words as needed to convey both the meaning of the values and where they came from. But no more words than needed, especially not any with lots of synonyms that would make it hard to remember which one I used.

        Describe invariants at the top of the function, always - except in a language with a strong enough type system that can express most of the invariants the program needs. In that scenario, go to extra effort to avoid exceptional functions that need textual description, because it could easily be missed.
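
        For a language without that kind of type system, a sketch of “invariants at the top” might look like this in Python. The function is a hypothetical example, and the docstring layout is just one possible convention.

```python
from typing import Sequence


def find_insert_index(sorted_values: Sequence[int], target: int) -> int:
    """Return the leftmost index at which `target` could be inserted.

    Invariants:
      * Precondition: `sorted_values` is in ascending order (not checked).
      * Loop: everything left of `low` is < target, and everything
        at or right of `high` is >= target.
      * Postcondition: 0 <= result <= len(sorted_values).
    """
    low, high = 0, len(sorted_values)
    while low < high:
        mid = (low + high) // 2
        if sorted_values[mid] < target:
            low = mid + 1
        else:
            high = mid
    return low
```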

        If there’s a long list of items, describe in a comment how they should be kept sorted, and any other steps needed when adding to it. But do try to refactor so there aren’t a lot of things that “need to agree” across files.
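
        One way this could look in Python, with a made-up word list: the comment states the ordering rule and the extra step, and a cheap assertion backs the comment up in case someone skips it.

```python
import bisect

# Keep sorted in ascending order: is_reserved() binary-searches this tuple.
# When adding a word, insert it in place rather than appending at the end.
RESERVED_WORDS = ("and", "else", "if", "not", "or", "while")

# Cheap guard so a mis-sorted edit fails at import time, not later.
assert list(RESERVED_WORDS) == sorted(RESERVED_WORDS)


def is_reserved(word: str) -> bool:
    """Return True if `word` is in RESERVED_WORDS."""
    i = bisect.bisect_left(RESERVED_WORDS, word)
    return i < len(RESERVED_WORDS) and RESERVED_WORDS[i] == word
```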

        For a given project, be consistent about the use or non-use of introductory text for each function. If it’s present, it should give a reader a clear idea of what parts of the code should call this function, and whether it takes responsibility for its own semantics or delegates them.

        For a project with more than about ten files, never omit a file introduction describing what motivates this file’s existence and why it’s not part of some other file.

        When there’s a long-running container at the top level of a file, like an #ifdef block or a namespace, make sure the scope terminator says what it terminates.

        If it needs a comment that isn’t for one of the above reasons, it’s probably doing something too complicated, and the comment will get out of date, so it should be refactored to not need it.

        In particular, your control flow should never be subtle enough to need a comment. There’s always a better way.

        If your variables lose their meaning before the end of the function, use a nested block so that’s obvious. Try to avoid needing this in JavaScript, where var declarations don’t have block scope. Don’t use a comment; it won’t help.

        1. 2

          When there’s a long-running container at the top level of a file, like an #ifdef block or a namespace, make sure the scope terminator says what it terminates.

          Everything above is excellent advice but THIS. THIS. A THOUSAND TIMES, THIS.

          1. 1

            Thanks :)

        2. 4

          With well-named functions/variables and a logical flow, comments can end up being pretty redundant. If it’s not clear what is happening and why, consider refactoring so it’s more clear (Granted, clarity may come at a cost of performance. Prioritize appropriately). If for whatever reason you cannot refactor it to be more straightforward, add comments. A common rule of thumb is to explain not what the code is doing, but why it is doing it.

          I personally don’t think it’s helpful to think of it as one side is right or wrong, or even more important. Comments and self-documenting code are both essential, but not necessarily at the cost of the other in all cases.

          1. 14

            Granted, clarity may come at a cost of performance. Prioritize appropriately

            To this point: unless you, your team, your boss and Knuth himself have all profiled the code in question and agree that performance is more important than clarity, pick clarity.

            99% of the time future you won’t be thinking “gee, I’m glad I squeezed those few extra cycles out here” - you’ll be saying “what in the fuck did I do that for?!”

          2. 3

            Commenting data structures is more important than the code. If you have good function names and halfway decent variable names, most code should generally be pretty readable.

            One thing people never talk about is the fact that comments go stale and can be wrong. Every comment you add is something that can go stale, and then it will be a negative, not a positive. So your comment may be good now, but you should think about how likely it is to stay good into the future.

            1. 1

              Commenting data structures is more important than the code.

              Yes! I also agree with Rob Pike that “data dominates” (sic) the design.

              When I write a new module, I usually put a long comment explaining the high-level design at the top of the file. More often than not, it tends to document data structures. Although there’s the fact that comments go stale and can be wrong.

            2. 3

              I usually write two types of comments:

              • Documentation for functions which are not immediately obvious or need some preconditions to the parameters (e.g. only accepts some range of integers)
              • Explanations for weird code that I know will make me say “WTF?!” if I read it after a couple of months (e.g. this needs to be this way because other component will break otherwise)

              Otherwise, like most here, I try to make code self-explanatory.

              1. 3

                I view comments as a last resort for code that I have failed to make self-documenting and failed to document via tests.

                1. 3

                  When writing Python, typically I will not start writing comments until I have 200+ lines of code. That’s my first checkpoint where I start looking for abstractions, breaking common chunks out into functions, and writing short docu-comments for those functions: one sentence summarizing the intent of the function, and sometimes a few more lines describing the details of the function’s arguments, return type, and what exceptions it might throw. The larger the Python project, the more time I invest in docu-comments. Programs with fewer than 50 lines I might not comment at all.
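
                  A small sketch of that kind of docu-comment. The function is hypothetical, and the Args/Returns/Raises headings are one common docstring layout rather than a requirement.

```python
def parse_port(text: str) -> int:
    """Convert a port number written as text into an int.

    Args:
        text: An integer in decimal, optionally surrounded by whitespace.

    Returns:
        The port as an int in the range 1..65535.

    Raises:
        ValueError: If `text` is not an integer or is out of range.
    """
    port = int(text.strip())  # int() itself raises ValueError on junk
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port
```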

                  In C++, the docu-comment convention is less strong, but I try to stick with it anyway – I do find documenting functions helps me keep things straight more than any other practice.

                  My rate of single-line comments is pretty constant. As I write code, if a line seems tricky or confusing to me I will write a short comment explaining the intent. I probably have about one of these comments per 15-40 lines of code. As C++ is more confusing than Python, I write more comments per intention. But C++ is also more verbose than Python, so my comment-to-lines of code ratio might be about the same.

                  1. 1

                    I tend to flesh it all out first, then start commenting and rewriting when my product is complete. Once I’m getting a desired output, I look at ways to improve my code by thinking about what I’m doing at every step of the process. This is based on preference and my personality – I’m better at ‘fixing things’ than thinking it all through before I even write the first character in my code.

                    1. 1

                      I find myself doing sort of the opposite on <200-line scripts, writing quite a lot of comments. Since in those kinds of scripts I haven’t done all the abstracting, breaking out into functions, etc., that you would in a more “properly” architected program, the comments instead serve that role of giving some structure. They delineate and gloss chunks of code, so I can easily scan the high-level logic without getting mired in the details of every line.

                      Not necessarily an example of Ideal Code, but here’s an excerpt from a random ~150-line Perl script I have lying around, to make this comment less abstract. At least to me, while the comments here aren’t strictly necessary, it would be less pleasant to come back to this script months later if I had left them out.

                      # first we grab everything in the rough bounding box, then filter
                      
                      # normal case: box doesn't cross the antimeridian
                      if ($lon_min > - pi and $lon_max < pi)
                      {
                         @box_pts = $index->get_items_in_rect($lat_min,$lon_min,$lat_max,$lon_max);
                      }
                      # other case: box does cross the antimeridian, so we need a box on each side
                      else
                      {
                         if ($lon_min <= - pi)  { $lon_min += 2*pi; }
                         elsif ($lon_max >= pi) { $lon_max -= 2*pi; }
                      
                         @box_pts =     $index->get_items_in_rect($lat_min,$lon_min,$lat_max,pi);
                         push @box_pts, $index->get_items_in_rect($lat_min,- pi,$lat_max,$lon_max);
                      }
                      
                      # now, filter with full distance test
                      foreach my $box_pt (@box_pts)
                      {
                         my $rad_dist = haversine_dist($lat,$lon,$box_pt->[1],$box_pt->[2]);
                         next unless ($rad_dist <= $rad_radius);
                         # etc.
                      }
                      
                    2. 3

                      Depends on the language, but I generally prefer code with copious comments. Especially so in C, where it often takes several lines of code to do even trivial things. If a block of 4 lines of code is doing something that can be summarized in a phrase, I find it much easier to scan the code if there’s a comment above the block with the phrase. In some languages this would be an indication that you should break out those 4 lines to a separate named function, but in C that’s not always practical/idiomatic.

                      1. 2

                        Yeah, I also use comments primarily for summarization so I can scan the file quickly and know what’s going on. As for breaking code into its own function, I find that having too many small functions which are each used only once can make it harder to understand what code is doing as I have to jump around more. That being said, if a function takes up more than a single page in my editor I’ll lean toward breaking it up because I like to come close as possible to understanding a function in one glance.

                      2. 1

                        I use function/action signatures combined with type names to minimize the number of comments I have to include.

                        So instead of makeOffer :: Int -> Int -> Either ErrReport Acquired I’d write makeOffer :: Amount -> Price -> Either ErrReport Acquired

                        where Amount, Price and Acquired are all type synonyms for

                        newtype PInt = PInt Int which has its own (abused) typeclass enforcing truncated subtraction, and a smart constructor ensuring that conversion from Int to PInt always yields a non-negative value. I believe the code for this is self-documenting.
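
                        For readers who don’t write Haskell, a rough Python analogue of the same idea. This is a sketch under assumptions, not the code described above: whether the smart constructor clamps or rejects negatives isn’t stated, so clamping is a guess, and total_cost is a hypothetical helper. Amount and Price are distinct subclasses here rather than synonyms, so a static type checker can flag swapped arguments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PInt:
    """Non-negative integer. Build one with PInt.from_int(), not PInt()."""

    value: int

    @classmethod
    def from_int(cls, n: int) -> "PInt":
        # Assumption: negative inputs clamp to zero. The original comment
        # does not say whether its smart constructor clamps or rejects.
        return cls(max(0, n))

    def __sub__(self, other: "PInt") -> "PInt":
        # Truncated subtraction: the result never goes below zero.
        return PInt(max(0, self.value - other.value))


class Amount(PInt):
    """How many units are being offered for."""


class Price(PInt):
    """Price per unit."""


def total_cost(amount: Amount, price: Price) -> PInt:
    """Hypothetical helper: the signature says which number is which."""
    return PInt(amount.value * price.value)
```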

                        1. 1

                          I always prefer to readily understand what the code is doing (and why) from reading it, but sometimes you must represent some external complexity (say, confusing business rules) with complex code, and in these cases I would prefer to have comments than nothing.

                          1. 1

                            I don’t think it’s so much that they won’t read the comments as it is that when the code changes, the comments often remain unchanged. It’s also extra and usually unnecessary work; when you need them, though, you need them.

                            1. 1

                              My preferred language is Haskell and I always give a type signature for every function. I generally try to keep functions under 10 lines if at all possible.

                              If I’m doing something “tricky” then I’ll put a comment in, because “self-documenting code” is often an excuse for not putting time into documentation. That said, with short functions and type signatures, it should only be maybe 1 function in 5 that requires a verbal comment.