1. 8
  1.  

  2. 5

    I like (and agree) with the sentiment, but the argument as presented is not convincing. I suspect that’s because it’s trying too hard to push the SCM product as opposed to talk about writing commits/checkins/whatever with the reviewers in mind.

    The case presented is not compelling because it’s just as plausible and possible to do in Git (and probably Mercurial too). Maybe PlasticSCM makes it easier? I’m not sure. Regardless, the point about squashing commits is weak since you could just as easily squash the entire series commits to the smaller series presented. Furthermore, there’s no reason the commit message on the single commit that touches over 100 files can’t be as descriptive as a small series of commits to help guide the reviewers.

    1. 3

      This is how the core Mercurial team works, btw. The unit of review is the commit, not the PR (which the core hg team doesn’t even really do).

      It produces commits that are each individually understandable, which is great because your log is actually readable and contains useful information:

      https://www.mercurial-scm.org/repo/hg/

      Look at how small commits tend to be, and look at how commit messages tend to explain just what this one change is doing. This also means that your commit history is now source-level documentation thanks to hg annotate/blame. The commit message is when your tools are forcing you to write something about your code, so you should take the opportunity to actually write something meaningful.

      A history that nobody takes time to write is one that nobody takes time to read either, and at that point, what you really wanted was an ftp server to host your code with the occasional rollback mechanism to undo bad uploads.

      1. 1

        Except for the advertising section, it’s pretty similar to what I ask for my team, that they commit per component or logical unit (altough they clearly aren’t listening, maybe I need to be more strict)

        They could also propose to use rebasing to transform the checkpoint form to the reviewer form, I undesrtand it could be used for that.

      2. 5

        I like the sentiment but think there are a couple of ways the essay could have developed it’s ideas further:

        1. One of the simplest ways to significantly improve the value of code review is to comment on your own PRs: not extensively, just two or three clarifications. We empirically know this reduces the number of defects that slips through, but nobody ever talks about it as a technique.

        2. The essay and (1) are partial answers to the same implicit problem: code review is supposed to be seen in a different viewpoint than code execution. But all of our tools only support seeing the “concrete” view of code. Trying to encode view information in the commit messages or PR comments are ultimately just hacks around that problem, and we can only “properly” solve it via better means of overlaying views on top of a codebase.

        1. 2

          I really like (1) and try to apply it often. Sometimes I know that someone will see my change, not knowing what was the purpose and not reading the original issue, and will let a comment because he lacks context or just trying to double check something.

          By commenting the “hot spots” I’m sure that my PR is easier to read and will help for a better review.

          1. 1

            I’d be interested in reading about (1)–do you have a reference I could look at?

            I’ve always felt like the diff should be readable without comments, but that opinion is not based on any real evidence.

            1. 2

              Best reference for this is here, where annotated code reviews had significantly fewer defects than reviews without them. The hypothesis is that it’s because writing annotations makes your oversights more obvious to you, kind of like how writers read prose aloud to see if it flows funny.

              1. 1

                Thanks, I’ve added it to my collection of notes on code reviews.

          2. 4

            Take a branch with only 3 small changes and it will get a whole lot of comments and suggestions. Take one with +100 changed files and it will get none.

            That’s a great example of bikeshedding.

            1. 2

              But it’s a real problem. Nobody reads long diffs. You want your code reviewed, don’t you? Then make it shorter. If it must be longer, then turn it into a lot of small diffs, each requiring no further or minimal context.

              1. 3

                Thankfully some people don’t buy into this self-defeating rhetoric and can and do read longer diffs when the changes required are longer. Constraining the length of a change works for some problems and under some conditions, but it’s not a fundamental good or even universally achievable.

                1. 4

                  It’s not a self-defeating rhetoric. It’s just hard to pay attention when your work is longer. It’s not because people are lazy or stupid or some other wrong thing. It’s because we’re humans and we are just not good at reading long and rambling bits of code that someone else wrote all at once. When every line of a giant hairball diff involves a context switch, nobody is going to read that, not even the original author.

                  I’m not saying you can’t make long changes. I am saying that you should split up your long changes. Split them as much as possible so each change has the least context possible to be understood. Books have sentences, paragraphs, chapters. Code has functions, modules, source files, repositories. Code review should use commits as the demarcation.

                  Move the effort of making the diff understandable to the writer, not the reader.

                  1. 1

                    Obviously if a change is rambling, it could probably stand to be improved – but something can be long without being rambling.

                    When every line of a giant hairball diff involves a context switch, nobody is going to read that, not even the original author.

                    I think you may be projecting a little. To stack my anecdote alongside yours, I have both read and written longer changes that required a lot of context to understand. I agree that it takes longer, which perhaps means I won’t get to do it all in one sitting – but I can take notes about my thoughts, as I would encourage all engineers to do, and I can pick up where I left off.

                    Split them as much as possible so each change has the least context possible to be understood.

                    I agree this can be beneficial when it’s possible, I just don’t think it always is. I’ve definitely seen people err too far on the side of microscopic changes. While the tiny change at issue may seem correct in isolation, the broader context is often actually very important and by avoiding understanding it you’re not going to give or get a very thorough review.

                    Code review, like designing and writing the code in the first place, and like testing it, takes time and energy. There’s just no magic bullet when the goal is thoughtful, consistent, and rigorous change to the software.

                    1. 2

                      The data is on the side of shorter reviews.

                      Our results suggest that review effectiveness decreases with the number of files in the change set. Therefore, we recommend that developers submit smaller and incremental changes whenever possible, in contrast to waiting for a large feature to be completed.

                      https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/bosu2015useful.pdf

                      Reviews should be of changes that are small, independent, and complete.

                      http://users.encs.concordia.ca/~pcr/paper/Rigby2012IEEE.pdf (based on data from https://users.encs.concordia.ca/~pcr/paper/Rigby2011Dissertation.pdf )

                      There is no large code change that cannot be split up into incremental, meaningful changes. It takes training to recognise those atomic boundaries, but they exist, and using them is helpful for reviewers.

                  2. 1

                    This is one of the things I really like about Go: the language designers explicitly design features to enable easier incremental changes.

              2. 1

                Very interesting points but way too much marketing on their products…

                It makes sense and I think it resonates well with people who had to go through (or reject) absurdly huge PR/MR’s.