Awesome post! This gist was a bit of feedback we got for pr.pico.sh and are halfway through the process of generating automatic range diffs when user pushes changes
This is a great article. Interdiffs are fantastic, and more people and tools should use them.
We introduced them to Review Board back in 2006. There wasn’t really much in the way of code review tools then, so we were kind of flying blind (and Git was new — we were largely supporting Perforce, Subversion, and some others), but interdiffs was something that just felt like such an amazing way to review code. I don’t know if the term “interdiffs” was in use back then or if we arrived at it independently, but we named it that and then saw new tools coming a few years later adopt the same term.
When I have to review code on GitHub, I miss this. I know many people just use fix-it commits, but I definitely agree that they’re just not a suitable alternative, at least not for everyone. My takes on this:
Fix-it commits obscure history. If you just look at fix-it commits, you don’t really know what changes are made against upstream vs. a previous commit. You lose a lot of context that’s helpful in providing a good review.
They’re a mess in the commit graph and make for more difficult reviews. If they remain when you push, that’s even worse. I’m all for keeping history around, but think there’s benefit in each push being a final product and not a series of “And then I tried this, and then I tried this, and then…”.
They don’t help enough with reviewers who are starting out at different parts of the review process. If you’ve been following along with a review, you may have already read all the code leading up to the latest change, and have gone through those fix-its, so you know how it’s evolved. If you’re coming in fresh part-way through the review process, you have to dig up all that history and try to figure out what the final iteration looks like, and that can lead to misunderstandings and confusion.
Not everyone uses Git or other multi-commit-capable SCMs. There are developers who work with Perforce (such as in gamedev or other areas where large binary files are a factor) or on specialized SCMs like Keysight SOS (which is used by chip manufacturers and has some unique features). Even Subversion’s still in use in plenty of companies (don’t get me started on CVS usage). So you really wouldn’t want to even try for fix-it commits here.
When you do interdiffs right, people can jump into the code review process at any point. Someone who started with the first draft of a change can follow along with each update without having to re-read everything. Someone coming in later can skip all the history of that draft change and start with the latest version up for review.
I’ve seen others say it’s an either-or thing, but if you can do interdiffs of commit trees, you can really do some neat things. Say I have a small project that’s self-contained but touches upon several areas of responsibility, but the context of those changes are important for understanding the whole, so I want them available in the same review request/pull request.
I post one review request broken up into N commits. Then I get feedback, make some changes, amend some of my existing commits with the fixes, and maybe add a new commit or two if I need to break something out. Reviewers see I have some new changes, and go to review. They can just look at the interdiff of each commit and of the commit tree itself, seeing what changes I’ve made to my existing commits and what new commits I’ve introduced. They don’t have to juggle the commits in their head, try to figure out what fix-it commit applies where, try to remember what a previous version of some commit looked like, etc.
GitHub should have probably added this long ago, but I’m always happy to see people discover interdiffs and bring them into their workflow. I think it’s one of the more powerful tools in code review.
Evolve adds markers when rewriting commits. Commits can obsolete commits in a straightforward one-to-one fashion (one commit replaces another), one-to-many (splitting a commit), or many-to-one fashion (squashing a commit). Once you have the obsolescence markers, it removes the guesswork of how to write interdiffs.
There have been plans of implementing evolve for git, but I don’t think any of them have gotten beyond the planning stage.
I’m someone who’s traditionally done painful things, like run Gentoo, for the fun of it, so I’m not stranger to self-inflicted pain, but this is too much for me. Maybe I just don’t understand, or am not disciplined in git workflows well enough.
I wouldn’t be doing it if I didn’t have a nice GUI — Fork — that makes interactive rebasing super easy, basically drag-and-drop. I use it constantly when I’m preparing a PR.
I think it must be something like that? I do this routinely, and it takes very little effort and time, particularly with the right aliases. (And I don’t mean anything crazy; just ri for git rebase -i, c for git commit, that kind of thing — enough to make the individual operations as painless and easy as ls or cd.)
Reviving this thread to point out that I’m now a believer in this, and it’s not bad. It’s worked well for the mono repo I work on. There’s a few warts, but some of that is GitHub, or standard “I made a git booboo.”
I’m surprised this post didn’t once mention jujutsu, felt like a sales pitch for it all along ;D
I do agree about GitHub’s review systems being pretty bad but the inertia behind it makes it annoying to introduce alternatives at work, people often just don’t want things to change, even for the better.
Wait, are you telling me that people are using GitHub PRs without squashing all the commits into one? Every project I’ve worked on follows this practice, and to me, it’s the only reasonable way to maintain a clean, linear history. Each commit references a PR, and in the end, every PR results in a single commit in the history.
Uh, yes? :) The not-very-happy but workable medium I’ve found is to do a PR with 3 commits (in the OP’s example), then do fixup! commits in response to GH comments, then rebase -i to put the fixups back into their proper commits before merging (fixup! makes the last step easy).
I’m also a fan of fixup! commits, I just wish GitHub would autosquash them on merge or rebase. There’s always a chance someone clicks the button and they end up in your final tree.
I’ve worked in multiple hundreds of projects on GitHub that don’t do that — it is by far the less popular option. Personally, I prefer to keep commit history around instead of squashing whole PRs. (It’d be nicer if the commits were well-rounded and logical per the article, which I generally strive for, but even when not I still find the history valuable.)
I think your experience there is unique. I don’t think most GH projects do that.
Squashing, in particular, trades off cleanliness for losing history. This can be detrimental if you need to bisect, or see the evolution of a PR. On the upside, there’s no need to manually rewrite history, and there shouldn’t be uncompilable/test-failing commits in the public history.
Nice, I like the interdiffs. Hoping GitHub (or more likely Forgejo) better supports code review cycles and rebasing workflows in the future.
GitHub’s lack of focus on these areas is really frustrating, but on balance I still prefer it to learning Gerrit.
My hope is the open source alternatives can innovate. Forgejo’s support for the rebase-then-merge strategy was already enough to win me over. “interdiffs” would really seal the deal.
This seems so much better, at least for the reviewer. But how do you do this in practice, as a committer? Every time some review is done, you make a new branch? What if you’ve pushed commits A->B->C and you yourself notice that A and C need minor fixups, do you then also make a new branch with A’->B->C’ ?
Rebase and push --force-with-lease the same branch. Gerrit has additional IDs in the commit message to match up commits from one version to the next.
There’s also Mercurial’s “changeset evolution” extension which adds better support for this workflow. Jujutsu is designed to support it as the primary way of working.
Ooh, so you do need a review tool like Gerrit to remember this info which is outside of git. Though it seems from https://git.eclipse.org/r/Documentation/user-changeid.html that one could even use this without Gerrit, then technically one could recreate the pre-force-push branch from reflog. (I’m wondering how much of this is possible to do without adding new tools.)
A github pull request based workflow on a shared central repository in a corporate environment
Linux kernel mailing list process
Gerrit in a corporate environment
Github pull requests are by far the most aesthetically pleasing and superficially nice experience, but I think that of the 3 options above, they are the most fundamentally flawed in terms of the workflow. As mentioned in the article, you end up adding little fixup commits to address code review comments. If you’re really careful and thorough, you can try to make sure that each fixup commit only relates to a single commit in the original PR and then you can rebase once the PR is approved and squash the fixup commits into the commit they were updating. This cleans up the long-term history of the repository a bit. Another thing that stinks about github PRs is that there is no way to really have a conversation about a commit message and then update the commit message cleanly. In my experience, if you start rebasing a branch during review everything gets messed up.
The Linux kernel mailing list process is sort of esoteric and beautiful, but a bit daunting to deal with. E-mail in general is much less relevant in people’s lives than it used to be. Most people communicate via some messaging service on their phone or using Slack, Teams, Discord, etc. So that leaves e-mail for automated communication from websites and an identity authentication system. Most people are using webmail (e.g. gmail) or mandated corporate client like Outlook and neither of these platforms really emphasize plaintext. I think the barrier to entry in terms of tools and configuration is a bit high in this model. Another issue is that conversations can explode and there isn’t any way to “resolve” a conversation like you can in a github PR so that it is hidden from focus.
Gerrit works pretty well, but I dislike the git push origin HEAD:refs/for/master type incantations because they feel a bit arcane and bolted on compared to vanilla git. Gerrit is also far less visually polished than github. It feels clunky and old. Reviewing code in Gerrit is pretty nice because it’s easy to get either a full picture of everything that has changed or to see how a given patch has evolved since the last round of review.
Awesome post! This gist was a bit of feedback we got for pr.pico.sh and are halfway through the process of generating automatic range diffs when user pushes changes
This is a great article. Interdiffs are fantastic, and more people and tools should use them.
We introduced them to Review Board back in 2006. There wasn’t really much in the way of code review tools then, so we were kind of flying blind (and Git was new — we were largely supporting Perforce, Subversion, and some others), but interdiffs was something that just felt like such an amazing way to review code. I don’t know if the term “interdiffs” was in use back then or if we arrived at it independently, but we named it that and then saw new tools coming a few years later adopt the same term.
When I have to review code on GitHub, I miss this. I know many people just use fix-it commits, but I definitely agree that they’re just not a suitable alternative, at least not for everyone. My takes on this:
Fix-it commits obscure history. If you just look at fix-it commits, you don’t really know what changes are made against upstream vs. a previous commit. You lose a lot of context that’s helpful in providing a good review.
They’re a mess in the commit graph and make for more difficult reviews. If they remain when you push, that’s even worse. I’m all for keeping history around, but think there’s benefit in each push being a final product and not a series of “And then I tried this, and then I tried this, and then…”.
They don’t help enough with reviewers who are starting out at different parts of the review process. If you’ve been following along with a review, you may have already read all the code leading up to the latest change, and have gone through those fix-its, so you know how it’s evolved. If you’re coming in fresh part-way through the review process, you have to dig up all that history and try to figure out what the final iteration looks like, and that can lead to misunderstandings and confusion.
Not everyone uses Git or other multi-commit-capable SCMs. There are developers who work with Perforce (such as in gamedev or other areas where large binary files are a factor) or on specialized SCMs like Keysight SOS (which is used by chip manufacturers and has some unique features). Even Subversion’s still in use in plenty of companies (don’t get me started on CVS usage). So you really wouldn’t want to even try for fix-it commits here.
When you do interdiffs right, people can jump into the code review process at any point. Someone who started with the first draft of a change can follow along with each update without having to re-read everything. Someone coming in later can skip all the history of that draft change and start with the latest version up for review.
I’ve seen others say it’s an either-or thing, but if you can do interdiffs of commit trees, you can really do some neat things. Say I have a small project that’s self-contained but touches upon several areas of responsibility, but the context of those changes are important for understanding the whole, so I want them available in the same review request/pull request.
I post one review request broken up into N commits. Then I get feedback, make some changes, amend some of my existing commits with the fixes, and maybe add a new commit or two if I need to break something out. Reviewers see I have some new changes, and go to review. They can just look at the interdiff of each commit and of the commit tree itself, seeing what changes I’ve made to my existing commits and what new commits I’ve introduced. They don’t have to juggle the commits in their head, try to figure out what fix-it commit applies where, try to remember what a previous version of some commit looked like, etc.
GitHub should have probably added this long ago, but I’m always happy to see people discover interdiffs and bring them into their workflow. I think it’s one of the more powerful tools in code review.
So Mercurial’s evolve system is specifically designed to allow this sort of code review.
https://www.mercurial-scm.org/doc/evolution/
Evolve adds markers when rewriting commits. Commits can obsolete commits in a straightforward one-to-one fashion (one commit replaces another), one-to-many (splitting a commit), or many-to-one fashion (squashing a commit). Once you have the obsolescence markers, it removes the guesswork of how to write interdiffs.
There have been plans of implementing evolve for git, but I don’t think any of them have gotten beyond the planning stage.
Jujutsu seems to be the most promising implementation of changeset evolution for git, tho it’s much more than that.
The author of this gist is on the jj team.
Is this blog post saying Gerritt is an interdiff code review system?
Yes.
I’ll copy my nitpick from the orange site:
There’s a “commits” tab at the top, like:
An example
Yes, and it’s useless when you have “fix it” commits. That’s the whole point here.
Not if the author uses rebasing to update each commit with fixes, then force-pushes the branch again, as advocated here. I do this.
I’m someone who’s traditionally done painful things, like run Gentoo, for the fun of it, so I’m not stranger to self-inflicted pain, but this is too much for me. Maybe I just don’t understand, or am not disciplined in git workflows well enough.
I wouldn’t be doing it if I didn’t have a nice GUI — Fork — that makes interactive rebasing super easy, basically drag-and-drop. I use it constantly when I’m preparing a PR.
I think it must be something like that? I do this routinely, and it takes very little effort and time, particularly with the right aliases. (And I don’t mean anything crazy; just
riforgit rebase -i,cforgit commit, that kind of thing — enough to make the individual operations as painless and easy aslsorcd.)Reviving this thread to point out that I’m now a believer in this, and it’s not bad. It’s worked well for the mono repo I work on. There’s a few warts, but some of that is GitHub, or standard “I made a git booboo.”
Nice, really glad and cool to hear!
I’m surprised this post didn’t once mention jujutsu, felt like a sales pitch for it all along ;D
I do agree about GitHub’s review systems being pretty bad but the inertia behind it makes it annoying to introduce alternatives at work, people often just don’t want things to change, even for the better.
The author is on the jj team, so they’re well aware. I think they were just trying to focus on the topic at hand, and there’s no jjhub yet :)
Wait, are you telling me that people are using GitHub PRs without squashing all the commits into one? Every project I’ve worked on follows this practice, and to me, it’s the only reasonable way to maintain a clean, linear history. Each commit references a PR, and in the end, every PR results in a single commit in the history.
Uh, yes? :) The not-very-happy but workable medium I’ve found is to do a PR with 3 commits (in the OP’s example), then do
fixup!commits in response to GH comments, thenrebase -ito put the fixups back into their proper commits before merging (fixup!makes the last step easy).I’m also a fan of
fixup!commits, I just wish GitHub would autosquash them on merge or rebase. There’s always a chance someone clicks the button and they end up in your final tree.I’ve worked in multiple hundreds of projects on GitHub that don’t do that — it is by far the less popular option. Personally, I prefer to keep commit history around instead of squashing whole PRs. (It’d be nicer if the commits were well-rounded and logical per the article, which I generally strive for, but even when not I still find the history valuable.)
I think your experience there is unique. I don’t think most GH projects do that.
Squashing, in particular, trades off cleanliness for losing history. This can be detrimental if you need to bisect, or see the evolution of a PR. On the upside, there’s no need to manually rewrite history, and there shouldn’t be uncompilable/test-failing commits in the public history.
Nice, I like the interdiffs. Hoping GitHub (or more likely Forgejo) better supports code review cycles and rebasing workflows in the future.
GitHub’s lack of focus on these areas is really frustrating, but on balance I still prefer it to learning Gerrit.
My hope is the open source alternatives can innovate. Forgejo’s support for the rebase-then-merge strategy was already enough to win me over. “interdiffs” would really seal the deal.
Phabricator is indeed RIP but there is a fork trying to keep it alive https://we.phorge.it/
This seems so much better, at least for the reviewer. But how do you do this in practice, as a committer? Every time some review is done, you make a new branch? What if you’ve pushed commits A->B->C and you yourself notice that A and C need minor fixups, do you then also make a new branch with A’->B->C’ ?
Rebase and push
--force-with-leasethe same branch. Gerrit has additional IDs in the commit message to match up commits from one version to the next.There’s also Mercurial’s “changeset evolution” extension which adds better support for this workflow. Jujutsu is designed to support it as the primary way of working.
Ooh, so you do need a review tool like Gerrit to remember this info which is outside of git. Though it seems from https://git.eclipse.org/r/Documentation/user-changeid.html that one could even use this without Gerrit, then technically one could recreate the pre-force-push branch from reflog. (I’m wondering how much of this is possible to do without adding new tools.)
In my understanding, the git team is considering adding a way for this kind of metadata to be stored in git itself, which will be nice.
That would be wonderful 🌈 🦄
In somewhat recent memory, I have used:
Github pull requests are by far the most aesthetically pleasing and superficially nice experience, but I think that of the 3 options above, they are the most fundamentally flawed in terms of the workflow. As mentioned in the article, you end up adding little fixup commits to address code review comments. If you’re really careful and thorough, you can try to make sure that each fixup commit only relates to a single commit in the original PR and then you can rebase once the PR is approved and squash the fixup commits into the commit they were updating. This cleans up the long-term history of the repository a bit. Another thing that stinks about github PRs is that there is no way to really have a conversation about a commit message and then update the commit message cleanly. In my experience, if you start rebasing a branch during review everything gets messed up.
The Linux kernel mailing list process is sort of esoteric and beautiful, but a bit daunting to deal with. E-mail in general is much less relevant in people’s lives than it used to be. Most people communicate via some messaging service on their phone or using Slack, Teams, Discord, etc. So that leaves e-mail for automated communication from websites and an identity authentication system. Most people are using webmail (e.g. gmail) or mandated corporate client like Outlook and neither of these platforms really emphasize plaintext. I think the barrier to entry in terms of tools and configuration is a bit high in this model. Another issue is that conversations can explode and there isn’t any way to “resolve” a conversation like you can in a github PR so that it is hidden from focus.
Gerrit works pretty well, but I dislike the
git push origin HEAD:refs/for/mastertype incantations because they feel a bit arcane and bolted on compared to vanilla git. Gerrit is also far less visually polished than github. It feels clunky and old. Reviewing code in Gerrit is pretty nice because it’s easy to get either a full picture of everything that has changed or to see how a given patch has evolved since the last round of review.Ht @steveklabnik
Hi! Glad this resonated with so many folks.