This is interesting, because it also comes down to how the reviewer reviews code. If the reviewer just looks at the changes in a PR as a whole, a lot of the value in careful commits becomes about the historical record than ease of reviewing. Still valuable, but there’s something to be said for learning to review code.
That was one of my motivations to write this in the first place. Now I can just send a link to the ones on my team.
I agree that how we write history is important on its own merits. You commit once, but it will be read a lot more. Both during review, but also as part of later debugging.
It’s something I try and stick to even if it takes a bit of time. git add -p <filename> is invaluable for this too.
When I see people reviewing things as a giant chunk of code, I feel that sometimes it comes from a history of commits not being very useful. When commits are more noise than signal, I don’t blame people for not looking at them.
Another thing is familiarity with rebasing. I think there’s a lot of talk about rebasing being difficult, which means people don’t attempt it because they’re scared to lose work, which is entirely reasonable, but we need to talk more about recovery techniques like the reflog, etc.
The simplest being to make a test branch on which to test. If all goes well, you can reset you branch to the test branch’s commit and proceed from there.
I don’t know if people understand the lightness and ease of branches, but it’s simpler than cherry-picking from reflog.
Git is excellent in how hard it is to truly fsck up, and people should learn these techniques and not consider them magic tricks.
Your motivation resonates with me; I’ve been writing something partially for my coworkers about my thoughts on commits and reviews, plus explaining some of my workflows… Part one is here: https://medium.com/p/c0966a562b10
I actually spend a lot of time on straightening out my branch’s commits before I submit a pull request. I use magit, and do a lot of manipulation after the fact (squashes, fixups, and changing the previous commit message), but even if the tooling didn’t make them so easy, the act of going through the commits and thinking hard about the whys and why nots that I put in commit messages has been helpful in me finding corner cases and other things that need to be fixed/double checked before somebody else reviews the code.
Another habit I’ve been building (but am constantly re-evaluating) is to add TODO: comments to the commit messages and addressing them (via code or an explanation in a commit message) before I submit my code.
git merge --squash is one of my key tools. Do a pile of work in a local branch, committing willy-nilly. Once the unit of work is done, merge it into a shared branch with the --squash flag.
As someone who’s been trying to get their team at work to do this, I couldn’t agree more.
A related thought I’ve had recently: there’s (in some ways at least) sort of of a continuum between code review and pair programming. As commits get increasingly fine-grained and well-described (in their commit messages), reviewing them and providing feedback on each starts to resemble the reviewer and the author collaboratively developing the code, only it’s more asynchronous (i.e. doesn’t require both people to be sitting in front of the same screen at the same time).
My take on pair programming is that it often reduces the number of errors during the pre-commit phase. I don’t think that pair programming removes any and all bugs because of its synchronous nature.
Doing code review asynchronous could go in any direction. If the reviewer isn’t focused at that particular time, it would lead to a sub-optimal review process. However, I would like to argue that a reviewer in the zone would be best suited to find any issues. Being asynchronous means that you have better time to understand the code and can test it yourself.
You’ll often get dragged in during a synchronous pair programming session. You hear your colleague’s train of thoughts and get excited about the task at hand. This is where I lose some of my ability to think twice about an idea.
The same could probably get archived by looking at the diff alone the day after a pair programming session. It might reveal a new perspective.
Where I work, we don’t review the commit history of the topic branch, because the PR gets squashed into one commit at the end anyway. We do put some work into writing a description when we submit the PR for review, but not the commit messages on the topic branch.
In my case, there are none or few valuable commit messages at first. I’m not that disciplined. That’s why I’ll do sub-par commit messages at first. When the entire feature is done, it’ll be a lot clearer how it evolved and what would be logical units of work.
I agree that if you get it right the first time around, a rebase would be of no use. That might be a skill I eventually get the hang of.
I guess for us it’s kind of easy because we develop Plastic SCM with Plastic SCM (which forces us to do branch per task development). More on that here:
I was actually going to write an article on this with “Commit Driven Development” or something similar title. The idea is to commit each successive code update in program/project so that while looking over the commit one can understand the evolution of it.
This help in two ways: 1. It shows how a simple program becomes large application/project and 2. Teach project development with an explanation so that one can follow to write a similar program to learn the development.
Another fun idea would be to make a base project with some basic features, and then give a group of students a few tasks to complete within the existing code base. After completing the work, the group could discuss how they have solved it and how that is reflected in the commit messages. There would probably be a lot of different takes on the assignment.
Exactly. I’ve tried to document a similar thing ( but not able to complete ). Here is the example, that I was writing but with little different context and then I thought about “commit driven dev..”
I know this link does not look good in terms of CSS.
usually a commit should be atomic, but there is a point where rearranging the history to achieve nice and small meaningful patches becomes prohibitively difficult: for large refactoring, especially with the API going back and forth or with interleaved cosmetic changes, rebase conflicts become extremely complex and take days to sort out. So, even though i am an atomic-commit evangelist, on rare occasions our team has agreed to merge huge code bombs instead of spending another month on creating intermediate refactorings that still pass all test suites.
I am wondering if commits added during code review should be “fixup” of the previous commit. Those fixup commits would be autosquashed before the branch gets merged.
It did feel “wrong” to recommend using --force-with-lease during the review process. Using git commit with either --fixup or --squash after code review would be best of both worlds.
And more importantly, the commit message should focus on why you made this change.
Actually the body of the message should concentrate on why but the summary (first line) is fine with what was done. See this detailed example.
This is also repeated in “A Note About Git Commit Messages”:
Also a somewhat controversial addition to the first line are emojis
https://gitmoji.carloscuesta.me/
To classify commits in addition to what imperative verbs allow.
Very true. I’ll try to make that a part of the workflow as well. It will be interesting to see how my teammates will respond to this.
This is interesting, because it also comes down to how the reviewer reviews code. If the reviewer just looks at the changes in a PR as a whole, a lot of the value in careful commits becomes about the historical record than ease of reviewing. Still valuable, but there’s something to be said for learning to review code.
That was one of my motivations to write this in the first place. Now I can just send a link to the ones on my team.
I agree that how we write history is important on its own merits. You commit once, but it will be read a lot more. Both during review, but also as part of later debugging.
It’s something I try and stick to even if it takes a bit of time.
git add -p <filename>
is invaluable for this too.When I see people reviewing things as a giant chunk of code, I feel that sometimes it comes from a history of commits not being very useful. When commits are more noise than signal, I don’t blame people for not looking at them.
Another thing is familiarity with rebasing. I think there’s a lot of talk about rebasing being difficult, which means people don’t attempt it because they’re scared to lose work, which is entirely reasonable, but we need to talk more about recovery techniques like the reflog, etc.
Magit makes it painless.
The simplest being to make a test branch on which to test. If all goes well, you can reset you branch to the test branch’s commit and proceed from there.
I don’t know if people understand the lightness and ease of branches, but it’s simpler than cherry-picking from reflog.
Git is excellent in how hard it is to truly fsck up, and people should learn these techniques and not consider them magic tricks.
Your motivation resonates with me; I’ve been writing something partially for my coworkers about my thoughts on commits and reviews, plus explaining some of my workflows… Part one is here: https://medium.com/p/c0966a562b10
I actually spend a lot of time on straightening out my branch’s commits before I submit a pull request. I use magit, and do a lot of manipulation after the fact (squashes, fixups, and changing the previous commit message), but even if the tooling didn’t make them so easy, the act of going through the commits and thinking hard about the whys and why nots that I put in commit messages has been helpful in me finding corner cases and other things that need to be fixed/double checked before somebody else reviews the code.
Another habit I’ve been building (but am constantly re-evaluating) is to add
TODO:
comments to the commit messages and addressing them (via code or an explanation in a commit message) before I submit my code.git merge --squash
is one of my key tools. Do a pile of work in a local branch, committing willy-nilly. Once the unit of work is done, merge it into a shared branch with the--squash
flag.As someone who’s been trying to get their team at work to do this, I couldn’t agree more.
A related thought I’ve had recently: there’s (in some ways at least) sort of of a continuum between code review and pair programming. As commits get increasingly fine-grained and well-described (in their commit messages), reviewing them and providing feedback on each starts to resemble the reviewer and the author collaboratively developing the code, only it’s more asynchronous (i.e. doesn’t require both people to be sitting in front of the same screen at the same time).
My take on pair programming is that it often reduces the number of errors during the pre-commit phase. I don’t think that pair programming removes any and all bugs because of its synchronous nature.
Doing code review asynchronous could go in any direction. If the reviewer isn’t focused at that particular time, it would lead to a sub-optimal review process. However, I would like to argue that a reviewer in the zone would be best suited to find any issues. Being asynchronous means that you have better time to understand the code and can test it yourself.
You’ll often get dragged in during a synchronous pair programming session. You hear your colleague’s train of thoughts and get excited about the task at hand. This is where I lose some of my ability to think twice about an idea.
The same could probably get archived by looking at the diff alone the day after a pair programming session. It might reveal a new perspective.
Where I work, we don’t review the commit history of the topic branch, because the PR gets squashed into one commit at the end anyway. We do put some work into writing a description when we submit the PR for review, but not the commit messages on the topic branch.
Despite having the same goal as the author we argue against squashing and interactive rebasing here:
http://blog.plasticscm.com/2018/10/checkin-with-reviewers-in-mind-how-to-fix-pull-requests.html
To quote the link:
In my case, there are none or few valuable commit messages at first. I’m not that disciplined. That’s why I’ll do sub-par commit messages at first. When the entire feature is done, it’ll be a lot clearer how it evolved and what would be logical units of work.
I agree that if you get it right the first time around, a rebase would be of no use. That might be a skill I eventually get the hang of.
I guess for us it’s kind of easy because we develop Plastic SCM with Plastic SCM (which forces us to do branch per task development). More on that here:
http://blog.plasticscm.com/2016/06/plastic-vs-git-rerere-and-rebase.html
I was actually going to write an article on this with “Commit Driven Development” or something similar title. The idea is to commit each successive code update in program/project so that while looking over the commit one can understand the evolution of it.
This help in two ways: 1. It shows how a simple program becomes large application/project and 2. Teach project development with an explanation so that one can follow to write a similar program to learn the development.
Another fun idea would be to make a base project with some basic features, and then give a group of students a few tasks to complete within the existing code base. After completing the work, the group could discuss how they have solved it and how that is reflected in the commit messages. There would probably be a lot of different takes on the assignment.
Learning outcomes would include:
Exactly. I’ve tried to document a similar thing ( but not able to complete ). Here is the example, that I was writing but with little different context and then I thought about “commit driven dev..”
I know this link does not look good in terms of CSS.
usually a commit should be atomic, but there is a point where rearranging the history to achieve nice and small meaningful patches becomes prohibitively difficult: for large refactoring, especially with the API going back and forth or with interleaved cosmetic changes, rebase conflicts become extremely complex and take days to sort out. So, even though i am an atomic-commit evangelist, on rare occasions our team has agreed to merge huge code bombs instead of spending another month on creating intermediate refactorings that still pass all test suites.
I am wondering if commits added during code review should be “fixup” of the previous commit. Those fixup commits would be autosquashed before the branch gets merged.
What do you think about this?
Great idea! Thank you for sharing :)
It did feel “wrong” to recommend using
--force-with-lease
during the review process. Usinggit commit
with either--fixup
or--squash
after code review would be best of both worlds.Here’s a great talk on this exact topic https://www.youtube.com/watch?v=qpdYRPL3SVE