1. 30

  2. 10

    code review often results in me having to amend or squash my commit(s).

    Why? What is wrong with fixing whatever needs to be fixed in new commits?

    Sure, amend, squash, modify before you push, but after that, don’t, and you avoid a whole class of problems.

    You might argue that the history will look “messy”, yes, perhaps, but it also reflects what actually happened, which can be a good thing.

    1. 19

      git history should tell a story, i don’t want to see your typos, unless it’s in the main branch, then it’s written in stone

      1. 3

        I don’t see why. VC history can be an almost arbitrary mess!

        The thing which really matters is that you get your job done.

        As long as you have a decent way to

        1. find semantically connected commits (e.g. look at the merge of a PR, or a ticket ID in the commit messages) and
        2. find out who to ask when you have questions about some code (some version of blame)

        you should be good. At least, that is all I ever needed from a VCS. I would be interested in hearing about other use-cases, though.

        In general, people are wasting too much time these days cleaning up their commit history.

        1. 5

          as somebody regularly doing code archeology in a project that is now 16 years old and has gone through migrations from CVS to SVN to git, to git with people knowing how to rebase for readable histories, I can tell you that doing archeology in nice single-purpose commits is much nicer than doing archeology within messy commits.

          So I guess it depends. If the project you’re working on is a one-off, possibly rewritten or sunset within one or two years, sure, history doesn’t matter.

          But if your project sticks around for the long haul, you will thank yourself for not committing typo fixes and other cleanup commits.

          1. 4

            it CAN be, but that’s what we’re trying to avoid
            you can get your job done either way, and cleaning up git history doesn’t take a lot of time if you think properly from the beginning. Any additional time i do spend can easily be easily justified by arguing for better documented changes.

            1. sure
            2. you should not have to ask anyone, some aggregation of context, commit messages, and comments should answer any questions you have

            having a mistake you introduced, as well as the fix for that mistake in the same branch before merging into a main branch is just clutter… unnecessary cognitive load. If you use git blame properly, it’s yet another hoop you have to jump through and try to find out the real reason behind a change. Now, there are exceptions. Sometimes i do introduce a new feature with a problem in a branch, and happen to discover it and fix it in the same branch (usually it’s because the branch is too long lived, which is a bad thing). I do, sometimes, decide that this conclusion is important to the narrative, and decide to leave it in.

            1. 2

              I mean…I would agree in principle except for “cleaning up git history doesn’t take a lot of time”. I think that is only true if you have already invested a lot of time into coming up with your branching and merging and squashing model and another lot of time figuring out how to implement it with your tools.

              I have probably more cognitive overhead from reading blog posts on to-squash-or-not-squash et al. than I could get from ignoring “fix typo” commits in a lifetime. ;)

              1. 4

                “cleaning up source history” is such an ingrained part of my work flow, that seeing you dismiss it because it’s too costly reads similarly to me was, “I don’t have time to make my code comprehensible by using good naming, structure and writing good docs.” Which you absolutely could justify by simply saying, “all that matters is that you get the job done.” Maybe. But I’d push back and say: what if a cleaner history and cleaner code makes it easier to continue doing your job? Or easier for others to help you do the job?

                FWIW, GitHub made this a lot easier with their PR workflow by adding the “squash and merge” option with the ability to edit the commit message. Otherwise, yes, I’ll checkout their branch locally, do fixups and clean up commit history if necessary.

                1. 1

                  I could make that argument. But I didn’t because it is not the same thing.

                  This is exactly why I gave examples and asked for more! I haven’t found any use for a clean commit history. And - also answering @pilif here - this includes a medium sized (couple million lines of code), 30 year old project that had been rewritten in different languages twice and the code base at that time consisted of 5 different languages.

                  (The fact that cleaning up history is such an ingrained part of your work flow doesn’t necessarily mean anything good. You might also just be used to it and wasting your time. You could argue that it is so easy that it’s worth doing even if there is no benefit. Maybe that’s true. Doesn’t seem like it to me at this point.)

                  1. 5

                    But I didn’t because it is not the same thing.

                    Sure, that’s why I didn’t say they weren’t the same. I said they were similar. And I said they were similar precisely because I figured that if I said they were the same, someone would harp on that word choice and point out some way in which they aren’t the same that I didn’t think of. So I hedged and just said “similar.” Because ultimately, both things are done in the service of making interaction with the code in the future easier.

                    This is exactly why I gave examples and asked for more! I haven’t found any use for a clean commit history.

                    I guess it seems obvious to me. And it’s especially surprising that you literally haven’t found any use for it, despite people listing some of its advantages in this very thread! So I wonder whether you’ll even see my examples as valid. But I’ll give a try:

                    • I frequently make use of my clean commit history to write changelogs during releases. I try to keep the changelog up to date, but it’s never in sync 100%, so I end up needing to go through commit history to write the release notes. If the commit history has a bunch of fixup commits, then this process is much more annoying.
                    • Commit messages often serve as an excellent place to explain why a change was made. This is good not only for me, but to be able to point others to it as well. Reading commit messages is a fairly routine part of my workflow: 1) look at code, 2) wonder why it’s written that way, 3) do git blame, 4) look at commit that introduced it. Projects that don’t treat code history well often result in disappointing conclusion to this process.
                    • A culture of fixup commits means that git bisect is less likely to work well. If there are a lot of fixup commits, it’s more likely that any given commit won’t build or pass tests. This means that commit likely needs to be skipped while running git bisect. One or two of these isn’t the end of the world, but if there are a lot of them, it gets annoying and makes using git bisect harder because it can’t narrow down where the problem is as precisely.
                    • It helps with code review enormously, especially in a team environment. At $work, we have guidelines for commit history. Things like, “separate refactoring and new functionality into distinct commits” make it much easier to review pull requests. You could make the argument that such things should be in distinct PRs, but that creates a lot more overhead than just getting the commit history into a clean state. Especially if you orient your workflow with that in mind. (If you did all of your work in a single commit and then tried to split it up afterwards, that could indeed be quite annoying!) In general, our ultimate guideline is that the commits should tell a story. This helps reviewers contextualize why changes are being made and makes reviewing code more efficient.

                    (The fact that cleaning up history is such an ingrained part of your work flow doesn’t necessarily mean anything good. You might also just be used to it and wasting your time. You could argue that it is so easy that it’s worth doing even if there is no benefit. Maybe that’s true. Doesn’t seem like it to me at this point.)

                    Well, the charitable interpretation would be that I do it because I find it to be a productive use of my time. Just like I find making code comprehensible to be a good use of my time.

                    And no, clean source history of course requires dedicated effort toward that end. Just like writing “clean” code does. Neither of these things come for free. I and others do them because there is value to be had from doing it.

                    1. 1

                      Thanks, this is more useful for discussing. So from my experience (in the same order):

                      1. I could see this as being useful. I simply always used the project roadmap + issue tracker for that.
                      2. Absolutely, I wasn’t trying to argue against good commit messages.
                      3. I understand that fix-up commits can be a bit annoying in this respect so if you can easily avoid them you probably should. On the other hand I need git bisect only very rarely and fix-up commit are often trivial to identify and ignore. Either by assuming they doesn’t exist or by ignoring the initial faulty commit.
                      4. I am totally in favor of having refactoring and actual work in separate commits. Refactorings are total clutter. Splitting a commit which has both is a total pain (unless I am missing something) so it’s more important to put them into separate commits from the start.

                      I mean, maybe this is just too colored by how difficult I imagine the process to be. These arguments just seem too weak in comparison to the cognitive burden of knowing all the git voodoo to clean up the history. Of course if you already know git well enough that trade-off looks different.

                      1. 1

                        The git voodoo isn’t that bad. It does take some learning, but it’s not crazy. Mostly it’s a matter of mastering git rebase -i and the various squash/fixup/reword/edit options. Most people on my team at work didn’t have this mastered coming in, but since we have a culture of it, it was easy to have another team member hop in and help when someone got stuck.

                        The only extra tooling I personally use is git absorb, which automates the process of generating fixup commits and choosing which commits to squash them back into automatically. I generally don’t recommend using this tool unless you’ve already mastered the git rebase -i process. Like git itself, git absorb is a convenient tool but provides a leaky abstraction. So if the tool fails, you really need to know how to git rebase yourself to success.

                        It sounds painful, but once you have rebase mastered, it’s not. Most of my effort towards clean source history is spent on writing good commit messages, and not the drudgery of making git do what I want.

                        It sounds like we are in some agreement on what’s valuable, so perhaps we were just thinking of different things when thinking about “clean source history.”

                        Splitting a commit which has both is a total pain (unless I am missing something) so it’s more important to put them into separate commits from the start.

                        Indeed, it is. Often because the code needs to be modified in a certain way to make it work. That’s why our commit history guidelines are just guidelines. If someone decided it was more convenient to just combine refactoring and semantic changes together, or maybe they just didn’t plan that well, then we don’t make them go back and fix it. If it’s easy to, sure, go ahead. But don’t kill yourself over it.

                        The important bit is that our culture and guidelines gravitate toward clean history. But just like clean code, we don’t prioritize it to the expense of all else. I doubt few others who promote clean code/history do either.

                        N.B. When I say “clean code,” I am not referring to Bob Martin’s “Clean Code” philosophy. But rather, just the general subjective valuation of what makes code nice to maintain and easy to read.

        2. 5

          For a stacked diff flow, this is necessary https://jg.gg/2018/09/29/stacked-diffs-versus-pull-requests/

          1. 4

            If you are just going to duplicate your original commit message for the new work, why not amend the original commit? Branches are yours to do with as you please until someone else may have grabbed the commit.

            1. 2

              Sure, amend, squash, modify before you push

              It’s not about push. It’s about sharing. I push to many branches that aren’t being or intended to be shared with others. Thus, it’s okay to rewrite history and force push in those cases.

            2. 1

              This is the default force push method in Fork.

              1. 2

                What’s Fork?

                1. 2
              2. 1

                As I’ve written about in my own post about this there are caveats to be careful about - you need to add an =$ref to actually prevent overwriting, if you git fetch often