1. 16
  1.  

  2. 25

    Merging instead of rebasing doesn’t save you from creating a bad merge commit either, with or without merge conflicts. Whether you rebase or you merge, the final commit on top (i.e. the final snapshot of your files) will be in the same state. You’d get the same merge conflicts whether you rebase or you merge (possibly in a different order). It’s still your responsibility either way to make sure this commit is semantically correct – git doesn’t know your programming language and line-oriented diffs, whether by merging or by rebasing, can be wrong.

    The article is also making the case that having the merge commit instead indicates that a whole batch of commits introduced a bug. You can just look at the merge commits and know which merge commit was the problem. This still doesn’t indicate which commit in the batch was buggy. So you have the same problem, except it’s more swept under the rug. Throwing out a whole series of commits because of one bad commit in the batch seems like throwing the baby out with the bathwater to me.

    1. 12

      Another reason I stay away from merge commits is the train-wreck graph of history they produce. Try running git log --graph --oneline on any Google project (Chromium, etc.) and sorting out the history visually. Oftentimes the tracks completely fill up my terminal and I have to scroll to the right to see what the commits are.

      1. 7

        Merging instead of rebasing doesn’t save you from creating a bad merge commit either, with or without merge conflicts. Whether you rebase or you merge, the final commit on top (i.e. the final snapshot of your files) will be in the same state.

        That’s correct but it does make the merge point more obvious and explicit, which if the author is to be believed, makes it easier to untangle subtle errors of the type under discussion.

        I don’t buy it though - I’ve been using Git as a release engineer for years and as an IC for years more and I can think of maybe 1 instance where such a bug was introduced but not immediately caught by the developer doing the merging.

        That said one person’s experience does not make a thing true, so I’d be curious as to whether others have been bitten hard by this kind of subtle rebase induced bug?

        1. 4

          I’ve had similar problems. Write some code, thoroughly test it, merge and commit. Later a problem is discovered. Didn’t I test for this? Unfortunately it’s not possible to recreate the exact artifact that was previously tested.

          Features interfere in complex ways. After rebase you can no longer untangle this feature from all the features in its new base.

          1. 3

            Unfortunately it’s not possible to recreate the exact artifact that was previously tested.

            This in no way contradicts your point, which I appreciate you chiming in with - but best practice with Git whenever you want to freeze a point in time is to use tags.
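
            As a minimal sketch of that practice (the throwaway repository, the file, and the tag name tested-build are all made up for illustration), an annotated tag keeps pointing at the exact commit that was tested even after the branch moves on:

            ```shell
            set -eu
            # Throwaway repository; every name below is hypothetical.
            repo=$(mktemp -d)
            cd "$repo"
            git init -q
            git config user.name "Example Dev"
            git config user.email "dev@example.com"

            echo "v1" > app.txt
            git add app.txt
            git commit -q -m "Feature as tested"

            # Freeze the tested state with an annotated tag.
            git tag -a tested-build -m "Commit that passed testing"

            # History moves on after the tag is created.
            echo "v2" > app.txt
            git commit -q -am "Later change"

            # The tag still resolves to the tested snapshot.
            tested_content=$(git show tested-build:app.txt)
            echo "content at tag: $tested_content"
            ```

            Checking out the tag later gives back those files, though a tag by itself does not capture build tooling or dependencies that live outside the repository.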

          2. 0

            My experience has been very close to this author’s. I’ve also run into issues with git blame when a feature is rebased, effectively hiding the true author of the code. Over my many years of git use in various sized organizations I’ve come to join Paul Stadig’s philosophy of “Thou Shalt Not Lie”.

            It’s easy to customize a git workflow to speed along deployments and releases. Git’s first and most important role is the safety-net. Keeping your history accurate is the safest way to keep that safety-net strong.

            If you have challenges with git log when you have many branches, there are plenty of tools to help visualize that. Monorepos also make all of this much more complicated (I’m not a fan).

            1. 8

              I’ve also run into issues with git blame when a feature is rebased, effectively hiding the true author of the code

              Rebasing a feature branch doesn’t hide its author.

              1. 0

                It can when the history is rewritten by another user, especially when squashing. Squashing commits leaves only the author of the base, hiding both the other contributors and the identity of the squasher.

                1. 2

                  Nope. Squashing a feature branch on a base only allows you to squash the commits of the feature branch into one another. It doesn’t allow you to squash them into the base commit. To lose commit authorship information, you’d need to very deliberately go outside normal rebasing commands like git rebase master.

                  1. 1

                    Squashing multiple commits into one is exactly what I’m describing. If a feature branch has commits from multiple authors then when it is squashed to one commit only one author is listed. Whatever the “base” is that you’ve squashed down to is the remaining author unless you override it.

                    More importantly, the purpose of my comment is to answer the above comment’s question “… I’d be curious as to whether others have been bitten hard by this kind of subtle rebase induced bug?” My answer is yes, and now I avoid the situation entirely by the choice to merge and keep accurate history.
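
                    To make the disagreement concrete, here is a minimal sketch of one common way to squash, git merge --squash, in a throwaway repository. The names alice, bob, and carol and the git_as helper are hypothetical. As far as I know, squashing inside an interactive rebase behaves differently and keeps the author of the first commit in the squashed group, so this covers only the merge --squash case:

                    ```shell
                    set -eu
                    repo=$(mktemp -d)
                    cd "$repo"
                    git init -q
                    git symbolic-ref HEAD refs/heads/main

                    # Hypothetical helper: run a git command as the named person.
                    git_as() {
                      name=$1
                      shift
                      git -c user.name="$name" -c user.email="$name@example.com" "$@"
                    }

                    echo base > file.txt
                    git add file.txt
                    git_as carol commit -q -m "Base commit"

                    # Two different people commit to the feature branch.
                    git checkout -q -b feature
                    echo "alice's work" >> file.txt
                    git_as alice commit -q -am "Alice's change"
                    echo "bob's work" >> file.txt
                    git_as bob commit -q -am "Bob's change"

                    # A third person squash-merges the branch into main.
                    git checkout -q main
                    git merge -q --squash feature
                    git_as carol commit -q -m "Add feature (squashed)"

                    squash_author=$(git log -1 --format=%an)
                    branch_authors=$(git log --format=%an main..feature | sort | tr '\n' ' ')
                    echo "feature branch authors: $branch_authors"
                    echo "squashed commit author: $squash_author"
                    ```

                    In this sketch the squashed commit on main lists only the person who made it, while the original authors remain visible only on the unmerged feature branch.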

                    1. 1

                      If a feature branch has commits from multiple authors

                      Is this a regular occurrence? It certainly isn’t in any team I’ve ever worked in. If you regularly have to deal with feature branches where multiple people are committing, that points to a different issue: the team isn’t breaking up their work properly into discrete chunks.

                      the purpose of my comment is to answer the above comment’s question “… I’d be curious as to whether others have been bitten hard by this kind of subtle rebase induced bug?”

                      I think you really answered a different question; imho the original question was about when a bug might have been introduced, not who might have committed it:

                      it does make the merge point more obvious and explicit, which if the author is to be believed, makes it easier to untangle subtle errors of the type under discussion.

                      1. 1

                        Is this a regular occurrence? It certainly isn’t in any team I’ve ever worked in.

                        We all work in different environments and on different types of projects. Yes, this happens quite frequently in my industry.

                        The awesome power of git is that we can do things in different ways to meet our individual needs. There doesn’t have to be a single “right” way.

        2. 21

          I scrolled until I saw this. Would have been nicer to just put this at the top so that folks can stop reading sooner. If you’re going to tell the people you disagree with that they’re being unreasonable, then why am I going to listen to you?

          I’ve come to the conclusion that it’s about vanity. Rebasing is a purely aesthetic operation. The apparently clean history appeals to us as developers, but it can’t be justified, from a technical nor functional standpoint.

          Personally, I regularly benefit from clean history. Everything from writing release notes, to bisecting, to linking to coherent, reasonably self-contained changes that can be easy(ier) to review.

          I mean, like, I guess our experiences are just so different that they can’t be reconciled.

          1. 1

            I think that this statement is potentially not expressing enough nuance. You can certainly have a “clean history” that provides the benefits of making release notes, bisecting, etc. while merging. Clean history is a fairly vague term in my experience reading these things anyway.

            I don’t think this article makes a sufficiently convincing argument however. I think the real difficulty the author encountered around a bad rebase is a more interesting avenue of discussion to explore.

            1. 4

              You can certainly have a “clean history” that provides the benefits of making release notes, bisecting, etc. while merging.

              I truthfully do not see any practical way this can be achieved if git rebase (or its derivatives) is not allowed to be used at all. You wind up needing to merge master back into your branch, which destroys clean history.

              Another variant is to git rebase branches, but use merge commits to bring them into master. It is plausible to get fairly clean history this way, but it still requires using git rebase on branches whenever there’s a conflict with master.

              From reading the OP’s article, it sounds like they are completely against git rebase at all levels. They even acknowledge that this gives up clean history. The problem I have with the OP is them saying that folks only want clean history because of their “vanity,” which is just a big steaming pile of bullshit. If you aren’t going to bother at least trying to accurately characterize your opponent’s position, then don’t bother writing a persuasive essay in the first place.

              “Clean history” to me is less about a specific end state and more about a process. It means that the structure of commits in your repository gets as much care and attention as the structure of your code. Clean history means that commits aren’t just about putting your code into source control. Clean history means that one endeavors to logically structure commits in a way that can be read and understood by humans at a later date. Using git rebase somewhere is a necessary but not sufficient technique for achieving this with git. Notice that the OP criticizes use of git rebase specifically in scenarios where “clean history” as a methodical process doesn’t appear to be adhered to.

              “Clean history” is just as vague as “clean code,” but we all know it when we see it.

              1. 1

                I’m going to jump around here a little.

                “Clean history” to me is less about a specific end state and more about a process. It means that the structure of commits in your repository gets as much care and attention as the structure of your code. Clean history means that commits aren’t just about putting your code into source control. Clean history means that one endeavors to logically structure commits in a way that can be read and understood by humans at a later date.

                I strongly agree with you here. This is what I am aiming for.

                “Clean history” is just as vague as “clean code,” but we all know it when we see it.

                I don’t agree with you here: while I think I have a clear idea in my mind, there’s enough conflict around what “clean” ends up meaning. I think it’s better to describe goals, for example:

                • Humans are aided in review by the shape and size of the commit
                • Blame provides meaningful context to code
                • Bisect is functional
                • Easily get a high-level overview of what’s changed for producing release notes

                These are the ones currently in my brain, so it’s what we’re aspiring to. I think others include some sense of a “flat” history here, but I think that leans towards aesthetics (or lacking knowledge of a flag or two in git to counterbalance the merges).

                I truthfully do not see any practical way this can be achieved if git rebase (or its derivatives) is not allowed to be used at all. You wind up needing to merge master back into your branch, which destroys clean history.

                I’m not sure I understand your point here, and that potentially points at a discrepancy between our definitions of clean history. I agree that seeing regular merge commits from master into a feature branch creates confusing histories; man gitworkflows actually has a section on this:

                   Example 3. Merge to downstream only at well-defined points
                
                   Do not merge to downstream except with a good reason: upstream API
                   changes affect your branch; your branch no longer merges to upstream
                   cleanly; etc.
                
                   Otherwise, the topic that was merged to suddenly contains more than a
                   single (well-separated) change. The many resulting small merges will
                   greatly clutter up history. Anyone who later investigates the history
                   of a file will have to find out whether that merge affected the topic
                   in development. An upstream might even inadvertently be merged into a
                   "more stable" branch. And so on.
                

                However, you can just merge into master from the topic branch later, you do not need to rebase before doing so, and you can resolve the conflicts as part of the merge commit. One upside of this approach is that git bisect is able to distinguish between bad integration and bad code, as the bisect will point at the merge commit rather than the commit which introduced the changes. Another upside is that you can revert whole features by reverting the merge commit. You can still produce release notes by using git log --first-parent. The commits are still broken up to be useful for reviewing and git blame.

                That’s how I can create what matters to me out of merge commits (and gain a few extra features like reverting whole topics easily). But maybe something else matters to you? I’d be curious to hear if my list misses something.
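
                A minimal sketch of that workflow in a throwaway repository (branch, file, and commit names are made up for illustration): the topic is merged with a real merge commit, git log --first-parent gives the high-level view, and reverting the merge removes the whole topic at once.

                ```shell
                set -eu
                repo=$(mktemp -d)
                cd "$repo"
                git init -q
                git symbolic-ref HEAD refs/heads/main
                git config user.name "Example Dev"
                git config user.email "dev@example.com"

                echo base > base.txt
                git add base.txt
                git commit -q -m "Base"

                # Topic branch with two small, reviewable commits.
                git checkout -q -b topic
                echo one > feature.txt
                git add feature.txt
                git commit -q -m "Topic: step one"
                echo two >> feature.txt
                git commit -q -am "Topic: step two"

                # Integrate with an explicit merge commit, no rebase involved.
                git checkout -q main
                git merge -q --no-ff -m "Merge topic: add feature" topic

                # High-level view: only commits made directly on main, merges included.
                first_parent=$(git log --first-parent --format=%s)
                echo "$first_parent"

                # Revert the whole topic by reverting the merge against its first parent.
                git revert --no-edit -m 1 HEAD > /dev/null
                ```

                After the revert, the file added by the topic is gone from main, while the individual topic commits stay in history for blame and review.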

                1. 2

                  I think it’s better to describe goals

                  Sure. That’s what I do at work. And your goals look good to me. But this is the Internet and not everybody agrees about these details, and I didn’t want to bother getting bogged down in the muck about this.

                  But maybe something else matters to you? I’d be curious to hear if my list misses something.

                  man gitworkflows is interesting, but is really out of place when development is fast paced with fewer stable API boundaries. It’s not that uncommon to be doing some work, and then want to bring newer changes from master into your branch. Examples range from simple stuff like fixing a CI bug to completing work that is required for you to make progress. The simplest way for people to do that without rebasing is to git merge master. Moreover, “just waiting to merge into master” instead ignores common workflows on GitHub, where merging to master isn’t done manually. So how do you resolve conflicts on your branch without merging master back into it?

                  On top of all of that, it is very common to push commits to a branch in a PR that later turn out to be wrong or have other nominal issues with them. Without rebasing, you wind up with lots of commits like “fix docs” or “fix lint” or “fix failing test” or “respond to review feedback.” All of those probably should get squashed back into their appropriate commit. Otherwise, you wind up with “unclean” history, and you increase the probability of having commits that need to be skipped when you bisect. Some of this can be avoided through liberal use of pre-commit hooks, but those can be annoying on their own, and they don’t fix everything.

          2. 9
            1. git rebase -i -x "make test" <target> while you rebase. True, git bisect should never report a false positive, but I like to actually be sure (side-note: I’m working on getting my CI to verify all commits in a branch).
            2. rebase merge-conflicts are often smaller, just more numerous. Personally, I find several small merge-conflicts easier to deal with than one big one.
            3. I rebase so I can guarantee the merge result is identical to my branch HEAD. This means I can cut builds from a branch, and if the merge is approved, promote that build. Which also means I make a speculative deploy of the branch, and report any issues from that before I merge it in.

            Finally, I rebase because I’m telling a story. I don’t want the commit history to exactly represent what I did; I want history to show what I intended to do. When I do git blame, I want to see “make this change for reason X”, not “address review comments”.

            P.S.: There are other ways to achieve #3, but then I’d argue for merge --squash.
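
            For the first point, a minimal sketch in a throwaway repository. The command passed to -x here is only a stand-in for "make test" that records which commit was checked, and all branch and file names are made up. The -x option runs the command after each re-applied commit and stops the rebase if it fails:

            ```shell
            set -eu
            repo=$(mktemp -d)
            cd "$repo"
            git init -q
            git symbolic-ref HEAD refs/heads/main
            git config user.name "Example Dev"
            git config user.email "dev@example.com"

            echo base > base.txt
            git add base.txt
            git commit -q -m "Base"

            git checkout -q -b feature
            echo one > one.txt
            git add one.txt
            git commit -q -m "Feature: part one"
            echo two > two.txt
            git add two.txt
            git commit -q -m "Feature: part two"

            # main moves ahead, so the feature commits must be re-applied.
            git checkout -q main
            echo newer > main.txt
            git add main.txt
            git commit -q -m "Newer work on main"

            git checkout -q feature
            checks=$(mktemp)
            export checks
            # Stand-in for "make test": log the subject of each commit as it is checked.
            git rebase -q -x 'git log -1 --format=%s >> "$checks"' main
            cat "$checks"
            ```

            With a real test command in place of the stand-in, a failing commit halts the rebase at that point so it can be fixed before continuing.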

            1. 8

              Consider the case where a dependency that is still in use on feature has been removed on master. When feature is being rebased onto master, the first re-applied commit will break your build, but as long as there are no merge conflicts, the rebase process will continue uninterrupted. The error from the first commit will remain present in all subsequent commits, resulting in a chain of broken commits.

              Merge commits do not solve this in any way. If master has removed a dependency, no amount of git gymnastics will bring it back magically. Git cannot possibly understand the dependencies in your code (it’s just a version control system), so merging and rebasing will have the same effect.

              In this case, we hope that Git identifies commit f as the bad one, but it erroneously identifies d instead, since it contains some other error that breaks the test.

              That’s because you re-added the dependency in a commit G that was appended to the end of the tree. You should have re-added the dependency immediately after origin’s tip before your commits are rebased. That is, the order should’ve been ABCGDEF.

              You pretend that the commits were written today, when they were in fact written yesterday, based on another commit.

              Rebase does nothing like that. Git has two dates: commit date and author date. Commit date is updated when rebased, author date is kept intact (unless you go out of your way to specify --ignore-date).
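
              A minimal sketch of the two dates in a throwaway repository (names and the fixed timestamp are made up for illustration): the feature commit is created with an old author and commit date, then rebased onto a newer main.

              ```shell
              set -eu
              repo=$(mktemp -d)
              cd "$repo"
              git init -q
              git symbolic-ref HEAD refs/heads/main
              git config user.name "Example Dev"
              git config user.email "dev@example.com"

              # Fixed date in Git's raw format: Unix timestamp plus timezone offset.
              old="1577880000 +0000"

              echo base > base.txt
              git add base.txt
              GIT_AUTHOR_DATE="$old" GIT_COMMITTER_DATE="$old" git commit -q -m "Base"

              git checkout -q -b feature
              echo work > feature.txt
              git add feature.txt
              GIT_AUTHOR_DATE="$old" GIT_COMMITTER_DATE="$old" git commit -q -m "Feature work"

              # main gets a newer commit, so the rebase has to rewrite the feature commit.
              git checkout -q main
              echo more > main.txt
              git add main.txt
              git commit -q -m "Newer work on main"

              git checkout -q feature
              git rebase -q main

              author_ts=$(git log -1 --format=%at)
              commit_ts=$(git log -1 --format=%ct)
              echo "author date (unix): $author_ts"
              echo "commit date (unix): $commit_ts"
              ```

              The author timestamp stays at the original value, while the commit timestamp reflects when the rebase ran.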

              You’ve taken the commits out of their original context, disguising what actually happened.

              Context is perfectly preserved. I genuinely don’t understand what context is added by adding a meaningless merge commit saying “this is when I merged this branch” in a separate commit when you can derive that information from the commit date of the last commit (not author date, mind you).

              Can you be sure that the code builds?

              Once again, merge commits don’t solve this in any way.

              I’ve come to the conclusion that it’s about vanity. Rebasing is a purely aesthetic operation. The apparently clean history appeals to us as developers, but it can’t be justified, from a technical nor functional standpoint.

              Not true, I use rebase and cherry-pick liberally because it helps when I go through the commit log at a later date (which I do). It helps when I write release notes because everything is linear. It helps to fix merge conflicts at the commit level, rather than resolving one huge dump of changes in a merge commit. Commits should be atomic and self-contained.

              If that puts you off you might be better off using a simpler VCS that only supports linear history.

              Stop patronising.

              1. 5

                I was sure this was submitted before. Url changed.

                https://lobste.rs/s/tos2zx/why_you_should_stop_using_git_rebase

                1. 1

                  According to Word of Mod, reposts this old are OK:

                  https://lobste.rs/s/j2wdwl/sourcehut_hacker_s_forge#c_akwblr

                  1. 1

                    “repost” tag?

                2. 4

                  I see a lot of articles arguing the relative merits of rebase vs merge, and it makes me think that there’s a fundamental UX problem with git. What do mercurial and fossil do instead, and do they have similar problems?

                  1. 2

                    It’s a combination of “there are two types of things: those that people don’t complain about and those that people use” and “the simple discussions will be discussed the most”.

                    1. 1

                      For a while mercurial didn’t support a rebase like operation. That’s still the way I use it, but I think there’s history editing options now.

                      1. 2

                        IIRC, the mercurial history editing safeguards are covered under phases.

                      2. 1

                        The thing that I think is missing in git is the ability to see where a branch has “been”, that is, which commits have ever been at the tip of a given branch. I’d like to be able to see the master branch as a straight line, with other tracks branching off and merging back. But I don’t think that information is first-class in git, and is distributed across the reflogs of all the clones; no one clone has the full information necessary to understand the working history of a branch. That’s something that a different SCM might support, for all I know.

                      3. 3

                        I don’t use rebase at all because I prefer an explicit history over the altered timeline you get with rebasing. I know so many other folks who prefer it the other way, though. I think it’s ultimately up to the team running the project on how they want to do it. It feels like tabs vs spaces to some degree.

                        1. 3

                          We use phabricator with git at my current job and all commits are squashed and rebased. I love the clean history, something I never saw before in a company.

                          The only thing I do not like is that bugs get fixed on develop first and then cherry-picked onto release branches, which I find weird.

                          All that being said, I still like git rebase and I can recommend enabling “rerere” for long running branches that use rebasing.

                          1. 2

                            Cool, I didn’t realize phabricator did that. That’s a default I very much can agree with. We use GitHub at work and there’s a Squash & Merge option … but I can’t force people to use it :-)

                            Re: bugs getting fixed on develop first, I suppose that’s to ensure they stay fixed going forward. Like, you don’t want to accidentally forget to merge your current release branch (with its bugfixes) into develop before you cut the next release branch. With a develop-first bugfix strategy, there’s no chance of that happening. You literally have to do the cherry-pick if you want to actually deploy the fix.

                            1. 1

                              I could follow that reasoning for faster paced releases, but we make enterprise software and often bugs are on releases from 8 months ago. develop has changed a lot since then and the bug may not even be there any more.

                              In all of my dev career (15 years), I have always fixed bugs on the release first and then merged it to develop, if applicable, never the other way around.

                            2. 1

                              The only thing I do not like is that bugs get fixed on develop first and then cherry-picked onto release branches, which I find weird.

                              Isn’t this just trunk-based development?

                            3. 2

                              I was thinking it would be cool to be able to have one level of commits which are immutable in practice, then a second level which annotates the code changes without messiness which wouldn’t be of much interest. Anyone aware of such a system?

                              I suppose release notes serve the purpose to some degree but it would be cool to have it integrated into the version control system.

                              Or maybe it’s just extra complexity and useless burden for the programmer for not much gain…

                              1. 3

                                Isn’t this what a merge commit is? If you merge, you have a place to write all about this new feature. If you merge but don’t commit, you can even run tests and fix regressions before you commit.

                              2. 2

                                I think you should keep your history true. Get comfortable with tools to analyse it, and don’t fall for the temptation to rewrite it.

                                This is a misapprehension of what commit “history” is. Working with and on your commit graph can be incredibly powerful, especially in complex projects with lots of branches, multiple upstreams, developers pushing commits to each other, etc. You can only be concerned about keeping history ‘true’ if you consider rebasing purely a matter of aesthetics.

                                1. 1

                                  I get it, rebase is not for everyone. Sometimes it can get tricky and people don’t want to learn all the tricks. That’s OK. Another thing that people can do though is use GitHub’s ‘Squash and Merge’ option for pull requests (or similar option for other tools). This squashes your branch’s commits together into a single commit and then merges that one commit in, using a fast-forward merge (so no merge commit). It effectively does the same thing, just with I guess less headache for most people. And yes, this effectively wipes out the feature branch’s internal history–which I don’t feel is a huge loss. Besides, you could always keep the original feature branch around in the repo for some time after the merge is done.

                                  1. 2

                                    On my team at work, our general policy is “only do a merge commit if your commits tell a story” (e.g., no “WIP” or “updated a thing” commits). Otherwise we squash-rebase.

                                    1. 1

                                      That’s reasonable. In my team I generally use a merge commit to merge release branches into our integration branch before starting a new sprint. I find the merge commit makes it totally clear that all release branch work has been fully integrated.

                                  2. 1

                                    What motivates people to rebase branches? I’ve come to the conclusion that it’s about vanity.

                                    When possible, I like to make a one-commit PR, so that git blame shows a single cohesive change with a single, clear commit message explaining it. I do a bunch of messy commits while working, then squash into one before making my PR.

                                    I doubt anybody looks at my commit history to judge me, so there’s no vanity at play here, IMO.

                                    When it’s time to merge, I do not want a squash merge, because that makes git branch --merged useless.
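
                                    A minimal sketch of that effect in a throwaway repository (branch and file names are made up for illustration): one branch is integrated with a merge commit and another with a squash merge, and only the first shows up under git branch --merged.

                                    ```shell
                                    set -eu
                                    repo=$(mktemp -d)
                                    cd "$repo"
                                    git init -q
                                    git symbolic-ref HEAD refs/heads/main
                                    git config user.name "Example Dev"
                                    git config user.email "dev@example.com"

                                    echo base > base.txt
                                    git add base.txt
                                    git commit -q -m "Base"

                                    # Branch integrated with a real merge commit.
                                    git checkout -q -b feature-a
                                    echo a > a.txt
                                    git add a.txt
                                    git commit -q -m "Feature A"
                                    git checkout -q main
                                    git merge -q --no-ff -m "Merge feature-a" feature-a

                                    # Branch integrated with a squash merge.
                                    git checkout -q -b feature-b
                                    echo b > b.txt
                                    git add b.txt
                                    git commit -q -m "Feature B"
                                    git checkout -q main
                                    git merge -q --squash feature-b
                                    git commit -q -m "Feature B (squashed)"

                                    merged=$(git branch --merged main --format='%(refname:short)')
                                    not_merged=$(git branch --no-merged main --format='%(refname:short)')
                                    echo "merged into main:"
                                    echo "$merged"
                                    echo "not merged into main:"
                                    echo "$not_merged"
                                    ```

                                    Even though main contains the same changes, feature-b is reported as unmerged because its commit is not an ancestor of main.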