1. 10
  1. 5

    I produce stacked PRs constantly and I think they’re valuable. I love that we’re seeing more and more tooling to support them.

    But I think articles like this are pretty one-sided and don’t acknowledge the risks and costs of stacked PRs. They aren’t 100% positive with zero downsides. So here are some reasons you shouldn’t go all-in on stacked PRs.

    1. They often replace the “500 lines = looks good” problem with the “10 lines = looks good” problem. I can split a large change up into a sequence of small pieces, each of which is a correct, self-contained piece of code that quickly passes review. They can even be reviewed in parallel by different reviewers to really maximize my velocity. And when I’m done, I will have solved the problem in an awful way that makes no sense when you look at it as a whole. No reviewer could evaluate the whole change because I doled it out one tidbit at a time.
    2. They take more work to produce. The article makes the point that reviewer effort goes up more than linearly with increases in PR size, which may be true. But author effort goes up more than linearly with decreases in PR size. Breaking your 500-line change into fifty 10-line changes will probably get you lightning-fast reviews, but is it worth an extra two days of your time to save two hours of your reviewers’ time? (You might not even notice, because splitting up the change feels like two days of useful, productive work, whereas twiddling your thumbs for a couple extra hours feels frustrating and useless.)
    3. They can cause a feedback loop where the cultural norm shifts toward ever-smaller changes without any regard to the impact on high-level review quality or author effort.
    4. They can mask underlying problems with performance evaluation and prioritization. This one is a little fuzzy but I’ve seen it in real life. If a team treats code review as a first-class responsibility on a cultural level, and the company rewards timely, thoughtful code reviews with promotions and raises, the “my 500-line change waited for review for over a week and then got rubber-stamped” problem basically never comes up. And as an added benefit, code quality and knowledge sharing goes up because people take code review very seriously. But on most teams, stacked PRs are a technical hack to cope with the organizational problem that code review isn’t truly valued and is a distraction from the work that is truly valued.
    1. 1

      Exactly this.

      Anyone who hates stacked PRs is the because they aren’t using tools like git-machete. After I discovered this, it literally opened the workflow right up. It’s all I use now.

      1. 1

        I think 10 lines of code is really the issue here. It is generally speaking not large enough to tell a cohesive part of a narrative. I think the goal is generally around 100 to 200 lines, but that is heavily heavily context and complexity dependent.

      2. 3

        I think a problem with git is that commits should be able to contain subcommits. Then branches would just be the names for mega-commits that contain other normal commits. People ends up either doing “squash merges” (which TFA notes aren’t actually merges) or “semi-linear history” (keeping in the useless no-ff merge commits as branch markers) because there’s no way to say “branch main swallowed feature-1 branch then feature-2 branch then…”

        From the git koans:

        “Master Git,” said the historian, “what is the nature of history?”

        “History is immutable. To rewrite it later is to tamper with the very fabric of existence.”

        The historian nodded, then asked: “Is that why rebasing commits that have been pushed is discouraged?”

        “Indeed,” said Master Git.

        “Splendid!” exclaimed the historian. “I have a historical record of a merge commit with two parents. How can I find out which branch each parent was originally made on?”

        “History is ephemeral,” replied Master Git, “the knowledge you seek can be answered only by the gods.”

        The historian hung his head as enlightenment crushed down upon him.

        If branches were just mega-commits containing fractal sub-commits, then history wouldn’t have to be both immutable and ephemeral.

        1. 1

          Isn’t this exactly why we have merge commits?

          1. 2

            I thought about this after I posted, but no, they aren’t a good substitute. The problem is that any scale whatsoever, you want linear history. Multiple parents—even just two!—make the history unreadable, so basically everyone ends up squashing or rebasing or doing semi-linear history, so that they can do effective reviews and blame. Maybe it’s just a presentation problem, and it would be okay if git GUI clients just automatically showed commits grouped together after a merge, but I suspect it goes deeper conceptually.

            1. 1

              Linear history is an anti-pattern, IMO, and defeats the purpose of git. If I wanted enforced linearity I’d still be on SVN

              1. 2

                Maybe it is, but humans experience time linearly and are really invested in that perspective and are not really in the mood for discussions about Lamport’s Clocks paper when you, however reasonably, point out that they’re really experiencing concurrent events from a non-canonical point of view.

        2. 2

          That’s why I love Gerrit - stacked reviews are a natural way to work with it. You can have one commit per review. It might seem like a limitation, but given you can easily push branch as a series of related reviews, it works really well. My only gripe with Gerrit is it requires you to keep Change-Id trailer in the commit message.

          1. 2

            I don’t see the problem. If the change I make turns out to be big, I split the “narrative” into steps and put each one into its own commit, so the branch ends up with two or three of them. The reviewer can follow the story by starting at the first one and reviewing them in order.

            Sometimes I do use “stacked branches” though. When I need the code of the first for the next ticket/feature/task. This is only for me, so that I can continue with my work. By the time the second PR gets to the reviewer, the first should have been merged already and the reviewer would never know it started out as a stacked on the first, because I would rebase before submitting it.

            1. 1

              The issue is the code review interface and unit. Thinking about a tool like github PRs, it is extremely in convent to follow the flow of commits. It is really just made to show you the final diff. Generally speaking the finally diff is too large and not focused enough. As such, you are stuck reviewing something with the problems mentioned in this article unless you dis. Lot more manual work to look at each individual commit. That said, even if you look at each individual commit, you can’t leave review comments in them you have to go back and leave comments in the full PR, where context is lost.

              This is definitely why I prefer stacked diffs/commits as the review unit instead.

            2. 1

              I love stacked PRs as a mechanism to work on separate yet dependent pieces of work.

              However, this post (and many of the comments in this discussion) seem to enjoy using stacked PRs to phase pieces of a single piece of work:

              Alternatively, big PRs are created in an attempt to keep hacking on a feature. In order to implement a button on the frontend, we have to write the frontend code. And the backend code. And the API layer glue. And the database schema migration. And now our PR for a single button is massive.

              Put them individual commits! A PR is a linear narrative sequence of commits!

              Is the problem that (1) most people use Github and (2) that Github’s interface for moving through individual commits isn’t up front and obvious?

              1. 1
                1. 1

                  I love stacked PR like everyone else. It shows you step-by-step how a problem can be solved. But as time goes by, I increasingly find there are extra work for stacked PRs, and wondering if there are better ways. Let me give you an example of how I work:

                  1. I started by doing it end-to-end to verify for a sub-problem, or happy-path, or toy-example, whatever you call it, worked. This will be my “WIP” or “RFC” PR;
                  2. After people looked over, I start to refine that “WIP” PR, handling more edge cases, run fuzzers, make sure the dependencies I introduced is sensible, write more unit tests to cover my ass;
                  3. Break down what I have in 2 into several “stacked PRs”, maybe data models first, then executors / services, and then hook it up to the rest. These PRs will reference back to the previous “WIP” PR to give people an overview.

                  Reviewers obviously are extremely happy about this. However, moving from 2 to 3 is a lot of work (probably in itself would take a day to write good commit message, split files etc, without considering the back-and-forth incorporating review feedbacks). On the other hand, my mental model cannot go directly from 1 to 3 (skipping 2) somehow (it just doesn’t work for me, I cannot have a good idea everything is in the right place without seeing it in the right place first). Git probably could be one of the culprits why there are more work than needed though.

                  1. 2

                    Note: this depends a lot on what you consider a PR. When I say PR here, I am thinking about something like a GitHub PR with many commits in it.

                    I definitely prefer stacked diffs/stacked commits. They are a lot easier to manage with the right tools and each diff maps directly to a code review. With stacked PRs, you have multiple commits that make up 1 PR. This leads to more things to mess with/go wrong. It also makes reordering and splitting messier in my experience.

                    That all being said, stacked diffs simply are not supported that well on sites like GitHub. To get that, each PR would be a single commit that you always amend and then force push. That would be much worse than stacked PRs.

                    1. 1

                      It’s more work to produce stacked PRs, no question about it.

                      One thing I’ve found helps is to start off with a stack of empty changes from the get-go. For a given kind of coding task, I can usually take a pretty good guess at how I’ll ultimately want the stack to be organized. Then as I work on the problem, I am constantly switching branches so that I introduce a given piece of code in the right step in the sequence. I don’t always guess the stack structure exactly right, but it’s much easier to split a particular stack entry in two as soon as I see it’s needed than it is to wait until the end and split up the entire change all at once.

                      This gets much more feasible with a tool like Graphite that automates the “reparent all the child changes onto the latest version of the ancestor” process; otherwise you’re constantly having to run git rebase and it gets error-prone and hard to keep track of.

                      Even with good tools, you still have the added overhead of having to think about where in the stack to put a given piece of code. But doing it this way ends up being less total effort than splitting a change up after the fact, in my experience.