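A quick, rough timeline:
2005: git is released, and Junio Hamano becomes the core maintainer
2008: the first thread about staging in the linked article, and GitHub is founded
2021: 13 years later, this is still a thing
There’s something about “we can’t change this, what about all the people using this” in the early days, becoming an issue for far, far longer and for far more people, that feels like a failure mode.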
I’m reminded of the anecdote about make: Its original author used tab characters for indentation in Makefiles without much thought. Later when they decided to add make to UNIX, they wanted to change the syntax to something more robust. But they were afraid of breaking the ten or so Makefiles in existence so they stuck with the problematic tab syntax that continues to plague us today.
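Your comment reminds me of the origin of C’s confusing operator precedence rules:
There were several hundred kilobytes of existing C source code in the world at the time. SEVERAL HUNDRED KB. What if you made this change to the compiler and failed to update one of the & to &&, and made an existing program wrong via a precedence error? That’s a potentially disastrous breaking change.
…
So Ritchie maintained backwards compatibility forever and made the precedence order &&, &, ==, effectively adding a little bomb to C that goes off every time someone treats & as though it parses like +, in order to maintain backwards compatibility with a version of C that only a handful of people ever used.
Eric Lippert’s blog post “Hundred year mistakes”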
I think this article includes a logical fallacy. It assumes that whatever you’re doing will be successful, and because it is successful, it will grow over time. Since it will grow over time, the best time for breaking changes is ASAP.
What this logic ignores is that any tool that embraces breaking changes constantly will not be successful, and will not grow over time. It is specifically because C code doesn’t need continual reworking that C has become lingua franca, and because of that success, we can comment on mistakes made 40+ years ago.
Sure, but this error propagated all the way into JavaScript.
I’m not saying C should have changed it. (Though it should.) But people should definitely not have blindly copied it afterwards.
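For anyone who wants to see the precedence trap without reaching for a C compiler: shell arithmetic follows C's operator precedence, so the same surprise shows up there. A minimal sketch; the variable names are made up for illustration.

```shell
#!/bin/sh
# Shell arithmetic uses C operator precedence, so == binds tighter than &.
flags=6   # binary 110
mask=2    # binary 010

# Parsed as flags & (mask == 2), i.e. 6 & 1.
unparenthesized=$(( flags & mask == 2 ))

# Parenthesized the way most people read the expression: (6 & 2) == 2.
parenthesized=$(( (flags & mask) == 2 ))

echo "flags & mask == 2   -> $unparenthesized"
echo "(flags & mask) == 2 -> $parenthesized"
```

The two results differ, which is why explicit parentheses around bitwise tests are a common defensive habit.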
I’m curious why you think it is problematic? Just don’t like significant whitespace? But make also has significant newlines…
For me it’s because most editors I’ve used (thinking Vim and VSCode) share their tab configs across file types by default.
So if I have soft tabs enabled, suddenly Make is complaining about syntax errors, and the file looks identical to a correct one. Not very beginner friendly.
IIRC Vim will automatically set hard tabs in Makefiles for you. So it shouldn’t be a problem, at least there (as long as you have filetype plugin on).
I always make sure to have my editor show me if there are spaces at the front of a line. Having leading spaces look the same as a tab is a terrible UX default most unfortunately have.
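If the editor doesn't show the difference, a quick check from the shell can. This is only a rough sketch: the demo file is a hypothetical stand-in for a real Makefile, and the pattern flags every line that starts with a space, which may include lines that are harmless in a real project.

```shell
#!/bin/sh
# Hypothetical demo file: one recipe line indented with a tab, one with spaces.
demo=$(mktemp)
printf 'ok:\n\techo tab-indented\n\nbroken:\n    echo space-indented\n' > "$demo"

# Collect the line numbers of lines that begin with a space instead of a tab.
space_lines=$(grep -n '^ ' "$demo" | cut -d: -f1)

echo "lines starting with spaces: $space_lines"
rm -f "$demo"
```

Pointing the same grep at a real Makefile gives a list of candidate lines to inspect.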
Thanks, I hate it
What was the problem with tab syntax?
+1 from me for the move to “staging”.
The status quo is a mess: “the index” is selected using --cached. Where’s the logic in that!?
For at least four or five years, --staged has been a synonym for --cached. I was surprised to see that --cached is not already deprecated.
Linus called it the cache. That’s the logic.
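A small sketch of what the two spellings do, run in a throwaway repository. The file name and identity values are placeholders; the only claim is that git diff --staged and git diff --cached print the same thing, while plain git diff shows nothing once a change has been staged.

```shell
#!/bin/sh
set -e
# Throwaway repository; names below are placeholders for the demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

echo "first" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"

echo "second" >> notes.txt
git add notes.txt               # the change now sits in the index

unstaged=$(git diff)            # working tree vs index: nothing left to show
cached=$(git diff --cached)     # index vs HEAD
staged=$(git diff --staged)     # same comparison under the newer name

[ -z "$unstaged" ] && echo "plain git diff is empty"
[ "$cached" = "$staged" ] && echo "--cached and --staged agree"
```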
Good grief, Junio sounds like Ulrich Drepper.
I remember this person’s previous airing of dirty laundry in the git community and felt the whole thing was a big waste then, but for some reason this time around I’m here for the dish. The author seems like a nightmare (what other kind of person lists out, in extreme detail, the number of people that agree with them?), but it makes for an amusing read if you disconnect yourself from the outcome.
I get where you’re coming from, but in this case, I don’t think so.
The author’s argument is not “I have an opinion, it’s right” but “I have an opinion which nearly the entire community shares, save one key person”. If you want to make an evidence-based case for that argument, you pretty much have to do something like the accounting that he did. Or appeal to general polling results – and you’ll notice the article links to such a poll, presumably to collect such evidence.
My good faith question to you would be: What other tactic do you think he should have taken, given the history here, and the case he wants to make?
It’s possible to make a case for consensus without framing it so directly as, “and everybody agrees with me,” which the constant back-and-forth of the text quoting does. As the OP says in the other reply to my comment, “he definitely has an axe to grind and you can feel it.” I’m not in a position to make a positive claim that X tactic is the solution to this problem. My only statement is that whatever tone he’s adopting casts himself in a bad light.
Yeah, I’m slightly conflicted as the submitter. I think he definitely has an axe to grind and you can feel it, but it’s also a useful summary of one of git’s major UX issues and why it doesn’t get fixed, even if he’s framing it like he’s a victim here.
I wish the staging area didn’t exist. Hidden state that changes by itself is a recipe for accident in any man-machine interface, and in this case, quite unnecessary.
Wanna track, untrack or restore a file? Now it’s implicitly staged too, yet hidden from git diff, and inconveniently becomes part of the next commit, despite your explicit command to commit something else.
When I commit something, I want to see what I have (in git diff) first and then commit specific files (or --all or even --patch for parts of files). I never want to use the staging area. Can I turn it off?
You can try to adapt your habits and configuration. E.g. you can configure an alias for git add -A && git commit [...] (see https://stackoverflow.com/q/2419249/)
I like it.
It allows for easily splitting the current changes into multiple commits. I often use git add -p to interactively select what should be part of the next commit while keeping everything else around. That can be temporary comments for what I’m going to work on next or just stuff that should be in a separate commit. Sometimes it makes sense to work on a set of changes as one work unit, but structure it differently into review units. To test & commit what is currently staged I can do:
git stash save --keep-index
do_my_tests
git commit -m "Fix foo"
git stash pop
Even when I want to commit all current changes, I like using git add -p to review them, especially if I have touched multiple files. “Changes in foo.ext? I just did those, stage the whole file (a). Changes in bar.ext? I did those an hour ago, let’s confirm those one by one (y)…”
(I also use a lot of aliases, so for me git add -p becomes g ap, git stash save --keep-index is g ss, git diff --patience --cached becomes g dc, and so on)
(I would prefer a consistent use of “staging area” and “staged” everywhere.)
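The stash-with-keep-index routine mentioned above can be tried end to end in a scratch repository. This is a sketch under assumptions: it uses git stash push, the current spelling of the older git stash save; the file names are invented; and a grep stands in for a real test run. The staged and unstaged edits are kept in separate files so the final pop applies cleanly.

```shell
#!/bin/sh
set -e
# Throwaway repository; all file names here are made up for the demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

echo "stable" > feature.txt
echo "stable" > scratch.txt
git add feature.txt scratch.txt
git commit -q -m "Initial commit"

echo "finished change" >> feature.txt    # ready to commit
echo "work in progress" >> scratch.txt   # not ready yet
git add feature.txt

# Stash everything, but leave the staged content in the index and working tree.
git stash push -q --keep-index

# Stand-in for a test run: the unstaged edit is out of the way at this point.
wip_lines_during_tests=$(grep -c "work in progress" scratch.txt || true)

git commit -q -m "Fix foo"
git stash pop -q

remaining=$(git status --porcelain)
echo "WIP lines visible during tests: $wip_lines_during_tests"
echo "left over after pop: $remaining"
```

After the pop, only the unfinished edit is left in the working tree, ready for the next round of staging.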
Agreed. Mercurial gets along just fine without it and is a much easier-to-use DVCS.
I dunno. I’m much more experienced with git, which colours things, but I’ve never got myself into a jam I can’t figure out. I use mercurial for work and I’ve had weird situations I can’t explain where a file’s content doesn’t match what the commit diff says it should be.
Yes, index/stage interactions with diff only comparing the working tree is one of the biggest git footguns. It makes me hold off on staging files until the very end, right before I commit.
You can use the --verbose flag to git commit to see the diff and list of files below the commit message (it will not be included in the commit message). You can even make it the default via git config --global commit.verbose true. This should help with such accidental commits.
… that is the only time one is meant to stage files? Does anyone have a sane workflow that involves running git add but not following it soon with git commit?
I stage something (which could be as small as a single line) as soon as I’m happy with it, and commit when I’m happy with the change as a whole. Often I’ll write a rough version of something, stage it, and then rewrite it to be nicer - but with the safety net of being able to easily undo the change if it’s wrong.
I guess I could make small commits for this, but then I’d be making tonnes of tiny commits and squashing them before each push, which feels like a lot more work. I like each of my commits to pass tests, but these tiny bits I stage might not even compile.
Or I might make one large change, then stage and commit bits of it separately if there are multiple logical changes there.
It does help though that I use magit, which makes staging or unstaging things line-by-line in my text editor trivially simple. I certainly wouldn’t want to go back to using git add / git commit / etc in a terminal.
But according to the documentation, if you run git commit <pathspec> then it will ignore the staging area (named index there):
When pathspec is given on the command line, commit the contents of the files that match the pathspec without recording the changes already added to the index.
And if you want to be able to review what will be committed, then just use the --verbose flag for git commit or make it the default via git config --global commit.verbose true.
Thanks for explaining, that is good to know! Yes, I’ve noticed commit now obeys what you say, but it wasn’t always like that, and I didn’t know how to trust it, because I hadn’t seen that explanation.
I chose to describe the old behaviour, because it deserved mention as a contributing factor in this footgun. I can’t edit my old post anymore, but I can now delete this ingredient in my fear and distrust of the staging area.
Well, it seems that it has been there since git 2.0, so for over 7 years now. So I would say that, unfortunately, your post is more FUD than technical.
You can also use commit -p to commit just part of your current diff and skip the index.
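To see the pathspec form of git commit leave the index alone, here is a sketch in a scratch repository. File names and the identity are placeholders; git diff-tree is used only to list what the new commit touched.

```shell
#!/bin/sh
set -e
# Throwaway repository; file names are placeholders for the demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

echo "base" > staged.txt
echo "base" > direct.txt
git add staged.txt direct.txt
git commit -q -m "Initial commit"

echo "staged edit" >> staged.txt
git add staged.txt                  # sits in the index
echo "direct edit" >> direct.txt    # never explicitly staged

# Pathspec form: only direct.txt is recorded in this commit.
git commit -q -m "Commit direct.txt only" -- direct.txt

committed=$(git diff-tree --no-commit-id --name-only -r HEAD)
still_staged=$(git diff --cached --name-only)

echo "committed:    $committed"
echo "still staged: $still_staged"
```

The staged edit to the other file stays in the index, waiting for a later commit.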
What behavior are you suggesting git add should have? Are you saying this should record file names but not contents so a later commit just includes whatever’s there?
Git add should just git commit --amend to the last dummy commit until you give it a message. Stage is just this weird limbo between being a commit and not.
git add/rm should not even exist: They need to be atomic with the commit to avoid the limbo state, as you say, and that would make sense if they were options to that command: git commit --add/--rm.
And git checkout <file> should of course not stage files as a side-effect of restoring them.
Well, actually the staging area is exactly that “dummy commit”… So it has exactly the behaviour you want. I agree that git diff behaviour could be different, but with your approach it would behave exactly the same as it does right now.
You can’t git add some stuff and then switch branches without committing because it’s not a real commit.
Well, how would you differentiate between “dummy” and “real” commit then? How would you prevent it from being pushed to remote?
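We have git stash that provides similar functionality to what you want.
Many good options here:
Don’t distinguish at all, if you run git push with changes “staged” they get pushed.
Don’t distinguish at all, but don’t update the branch ref until you run the new and more minimal “git commit”. If you try and run git push with changes staged, there is no altered branch to push. (Probably add a new tag along the lines of branchname#staged so you can go back to it)
Just add a flag to the commit header, if you feel the need to distinguish (but why?)
git add whether create new “dummy commit” or “append to existing dummy commit”?
“dummy commit” is a sidetrack in my opinion. python -c 'import this' | grep Explicit
git commit --add README.md # Creates a new commit
git commit --amend --add main.c # Adds to the same commit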
If you ask me, I’m not talking about changing git add, but rather deprecating it and the whole staging area.
Sounds stateful (might defeat the purpose)
If you could have multiple “staged” commits on a branch, meaning unpushable, that would solve some real workflow issues, at least for me, like having to count how many commits not to push in order to push one commit (e.g. git push HEAD~42:master). No idea about implementing that flag, though.
Ad 1. It is how it works today (well, you do not need the --add flag). I love the staging area, I really like that concept and I miss it in other tools.
git add amends the current commit if it has no message. If the current commit has a message, it creates a new commit with no message. git add --amend could be the command to edit a commit after its message has been set.
Stash is useful but basically shouldn’t exist as is. That it does is a flaw in git’s model. Instead git stash could just make a nameless commit on the current branch and then set some flag somewhere so that the next “cherry-pick” command will pick up the commit you just made. There can be some list of commits that were stashed that the tool can pick up on.
This is the approach I took in git9. It works well for me.
How do you manage partial updates? When for example I want to add only some changes and ignore others.
I don’t think this is a good workflow. How do you run tests against your partial updates?
In many cases I do not need to, as these are for example minor configuration changes to make it work with my environment. In other cases these are minor documentation/comments. In other cases I split my work into separate parts after a longer period without a commit. In all of these cases I have CI that tests my partial updates.
Regardless: it’s not supported, and not particularly missed. The staging area adds a bunch of unnecessary state and complexity to both the mental model of the repository, and the implementation.
It’s a feature that I don’t find pulls its weight.
I was just talking to someone recently about how git is often designed to expose all of its weird implementation details directly, at the expense of the actual use cases. This seems to reflect that.
I also don’t really buy the “we shouldn’t reinforce how git trainings are teaching it”. Maybe it can move a bit more towards embracing use cases that users find useful & want?
This seems like such a weird thing to bikeshed about… I basically never have to refer to it so what does it matter what it is called?
The git porcelain is a very, very thin wrapper of git’s own internal data model. Clearing up terminology could make it easier for people to understand git.
It matters because there’s a clear step to improve the documentation (which Git’s is already notoriously lacking) and there’s no reason not to include a clarification around the term that all the tutorials use, and that all new users therefore learn.
If it were a matter of removing the term “index” from documentation then it would be bikeshedding, but we’re talking about adding the term “staging area” here. I doubt anyone would object to a permanent “index” alias for every relevant command, as long as there’s “stage”, too
The terminology and lack of consistency mapping concepts to commands caused me so much confusion early on.
One tool I stumbled on that helped clear up a lot of confusion was NDP software’s Git Cheatsheet - no affiliation, just a grateful user.
What git command line do I run to remove a maintainer?
Git is a dvcs after all.
Nah, you also need to update every shell script to use not instead of git.
My theory is that this is why CLI reforms such as gitless haven’t taken off.
No need to be authoritarian, you can always make your own fork. Or simply apply the patches.
It’s not authoritarian to think that people who aren’t engaging in good faith shouldn’t be running core tooling projects used by the entire software industry. Applying a fork doesn’t solve the issue that a toxic person is leading a massive community effort.
Furthermore, this isn’t about solving it for me – I know how to use git already. It’s about increasing accessibility for newcomers, who won’t know how to apply patches and recompile.
So long as you only think so.
Where can I see a single example of engaging in bad faith, or any toxicity for that matter?
It could be argued that core tooling shouldn’t change at all, and a change like this would confuse the documentation, or break things. Though this has happened already with the master → main switch, as well as with some changes to the porcelain. git is rather bad from both viewpoints.
Thank goodness neither of those points is relevant to the linked discussion. All of this stuff is backwards-compatible.
This falls under confusing the documentation. The more ways there are to do it, the more confusing it is. Changing terms in any place would also be a source of confusion. I admit to not having read it in detail, but no miracle is possible.
I mostly just scanned it, early on found out it literally lies (“everyone” means “people I agree with”), and figured out it’s just someone publicly moaning, so not worth the attention.
And then there was this comment where someone disrespects other people’s work, of course.
You’re free to fork and improve git. Or even implement your own from scratch.
The more forks and independent implementations, the better the ecosystem – and maybe some ideas filter across.
Well then you better put in the effort to get your fork into distribution repositories.
Why would you be unable to provide downloads and packages?
It’s not like X.org, Jenkins, LibreOffice, and other forks are significantly harder to install than the original.
Yes it is; the way the term “shouldn’t” is used there presumes it is or should be in your authority.
Sure it does: if you do better, people switch projects, and the origin of the fork stops being a massive community effort. How many Hudson developers are there today? How many Jenkins developers are there today?
How about Gogs vs Gitea?
Funny, I’ve never called the index “the staging area,” though I did somehow get into the habit of calling it “the stage,” and using the term “stage” as if it meant a staging area.