The page mentions git specifically as being vulnerable. While I’m sure that’s true, it seems highly impractical to attempt to move git away from SHA1. Am I wrong? Could you migrate away from SHA1?
[Edit: I forgot to add, Google generated two different files with the same SHA-1, but that’s dramatically easier than a preimage attack, which is what you’d need to actually attack either Git or Mercurial. Everything I said below still applies, but you’ve got time.]
So, first: in the case of both Mercurial and Git, you can GPG-sign commits, and that will definitely not be vulnerable to this attack. That said, since I think we can all agree that GPG signing every commit will drive us all insane, there’s another route that could work tolerably in practice.
Git commits are effectively stored as short text files. The first few lines of these are fixed, and that’s where the SHA-1 shows up. So no, the SHA-1 isn’t going anywhere. But it’s quite easy to add extra data to the commit, and Git clients that don’t know what to do will preserve it (after all, it’s part of the SHA-1 hash), but simply ignore it. (This is how Kiln Harmony managed to have round-trippable Mercurial/Git conversions under-the-hood.) So one possibility would be to shove SHA-256 signatures into the commits as a new field. Perfect, right?
Well, there are some issues here, but I believe they’re solvable. First, we’ve got a downgrade vector: intercept the push, strip out the SHA-256, replace it with your nefarious content that has a matching SHA-1, and it won’t even be obvious to older tools anything happened. Oops.
On top of that, many Git repos I’ve seen in practice do force pushes to repos often enough that most users are desensitized to them, and will happily simply rebase their code on top of the new head. So even if someone does push a SHA-256-signed commit, you can always force-push something that’ll have the exact same SHA-1, but omit the problematic SHA-256.
The good news is that while the Git file format is “standardized,” the wire format still remains a bastion of insanity and general madness, so I don’t see any reason it couldn’t be extended to require that all commits include the new SHA-256 field. I’m sure this approach also has its share of excitement, but it seems like it’d get you most of the way there.
(The Mercurial fix is superficially identical and practically a lot easier to pull off, if for no other reason than because Git file format changes effectively require libgit2/JGit/Git/etc. to all make the same change, whereas Mercurial just has to change Mercurial and chg clients will just pick stuff up.)
It’s also worth pointing out that in general, if your threat model includes a malicious engineer pushing a collision to your repo, you’re already hosed because they could have backdoored any other step between source and the binary you’re delivering to end-users. This is not a significant degradation of the git/hg storage layer.
(That said, I’ve spent a decent chunk of time today exploring blake2 as an option to move hg to, and it’s looking compelling.)
Edit: mpm just posted https://www.mercurial-scm.org/wiki/mpm/SHA1, which has more detail on this reasoning.
Plenty of people download OSS code over HTTPS, compile it and run the result. Those connections are typically made using command line tools that allow ancient versions of TLS and don’t have key pinning. Being able to transparently replace one of the files they get as a result is reasonably significant.
Right, but if your adversary is in a position that they could perform the object replacement as you’ve just described, you were already screwed. There were so many other (simpler!) ways they could own you it’s honestly not worth talking about a collision attack. That’s the entire point of both the linked wiki page and my comment.
That said, since I think we can all agree that GPG signing every commit will drive us all insane, there’s another route that could work tolerably in practice.
It is definitely a big pain to get gpg signing of commits configured perfectly, but now that I have it setup I always use it and so all my commits are signed. The only thing I have to do now is enter my passphrase the first time in a coding session that I commit.
Big pain? Add this to $HOME/.gitconfig and it works?
gpgsign = true
Getting gpg and gpg-agent configured properly and getting git to choose the right key in all cases even when sub keys are around were the hard parts.
That’s exactly what I did.
Sorry, to rephrase: mechanically signing commits isn’t a big deal (if we skip past all the excitement that comes with trying to get your GPG keys on any computer you need to make a commit on), but you now throw yourself into the web-of-trust issues that inevitably plague GPG. This is in turn the situation that Monotone, an effectively defunct DVCS that predates (and helped inspire) Git, tried to tackle, but it didn’t really succeed, in my opinion. It might be interesting to revisit this in the age of Keybase, though.
I thought GPG signing would alleviate security concerns around SHA1 collisions but after taking a look, it seems that Git only signs a commit object. This means that if you could make a collision of a tree object, then you could make it look like I signed that tree.
Is there a form of GPG signing in Git which verifies more than just the commit headers and tree hash?
You are now looking for a preimage collision, and the preimage collision has to be a fairly rigidly defined format, and has to somehow be sane enough that you don’t realize half the files all got altered. (Git trees, unlike commits, do not allow extra random data, so you can’t just jam a bunch of crap at the end of the tree to make the hash work out.) I’m not saying you can’t do this, but we’re now looking at SHA-1 attacks that are probably not happening for a very long time. I wouldn’t honestly worry too much about that right now.
That said, you can technically sign literally whatever in Git, so sure, you could sign individual trees (though I don’t know any Git client that would do anything meaningful with that information at the moment). Honestly, Git’s largely a free-for-all graph database at the end of the day; in the official Git repo, for example, there is a tag that points at a blob that is a GPG key, which gave me one hell of a headache when trying to figure out how to round-trip that through Mercurial.
Without gpg signing, you can get really bad repos in general. The old git horror story artile highlights these issues with really specific examples that are more tractable.
Though, I don’t want to start a discussion on how much it sucks to maintain private keys, so sorry for the sidetrack.
I don’t see why GPG-signed commits aren’t vulnerable. You can’t modify the commit body, but if you can get a collision on a file in the repo you can replace that file in-transit and nothing will notice.
Transparently replacing a single source code file definitely counts as ‘compromised’ in my book (although for this attack the file to be replaced would have to have a special prelude - a big but not impossible ask).
Here’s an early mailing list thread where this was brought up (in 2006). Linus’s opinion seemed to be:
Yeah, I don’t think this is at all critical, especially since git really
on a security level doesn’t depend on the hashes being cryptographically
secure. As I explained early on (ie over a year ago, back when the whole
design of git was being discussed), the security of git actually depends
on not cryptographic hashes, but simply on everybody being able to secure
their own private repository.
the security of git actually depends on not cryptographic hashes, but simply on everybody being able to secure their own private repository.
This is a major point that people keep ignoring. If you do one of the following:
then the argument that SHA3, or SHA256 should be used over SHA1 simply doesn’t matter.
And here’s the new thread after today’s announcement
(the first link in Joey Hess’s e-mail is broken, should be https://joeyh.name/blog/entry/sha-1/ )
@Irene or others, maybe merge this in here?
It may be interesting to check out the Google Security Blog post which provides more information than the OP link
Uh, no. It provides exactly the same information.
Wish they had more info on the prototype GPU.