1. 17

    1. 4

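      Well, good question. It comes from the sha1 of the content. If you zlib-decompress the object file and pipe the result (a short header followed by the content) through sha1sum, you get the object name: the first two hex characters are the directory under .git/objects and the rest is the filename.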
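
      A minimal sketch of that computation, assuming the standard loose-object layout for a blob (the header "blob <size>" plus a NUL byte, then the content, hashed before zlib compression). The function names here are made up for illustration; they are not part of git or any library.

      ```python
      import hashlib
      import zlib


      def git_blob_oid(content: bytes, algo: str = "sha1") -> str:
          """Hash a blob the way git names it: header + NUL + content."""
          header = b"blob %d\x00" % len(content)
          return hashlib.new(algo, header + content).hexdigest()


      def loose_object_path(oid: str) -> str:
          """First two hex characters are the directory, the rest is the filename."""
          return f".git/objects/{oid[:2]}/{oid[2:]}"


      def oid_from_loose_bytes(compressed: bytes) -> str:
          """Recover the name from an object file: decompress first, then hash."""
          return hashlib.sha1(zlib.decompress(compressed)).hexdigest()


      if __name__ == "__main__":
          oid = git_blob_oid(b"")
          print(oid, loose_object_path(oid))
      ```

      Passing algo="sha256" gives the name the same bytes would get under the SHA256 object format, on the assumption that only the hash function changes and the header layout stays the same.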

      Interesting to note: This is still SHA1 by default, even if your git installation is up to date [1]. I was confused because I believed that the SHA256 transition had already happened, but you do have to explicitly enable it when creating the repository, with git init --object-format=sha256.

      [1] https://git-scm.com/docs/git-init#Documentation/git-init.txt---object-formatltformatgt

      1. 10

        Last time I checked, server-side support for sha256 repositories in major forges (GitHub, GitLab) was basically non-existent (as in: you can’t push this kind of repo there).

        That’s a really sad state of affairs. It seems that the collision detection variant of sha1 was considered by them “good enough”. (Or maybe someone has better/more recent info on that subject, if so, please share)

        1. 2

          From what I understand, the tradeoff between SHA1 and SHA2 is speed (the former being faster). The way git uses SHA1 (purely for hashing/fingerprinting), it is “good enough”, and collisions are not going to break any security guarantee (no collision resistance is required). On the other hand, using SHA1 where a security guarantee is needed, say in digital signatures or authentication schemes, is problematic, which is why the world has moved to SHA2 there.

          Maybe that is the reason SHA1 is still the default in git, since it is “good enough”, and no security guarantees are necessary.

          1. 6

            But git does use SHA1 for security guarantees, in particular for digital signatures over tags and commits. And a SHA1 collision would also break the guarantee that a git hash identifies a unique commit and all its history.

            Part of the reason there is a lack of urgency in moving git to SHA2 is because git does not use pure SHA1: instead it has extra checks to ensure objects do not tickle SHA1 in a way that can lead to collisions. It also helps that git’s object format is not very malleable, so it would be very difficult to create a collision (unlike PDF).

          2. 4

            > Maybe that is the reason SHA1 is still the default in git, since it is “good enough”, and no security guarantees are necessary.

            Sadly this means that if you sign a git commit or a tag, then even if the signature algorithm is stronger than SHA1, the entire thing still offers only SHA1 guarantees, since the signature is over text with SHA1-based references (parent commit, tree, etc.).

            I’m no cryptographer but there are supply chain security tools which use these signatures (such as sq git) and it may be problematic for them.
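
            To make the point about signatures concrete, here is a rough sketch of the text a commit signature covers, assuming the usual commit layout (tree, parent, author and committer lines, then the message). The helper names and the author identity are hypothetical. The tree ID used is the well-known SHA1 of the empty tree.

            ```python
            import hashlib

            EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # SHA1 of the empty tree


            def commit_payload(tree_oid, parent_oids, author, committer, message):
                """Build the commit text; files and history appear only as hex IDs."""
                lines = [f"tree {tree_oid}"]
                lines += [f"parent {p}" for p in parent_oids]
                lines += [f"author {author}", f"committer {committer}", "", message]
                return "\n".join(lines).encode()


            def commit_oid(payload: bytes) -> str:
                """Name the commit the same way as any object: header + NUL + payload."""
                return hashlib.sha1(b"commit %d\x00" % len(payload) + payload).hexdigest()


            # Hypothetical identity and timestamp, for illustration only.
            who = "Example Author <author@example.com> 1700000000 +0000"
            payload = commit_payload(EMPTY_TREE, [], who, who, "Initial commit\n")

            if __name__ == "__main__":
                print(payload.decode())
                print(commit_oid(payload))
            ```

            A signature computed over this payload never sees the file contents, only the tree and parent IDs, so a different tree with the same SHA1 would verify against the same signature no matter how strong the signature algorithm is.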

    2. 1

      Cool post, combined with https://articles.foletta.org/post/git-under-the-hood/, it makes for a formidable pair to understand “git under the hood”.

    3. 1

      Nice! We do something similar in our Git lecture notes and slides for ISDT.