1. 68
  1. 43

    To quote a friend: To stop offering Mercurial hosting is bad. To delete the repositories is evil.

    1. 12

      I’m not sure about evil, but yeah this sounds bad. It doesn’t seem that either of these two options would be huge amounts of extra investment:

      • automatically convert hg repos to git repos
      • archive hg repos but keep serving a read-only mirror

      I wonder, does archive.org have the means to mirror the public bitbucket hg repositories?

      1. 14

        I do a lot of work around reproducible builds, and find the deletion of public source code to be quite severe. A lot of important projects don’t get regular maintenance, and it takes quite a lot of work to archive source. Converting to git means the inputs to the build process have changed. This might not be a huge deal for people today, but if you’re trying to rebuild something from a decade ago this is a serious problem.
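
        A minimal sketch of why a conversion changes the build inputs, assuming the build pins each source input by a SHA-256 digest (a common practice, but not something stated above). The helper name `verify_pinned_source` and the byte strings standing in for archives are made up for illustration.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a source input's bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_pinned_source(data: bytes, pinned: str) -> bool:
    """Hypothetical check: does the fetched input match the recorded digest?"""
    return digest(data) == pinned


# Made-up stand-ins for the same project obtained from two repository formats.
hg_archive = b"project-1.0 exported from the original hg repository"
git_archive = b"project-1.0 exported from a converted git repository"

# The digest recorded when the build was first pinned.
pinned = digest(hg_archive)

original_ok = verify_pinned_source(hg_archive, pinned)
converted_ok = verify_pinned_source(git_archive, pinned)
```

        Under that assumption, the original input still verifies while the converted one does not, even if the checked-out files are otherwise equivalent.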

        1. 6

          I do a lot of work around reproducible builds

          Don’t most big shops (Linux distros, Mozilla, Google) vendor the universe anyway? Specifically to avoid vanishing source code, or even minor network flakiness during build?

          1. 6

            I’m not sure what you mean by “vendor the universe”, but what I have seen is creation of private forks of public open source projects even if there is no intent to modify the code. This has two benefits.

            1. If the author or the hosting provider (e.g. bitbucket) deletes the repository, you still have access to it
            2. Performing a build only requires that one hosting provider be up rather than N

            1. 7

              Vendoring the universe means to, in your builds or deployments etc, pull all dependencies from source you control.

              1. 2

                “Vendoring” is the act of taking source code from your dependencies and including it in your own source tree. Doing that for the whole universe is what you do if you want to be sure that nobody else can break your build.

              2. 1

                Yes, Google has a third_party/ directory where mirrors of OSS code are stored. There is a team that works on the tools to keep things in sync.

            2. 4

              I think back to Gitorious and how they went down and everything they hosted is gone as well. That’s slightly different as the entire company folded, but there are still some things on there which probably didn’t exist anywhere else, which are now gone.

              I remember looking through my creative commons music once, finding a song I liked and trying to look up the artist and see if they had other stuff. Not only could I not find the artist, I couldn’t find the track! After some digging I found their old ccMixter account, from which they had deleted all their tracks. The CC song I had literally didn’t exist anywhere I could find (at least under that name) that was indexed by Google, DDG or Bing.

              We look at how much new stuff is created each day. I wonder how much stuff is deleted forever.

              1. 1

                I think back to Gitorious and how they went down and everything they hosted is gone as well. That’s slightly different as the entire company folded, but there are still some things on there which probably didn’t exist anywhere else, which are now gone.

                Interesting in this context is the work by Guix and the Software Heritage to store all source archives/repositories used in Guix.

          2. 71

            Here’s a script to migrate your repos to hg.sr.ht:


            1. 4

              I like competition in general. Sometimes I love watching it happen, though. :)

              1. 6

                hey @SirCmpwn thanks a ton for that script. Just imported 19 repos (both git and hg) into sr.ht. Some of those repos were 9 years old. : )

                1. 3

                  Great :)

                2. 1

                  I went ahead and got a subscription for Source Hut even if I may only use it as a mirror for now.

                  I think diversity is good and would like to play more with it in the future :)

                  1. 0

                    I’m trying to do this migration and I’m getting some errors related to jq: “parse error: Invalid string: control characters from U+0000 through U+001F must be escaped at line 2, column 34”

                    Where is the bug tracker, or wherever I can report this to you? :)
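
                    For anyone hitting the same message: it is what a JSON parser reports when a string value contains a raw control character, such as a literal newline. A minimal sketch using Python's standard `json` module rather than jq, with a made-up payload; the actual cause in the migration script is not known from this thread.

```python
import json

# Made-up payload: a JSON string containing a raw (unescaped) newline.
raw = '{"description": "line one\nline two"}'

# The same payload with the newline escaped as the two characters \ and n.
escaped = '{"description": "line one\\nline two"}'


def parses_strictly(text: str) -> bool:
    """Return True if text is accepted by the default (strict) JSON decoder."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


raw_ok = parses_strictly(raw)
escaped_ok = parses_strictly(escaped)

# strict=False makes Python's decoder tolerate control characters in strings.
lenient = json.loads(raw, strict=False)
```

                    The strict decoder rejects the raw payload and accepts the escaped one, so escaping the offending field before it reaches the parser is one way around this class of error.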

                    1. 2

                      Can you pull down the latest script and try again? I haven’t set up a bug tracker for this script.

                  2. 24

                    After working at Bitbucket in the past, I have conflicting feelings about this.

                    I think the hg UI is cleaner, more intuitive, and consistent, but I also think the project itself has suffered from mismanagement and the refusal to pick a direction and move forward, as shown by the 3-4 different branch tooling mindsets and how hard it is to determine “best practices”.

                    The git UI is a nightmare (simply think of how many things git checkout can do) with inconsistent commands and command line flags, but it’s stayed comparatively simple… and out of the box it works incredibly well.

                    From a technical standpoint, I understand the desire of Bitbucket to drop support for hg (though I don’t think dropping it in the manner they are is good, particularly for open source projects, both legacy and current). The APIs that were written had to be purposefully limited because they had to support both hg and git, but both of them have different branching models (unless you’re using bookmarks, but then you have 2 types of branches with hg). Any optimizations need to be done separately for both git and hg. Plus, if you want to move to a different storage solution, it would need to be implemented for both git and hg.

                    All of this adds up to what honestly makes plenty of business sense: there aren’t enough hg users to warrant the complexity and bugs introduced by supporting 2 VCSs.

                    This is rambling on a bit more than I originally intended…

                    In any sense, I like hg… I really do… but I don’t think there’s any surprise here.

                    1. 12

                      simply think of how many things git checkout can do

                      Happily these issues are slowly being fixed.

                      1. 2

                        That’s awesome to see! Though that was just used as an example. These are tongue in cheek, but provide many more examples: http://stevelosh.com/blog/2013/04/git-koans/ In particular I find the Hobgoblin section the most interesting.

                      2. 4

                        I’m just hopeful that heptapod or sourcehut finally pick up. Not having a good place to host hg has been a big problem for many years; bb has been a bad hg storage location long before they came to this decision.

                        1. 1

                          I wanted to self-host sourcehut, both for hg support and for the email-focused workflow. But after looking at the installation instructions, I decided my time is too limited these days, and installed gitea and migrated my hg repositories to git. Unfortunate, because I do slightly prefer hg to git.

                          1. 4

                            These stories just make me sad. I wish people wouldn’t tell me how awful the Mercurial situation is and that this is why I can’t share hg repos with anyone anymore.

                            I mean, I know you felt compelled for some reason to tell me how there was no hope, so I can’t tell you to not tell me that. I just wish you hadn’t.

                            1. 1

                              This seems to be similar to my situation as well - a while back I sent in a few patches to make it slightly easier to run (sharing a config file, I think logging config files) but they just sort of sat there until I was told they didn’t add enough value to be useful.

                              Additionally, when I asked if patches to add docker support would be accepted, I was told there was no interest by the lead developer and that running services like this wasn’t the use case for docker.

                              It’s a really interesting project, but after being shut down twice like that, I have very little desire to go back to it.

                          2. 1

                            To be honest it would have made a lot of sense for BB to pick bookmarks as its officially recommended branching model and encourage hg ‘branches’ to die out.

                            1. 4

                              It would have helped a lot if bitbucket had gone “all-in” with hg like Github did with git. Github defined pull requests, vaguely inspired by a git command of the same name that does something very different than how people think of a pull request now.

                              Atlassian instead, from the very beginning when they first bought Bitbucket, decided to start playing follow the leader and catching up with Github, down to watering down the greatest distinguishing feature Bitbucket had: it was Mercurial-only. Bitbucket could have been involved in Mercurial innovation, dedicating more than a single developer to integrating new and exciting workflows like Mercurial Evolve, but they never went all-in.

                              I am unhappy with Atlassian. Just like Github is mostly responsible for git’s popularity, to the point that a lot of people confuse git with github, Bitbucket’s lukewarm approach to Mercurial is the main reason its usage has fallen so drastically. When @belak laments that there are no Mercurial users, let me tell you: there were. Bitbucket just worked very hard to drive them all to git.

                              1. 3

                                I agree that Bitbucket’s lukewarm approach has not helped, but I don’t think it’s the main reason as you stated.

                                hg evolve is still an extension, multiple years later. Part of it is the half-in approach you mentioned, but part of it is that when something is an extension (and not even shipped with actual hg releases), there’s a level of confidence you don’t have in it.

                                Part of why Atlassian was hesitant to go all-in is because when suggestions, proposals, or even donations were made, it took months, sometimes years for anything at all to happen. Even when Sean went to sprints as a representative of Atlassian, this was still the case.

                                You’re welcome to blame Atlassian. I don’t think anything I have to say would change your mind anyway… but I think there’s a strong possibility that there were other factors that contributed more to hg’s fall in usage than Bitbucket.

                                1. 1

                                  Git has also moved slowly… it’s been, what, 14 years and only this year did they finally do something about how complex git checkout is?

                                  I don’t know so much about the relationship between github and git, but I don’t get the impression that the git devs had a very close relationship with Github and quickly accepted their collaboration. Maybe they did and that’s the difference.

                                  Maybe you’re right and I’m just frustrated at the wrong thing. But I never liked how the first thing Atlassian did was add git support. To me that seems like they almost immediately conceded defeat on the Mercurial front. And with nowhere to host it, it becomes a self-fulfilling prophecy of fewer users, less development involvement, slower changes.

                          3. 20

                            Sad :-( I still think Mercurial far better meets the needs of most people, and that the chief reasons for git’s popularity are that Linus Torvalds wrote it, GitHub, and that Linus Torvalds wrote it.

                            That said, I did end up switching from BitBucket/mercurial to GitHub/git a few years ago, simply because it’s the more pragmatic thing to do and I was tired of paying the “mercurial penalty” in missed patches and the like. I wrote a thing about it a while ago: https://arp242.net/git-hg.html

                            1. 6

                              Why do you think hg is better for most people? I honestly find it vastly more complex to use.

                              1. 15

                                The hg cli is light years ahead of git in terms of intuitiveness.

                                1. 6

                                  I’d say it’s years behind ;)

                                  1. 10

                                    How long have you been using Mercurial? I find most people who dislike Mercurial’s UI are mainly coming from years of experience with Git. I disliked Mercurial at first as well, but after a few years of forced usage it clicked. Now I appreciate how simple and well composed it is and get frustrated whenever I need to look up some arcane Git flag on StackOverflow.

                                    In general, I’d say you need several years experience with both Git and Mercurial before you can draw a fair comparison.

                                    1. 3

                                      I used mercurial for about 2 years before using git.

                                      1. 3

                                        Sorry if my post came across a bit accusatory (not my intent). In that case I guess to each their own :).

                                      2. 3

                                        but after a few years of forced usage it clicked.

                                        I’m pretty sure that git clicked for me in a much shorter timeframe.

                                        1. 1

                                          Me too, but I know many (otherwise perfectly competent engineers) 5-10 years in who still don’t get it and aren’t likely to.

                                      3. 9

                                        I’m going to strongly disagree. I’ve used git intensively and I find Mercurial to be a well-designed delight. I’ve run across features that Mercurial supports flawlessly, with a nice UI, and Git requires a hacky filter-branch that takes hours to run and doesn’t even behave correctly.

                                        IMO, a lot of the badness in projects is down to Git badness. It doesn’t scale and people feel compelled to break things down into tiny sub-projects.

                                        The only reason Git is winning anything is GitHub’s support of it.

                                        1. 3

                                          The only reason Git is winning anything is GitHub’s support of it.

                                          Why then was github ever used in the first place? Kind of a strange proposition.

                                          1. 1

                                            Network effect of the social network is pretty important.

                                            1. 1

                                              Why would there ever be a network effect in the first place if git was so bad that github was the only reason to use it? I get that the argument technically holds but it seems very unlikely.

                                    2. 8

                                      You find mercurial more complex to use than git? That’s an… unusual view, to say the least. The usual recitation of benefits goes something like this

                                      • Orthogonal functionality in hg mostly has orthogonal commands (compare git commit, which does a half-dozen essentially unrelated different things).
                                      • hg has a somewhat more uniform CLI (compare git branch -a, git remote -v, git stash list).
                                      • hg either lacks or hides a bunch of purportedly-inessential and potentially confusing git functionality (off the top of my head, partial commits aren’t baked into the flow a la git’s index/staging area; and rebasing and history rewriting are hidden behind an extension).

                                      I personally prefer git, but not because I think it’s easier or simpler; I’m more familiar with it, and I find many of those purportedly-inessential functions to be merely purportedly, not actually, inessential.

                                      1. 5

                                        One more thing I like about mercurial is that the default set of commands is enough for >90% of people, and that everything else is “hidden” in extensions. This is a very different approach than git’s “kitchen-sink” approach, which gives people 170 commands (vs. Mercurial’s 50, most of which also have far fewer options/switches than git).

                                        Git very much feels like “bloatware” compared to Mercurial.

                                        1. 3

                                          I used git for many years, and then mercurial (at FB) ever since we switched over. The cli interface for mercurial is definitely more sensible, crecord is delightful, and overall it was fine. But I was never able to build a mental model of how mercurial actually worked. git has a terrible interface, but it’s actually really simple underneath.

                                          1. 1

                                            I didn’t think that underneath they were different enough to matter much. What differences do you mean? I guess there’s git’s remote tracking stuff. Generally, it seems like they differ in how to refer to and track commits and topological branches, locally and remotely. (IMHO, neither has great mechanisms for all the things I want to do.) Mercurial is slightly more complex with the manifest, git is more complex with the staging area that feels absolutely critical until you don’t have it (by using hg), at which time you wonder why anyone bothers with it. I’m a heavier hg user than git user, but that’s about all I can come up with.

                                          2. 2

                                            You find mercurial more complex to use than git?

                                            I actually found – in a professional shop – mercurial far more complex to use. Now, the fact is that Mercurial’s core, vanilla hg, is IMHO absolutely without doubt vastly superior to git. Git keeps trying to make the porcelain less painful (including a release just a bit ago) – but I still think it is ages behind.

                                            The problem is – I never used vanilla mercurial in a professional environment. Not once. It was always mercurial++ (we used $X extension and $Y extension and do it like $Z) which meant even if I knew hg, I felt painfully inexperienced because I didn’t know mq, share, attic, collapse, evolve, and more… not to mention that both of the bigger shops I worked with that used mercurial had completely custom workflow extensions. I suspect part of this was just the ease of writing mercurial extensions, and part of it was wanting to fall into a flow they knew (mq, collapse). But, regardless of how we got there, at each place I effectively felt like I had to relearn how to use the version control system entirely.

                                            As opposed to git, wherein I can just drop in and work from day one. It might be less clean, it might be more finicky and enable things like history rewriting by default. But at the end of the day, the day I start, I know how to generally function.

                                            I am curious how Mercurial would have fared if, instead of shipping default extensions you had to turn on, they had just baked in a little more functionality to try to cover the 80% of what most shops wanted (not needed, I think most could have gotten by with what vanilla mercurial had), and whether the shop to shop transition would have been easier.

                                            1. 2

                                              mq, I think, is responsible for many of the “mercurial is too complicated” complaints people have. Evolve, if it ever stabilizes and ships with core hg, would really enable some killer capabilities. Sadly, for social and technical reasons it’s perpetually in beta.

                                            2. 1

                                              Whoa, no index? Admittedly I didn’t really use the index as intended for several years, but now it’s an important part of my workflow.

                                              1. 1

                                                In Mercurial, commits are so much easier to make and manipulate (split, fold, move), that you don’t miss the index. The index in git is just a limited special cased “commit”.

                                                1. 3

                                                  The index in git is just a limited special cased “commit”.

                                                  I disagree.

                                                  The index is a useful way to say “these lines of code are ready to go”. If you are making a big commit, it can be helpful to add changes in logical blocks to the index as you go. Then the diff is not polluted with stuff you know is already fine to commit.

                                                  You might say, “why not just make those changes their own commits, instead of trying to do one big commit?” That’s a valid question if you are talking about a 200 line commit or similar, but sometimes the “big” commit is only 50 lines. Instead of making a bunch of one line or few line commits, it’s helpful to “git add” small chunks, then commit at the end.

                                                  1. 0

                                                    You can as well amend to a commit instead of adding to the index.

                                                    1. 3

                                                      True, but all that’s doing is bastardizing the commit process. If you are committing a one line change, just to rebase minutes or hours later, that’s not a commit.

                                                      Rebase to me is for commits that were intended to be commits, but later I decided it would be better to squash or change the history. The index is for changes that are never meant to be a full commit on their own.

                                                      1. 1

                                                        Having a distinction between draft and published phases in mercurial I think makes it easier to rewrite WIP work. There’s also a number of UI affordances for it. I don’t miss the index using mercurial. There’s also academic user interface research that shows the index is a big conceptual barrier for new users.

                                                        1. 1

                                                          There’s also academic user interface research that shows the index is a big conceptual barrier for new users.

                                                          This isn’t really a valid point in my opinion. Some concepts are just difficult. If some goal can be achieved in a simpler way I am on board, but I am not a fan of removing useful features because they are hard to understand.

                                                          1. 1

                                                            But the point is the index is hard to understand and unnecessary.

                                                            There’s no need to have a “commit process”. Just commit whatever you want and rewrite/amend it for as long as you want. As long as your commits are drafts, this is fine.

                                                            Is the problem the word “commit”? Does it sound too much like commitment?

                                                            There’s no need to have two separate ways to record changes, an index, and a commit, each with different degrees of commitments. This is multiplying entities beyond necessity.

                                                            1. 1

                                                              That’s your opinion. The index is quite useful to me. I’d rather make a proper commit once it’s ready, not hack together a bunch of one line commits after the fact.

                                                              1. 2

                                                                The index is a commit. Why have two separate ways of storing the same sort of thing?

                                                                Also, it’s not my opinion that it’s hard to understand and unnecessary; it’s the result of usability studies:


                                                                You’re also not “hacking together” anything after the fact. There’s no more hacking together after the fact whether you use git amend (hypothetically) or git add. Both of those mean, “record additional changes”.

                                                                1. 0

                                                                  It seems you have a fundamental misunderstanding of the difference between add and commit. Commit requires a commit message.

                                                                  1. 1

                                                                    This isn’t a useful distinction. You can also create commits with empty commit messages in both git and Mercurial.

                                                                    With both git and Mercurial you can also amend commit messages after the fact. The index in git could well be implemented as a commit with an empty commit message that you keep amending and you wouldn’t notice the difference at all.
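
                                                                    A toy sketch of that claim, not how git actually stores the index: the class `ToyRepo` and its methods are made up for illustration, and a “commit” here is just a message plus a dict of changes. It compares staging pieces and committing once against starting from an empty-message commit and amending into it.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyRepo:
    """Toy model only: a commit is a (message, changes) pair."""
    commits: list = field(default_factory=list)
    index: dict = field(default_factory=dict)

    def add(self, path: str, content: str) -> None:
        """Index-style workflow: stage one piece of work."""
        self.index[path] = content

    def commit(self, message: str) -> None:
        """Record everything currently staged as a new commit."""
        self.commits.append((message, dict(self.index)))
        self.index.clear()

    def amend(self, path: str, content: str, message: Optional[str] = None) -> None:
        """Amend-style workflow: fold a change into the latest commit."""
        old_message, changes = self.commits.pop()
        new_message = old_message if message is None else message
        self.commits.append((new_message, {**changes, path: content}))


# Workflow 1: stage two pieces, then commit once.
staged = ToyRepo()
staged.add("a.py", "fix A")
staged.add("b.py", "fix B")
staged.commit("Fix A and B")

# Workflow 2: an empty-message draft commit plays the role of the index.
amended = ToyRepo()
amended.commit("")
amended.amend("a.py", "fix A")
amended.amend("b.py", "fix B", message="Fix A and B")
```

                                                                    In this model both workflows end with the same single commit; whether the real UIs make the two feel equivalent is exactly what is being argued here.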

                                                                    1. 1

                                                                      you keep amending and you wouldn’t notice the difference at all.

                                                                      Yeah, you would. Again, it seems that you either don’t know git, or haven’t used it in some time. When you amend a commit, you are prompted to amend the message as well. That’s another facet that doesn’t exist with git add, because add doesn’t involve a message.

                                                                      If you wish to contort git internals to suit your agenda that’s fine, but git add has perfectly valid use cases.

                                                                      1. 0

                                                                        you are prompted to amend the message as well.

                                                                        This is UI clutter unrelated to the underlying concepts. You can get around that with wrappers and aliases. I spoke of a hypothetical git amend above that could be an alias that avoids prompting for a commit message.

                                                                        Don’t git users like to say how the UI is incidental? That once you understand the data structures, everything else is easy? The UI seems to have locked you into believing the index is a fundamentally necessary concept, but it’s not. It’s an artifact of the UI.

                                                                        1. 1

                                                                          The UI seems to have locked you into believing the index is a fundamentally necessary concept, but it’s not.

                                                                          Nothing has locked me into believing it’s a necessary concept. It’s not necessary. In fact, for about 7 years I didn’t use the index in any meaningful way.

                                                                          I think what you are missing is that I’m not compelled to use it because it’s the default workflow, I am compelled to use it because it’s useful. It helps me accomplish work more smoothly than I did previously, when I would just make a bunch of tiny commits because I didn’t understand the point of the index, as you still don’t.

                                                                          The argument could be made to move the index into an option, like somehow making commit-only the default workflow. I’m not sure what that would look like with Git, but I don’t think it’s a good idea. It would just encourage people to make a bunch of smaller commits with meaningless commit messages.

                                                                    2. 1

                                                                      You have a set of things you want to accomplish. With git, you have N+1 concepts/features/tools to work with. With hg, you have N (because you drop the index). That means you have to expand your usage of the remaining N.

                                                                      Specifically, since you no longer have this extra index concept, you now expand commits to cover the scenarios you need. Normally, you’d make an initial commit and then amend a piece at a time (probably with the interactive curses hunk selector, which is awesome.) If you’re unsure about some pieces, or you have multiple things going on that you’d like to end up in separate commits, you can always make a series of microcommits and then selectively collapse them later. (In practice, it’s even easier than this, because of the absorb extension. But never mind that.)

                                                                      Yes, those microcommits need commit messages. They don’t need to be good ones, because they’re temporary until you squash them out of existence. I usually use a one word tag to specify which of the separate final commits they belong to. (If you don’t have separate final commits, you may as well amend, in which case no messages are needed.)

                                                                      …or on the other hand, maybe mercurial ends up with N+1 concepts too, because phases really help in keeping things separate. As I understand it, one reason git users love the index is because it keeps rapidly changing, work in progress stuff separate from mostly set in stone commits. Phases perform the same purpose, but more flexibly, and the concepts are more orthogonal so they compose better. In my opinion.

                                            3. 6

                                              I never particularly liked git and find it unintuitive, too.

                                              I wouldn’t consider myself a git poweruser. But whenever I had to work with alternatives I got the feeling that they’re just inferior versions of git. Yeah, maybe the usage was a bit more intuitive, but all of them seemed to lack things that I’d consider really basic (bisecting - hg has that, but e.g. svn has not - and shallow copying - not available in hg - are examples of what I often miss).

                                              1. 3

                                                Mercurial was actually my first DVCS, and like you I ended up switching to git not out of a sense that it was technically better, just more pragmatic. For me, the change is more of a mixed bag, though. It is definitely the case that Mercurial’s UI is worlds better, and revsets in particular are an amazing feature that I sorely miss, but when I made the switch I found that the way git handles branches was much more intuitive to me than Mercurial’s branch/bookmark system, and that the affordances around selectively editing commit histories were very much worth the risk in terms of being able to manage the narrative of a project’s history in a way that makes it more legible to others. Ultimately, I found that git’s advantages outweighed its downsides for my use case, since learning its UI idiosyncrasies was a one-time cost and since managing branches is a much more common occurrence for me than using revsets. That said, I think this is a really unfortunate development.

                                                1. 2

                                                  I occasionally convert people’s git repos to hg for my use. Stubborn like that.

                                                2. 16

                                                  No need for my Atlassian account anymore…

                                                  1. 15

                                                    Agree. The only reason I had a BitBucket account was my mercurial repositories.

                                                    If only Atlassian could sunset JIRA. That would be nice…

                                                    1. 12

                                                      If only Atlassian could sunset JIRA. That would be nice…

                                                      Like all right-thinking people, I detest JIRA and every microsecond I spend in it feels like a million agonizing years, but what’s the alternative for bug tracking? Most software of this ilk is not purchased by the people who have to use it, so it responds not to actual user pressure, but to CTO sales pressure. That’s my pet theory about why enterprise software is uniformly terrible, at least.

                                                      1. 6

                                                        That’s my pet theory about why enterprise software is uniformly terrible, at least.

                                                        That’s quite close to the theory of the old-timers I’ve asked about it, but there’s an important difference.

                                                        CTOs ask consultants what software they should use. Consultants who recommend software that’s simple and easily configured go out of business, because most of the money is in helping clients configure/install/start using software.

                                                        1. 3

                                                          I like Phabricator much better, and it’s free software too.

                                                          1. 2

                                                            GitHub issues are fine.

                                                          2. 1

                                                            I do not understand the hate against JIRA. I think it is good software with many useful features. Yes, it can be abused to make tracking your issues really bad, but that is a problem of those who use the software and not the software itself.

                                                          3. 4

                                                            Good luck actually closing your Atlassian account though :-( I’ve tried to do it many times but still get email from them occasionally when they discover vulnerabilities in products I’ve never used.

                                                          4. 10

                                                            End of an era, I suppose. There’s no such thing as a healthy monoculture, just monocultures that haven’t found their blight yet.

                                                            1. 33

                                                              According to that Stack Overflow survey there are plenty of popular alternatives, such as “ZIP file back-ups”, “Copying and pasting files to network shares”, and “I don’t use version control”.

                                                              1. 18

                                                                Did the ever popular “Final FINAL Copy 2” make it in? That’s got to be the number one version control system, right?

                                                                1. 9

                                                                  My first programming job was like that. I was working as a repair tech at a computer shop, at some point they needed someone to clean up the intranet, and that someone was me.

                                                                  First thing I did was set up svn and get rid of all the “foo.orig”, “foo.orig2”, etc directories. This was trickier than it might seem, as some of the projects were being served from those .orig2 dirs.

                                                                  All was going well, and then half a year later the guy who had been doing this asked me if I knew what happened to the “cms.orig” directory. After I told him I deleted it he said he had been storing the company outings photos there for the last 15 years. By the time we discovered it, it was too late to recover from backup, so … all lost.

                                                                  I still don’t understand why you would store pictures in a deeply nested subdir of an otherwise unused cms.orig …. 🤨 from what I heard he ditched svn and went back to his “system” after I left.

                                                              2. 13

                                                                Well, just to play a devil’s advocate… Some things are so good exactly because they are ubiquitous. Like Unicode, for example. Or POSIX. They have their flaws for sure, but they made writing interoperable software much easier.

                                                                There are already other tools that work with the same repository format as the Torvalds’ git. Maybe git format becoming the standard repo format is a good thing after all. No one has to use the reference implementation if they prefer different UI and abstractions.

                                                                1. 14

                                                                  Maybe git format becoming the standard repo format is a good thing after all.

                                                                  No, it’s definitely not. It doesn’t scale. The git “API” is literally the local filesystem. Microsoft has valiantly hacked the format into functioning at scale with VFS for Git, but the approach is totally bananas.

                                                                  How does it work?

                                                                  VFS for Git virtualizes the filesystem beneath your Git repository so that Git tools see what appears to be a normal repository when, in fact, the files are not actually present on disk. VFS for Git only downloads files as they are needed.

                                                                  VFS for Git also manages Git’s internal state so that it only considers the files you have accessed, instead of having to examine every file in the repository. This ensures that operations like status and checkout are as fast as possible.

                                                                  - vfsforgit.org

                                                                  Microsoft had to implement an entire virtual filesystem that, through smoke and mirrors, tricks git to behave sanely. More details in this GVFS architecture overview.
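
                                                                  The idea in the quoted description (the whole tree looks present, but contents are downloaded only on first access) can be illustrated with a toy Python model. This is a sketch under loud assumptions: `LazyWorkingTree` and its `fetch` callback are hypothetical names made up for illustration, and this is not how VFS for Git is actually implemented.

```python
from typing import Callable, Dict, List


class LazyWorkingTree:
    """Toy model: every path is listed up front, contents are fetched on first read."""

    def __init__(self, paths: List[str], fetch: Callable[[str], bytes]) -> None:
        self._paths = list(paths)
        self._fetch = fetch  # hypothetical stand-in for a remote download
        self._cache: Dict[str, bytes] = {}

    def listdir(self) -> List[str]:
        # The full tree appears to exist even though nothing was downloaded yet.
        return list(self._paths)

    def read(self, path: str) -> bytes:
        if path not in self._paths:
            raise FileNotFoundError(path)
        if path not in self._cache:
            # First access: download the contents and remember them.
            self._cache[path] = self._fetch(path)
        return self._cache[path]

    def touched(self) -> List[str]:
        # Only files that were actually accessed; a status-like operation
        # could restrict itself to these instead of scanning everything.
        return sorted(self._cache)
```

                                                                  The real system has to do this underneath the filesystem interface, which is where the complexity discussed here comes from.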

                                                                  1. 4

                                                                    Isn’t the same true for Mercurial and every other DVCS in existence?

                                                                    1. 17

                                                                      No. Git’s remote repository API is nothing more than a specialized rsync implementation (git-send-pack and git-receive-pack).

                                                                      Mercurial uses a semantic API for exchanging changes with the server. It doesn’t need local files in the same way git does. That opens up a lot of doors for scaling large repositories, because you can implement optimizations in the client, protocol, and server.

                                                                      For git repos, where local filesystem operations are the protocol, there really is no alternative to Microsoft’s smoke and mirrors, virtualize the world approach. You’d have to just reimplement git, which defeats the point.

                                                                      1. 1

                                                                        Ah, that is interesting. Thanks for the information, I should look into the way mercurial actually works.

                                                                        1. 3

                                                                          If you’re curious about the actual on-disk formats (which should be irrelevant, as hg tries to compartmentalise them), you can read about Mercurial’s internals.

                                                                    2. 4

                                                                      I don’t see anything wrong with git using the local file system API.

                                                                      There are multiple implementations of such file systems – Linux, FreeBSD, OS X, Minix, etc. git works fine on all those systems, and the code is portable AFAICT.

                                                                      1. 8

                                                                        So, I personally love how git transparently exposes its internal data structures for direct manipulation by the user. It gives you tons of power and freedom. Back when I used git, I considered it just as much a development tool as my editor.

                                                                        But that transparency is problematic for scaling. To the point where you really do need to implement a virtual remote filesystem tailored for git to support huge repos. Whether you like git or not, that’s bananas.

                                                                        1. 5

                                                                          There’s nothing bananas about that: scaling is a feature and it’s not surprising that you need more code/engineering to scale. It would be surprising if you didn’t!

                                                                          To make a very close analogy, two companies I worked at used Perforce (the proprietary VCS). At one company we used it out of the box, and it worked great. Thousands of companies use Perforce like this, and Perforce is a very profitable company as a result.

                                                                          The second company (Google) also used Perforce out of the box. Then we needed to scale more, so we wrote a FUSE-based VFS (which I imagine the git VFS is very similar to). That doesn’t mean Perforce is “bananas”. It works for 99% of companies.

                                                                          It’s just designed for a certain scale, just like git is. Scale requires a lot of tradeoffs, often higher latency, decreased throughput, etc. git seems to have made all the right tradeoffs for its targeted design space. That it succeeded beyond the initial use case is a success story, not an indication of problems with its initial design.

                                                                          Also, I don’t see any evidence that Mercurial reached the same scale. It probably has different problems – you don’t really know until you try it. I heard some teams were working on scaling Mercurial quite a while ago [1], but I’m not sure what happened.


                                                                          1. 4

                                                                            Then we needed to scale more, so we wrote a FUSE-based VFS

                                                                            I currently work at Google. CitC has nothing to do with Piper performance, it’s more about the utility of sharing your workspace, both between dev machines and tools (desktop, cloudtop, cider, critique), as well as blaze.

                                                                            (which I imagine the git VFS is very similar to).

                                                                            Not at all. The git “protocol” is filesystem operations. Microsoft made VFS for Git because they need to intercept filesystem operations to interface with the git toolchain. Perforce and Mercurial have actual remote APIs, git does not.

                                                                            That doesn’t mean Perforce is “bananas”. It works for 99% of companies.

                                                                            I don’t think Perforce is bananas. I don’t think git is bananas either. I specifically think “git format becoming the standard repo format” is NOT a good thing. The git toolchain and the repo format are inseparable, leading to Microsoft’s bananas implementation of a scalable git server. Clever and impressive, but bananas.

                                                                            1. 2

                                                                              What I’m reading from your comments is: “If only git had decoupled its repo format and push/pull protocol, then it would be more scalable”.

                                                                              I don’t think that’s true. You would just run into DIFFERENT scalability limits with different design decisions. For example: Perforce and Mercurial don’t share that design decision, as you say, but they still have scalability limits. Those designs just have different bottlenecks.

                                                                              Designing for scale you don’t have is an antipattern. If literally the only company that has to use a git VFS is Microsoft, then that’s a fantastic tradeoff!!!

                                                                              IMO Google’s dev tools are a great example of the tradeoff. They suffer from scalability. They scale and solve unique problems, but are slow as molasses in the common case (speaking from my experience as someone who worked both on the dev tools team and was a user of those tools for 11 years)

                                                                              1. 2

                                                                                I don’t think that’s true. You would just run into DIFFERENT scalability limits with different design decisions.

                                                                                Git was probably strongly tied to the filesystem because it was made in 2005 (Pentium 4 era) for a lower-performance scenario by someone who understood the Linux filesystem better than high-performance, distributed applications. It worked for his and their purposes of managing their one project at their pace. Then, wider adoption and design inertia followed.

                                                                                It’s 2019. Deploying new capabilities backwards compatible with the 2005 design requires higher, crazier efforts with less-exciting results delivered than better or more modern designs.

                                                                                1. 1

                                                                                  “If only git had decoupled its repo format and push/pull protocol, then it would be more scalable”.

                                                                                  It would be easier to scale. When the simplest and easiest way to scale genuinely is implementing a client-side virtual filesystem to intercept actions performed by git clients, that’s bananas. To be clear, VFS for Git is more than a simple git-aware network filesystem, there’s some gnarly smoke and mirrors trickery to make it actually work. The git core code is so tightly coupled to the file format, there’s little else you could do, especially if you don’t want to break other tooling using libraries like libgit2 or jgit.

                                                                                  Designing for scale you don’t have is an antipattern.

                                                                                  Designing tightly coupled components with leaky abstractions is an antipattern. Mercurial supports Piper at Google through a plugin. Doing the same with git just isn’t possible, there’s no API boundary to work with.

                                                                          2. 3

                                                                            To the best of my knowledge it still uses the intricate knowledge of filesystem behaviour to avoid (most?) fsync calls — and the behaviour it expects is ext3 (which is better at preserving operation order in case of crash than most other filesystems).

                                                                            I actually had a hard poweroff during/just after commit corrupt a git repository.

                                                                            So, in a way, the API it actually expects is often not provided…

                                                                            1. 1

                                                                              Do you mean its reliance on atomic rename when managing refs? Or some other behavior?

                                                                              1. 3

                                                                                I would hope that atomic renames are actually expected from a fully-featured POSIX FS (promised by rename manual page etc).

                                                                                But Git also assumes some ordering of file content writes without using fsync.

                                                                                1. 2

                                                                                  Renames are only atomic with respect to visibility in a running filesystem, not crash safety, though. So I guess it’s not surprising you’ve seen corruption on crash.

                                                                                  I’ve been trying to find a way to run a small-scale HA Git server at work — as far as I can tell the only viable option is to replace the whole backend as a unit (e.g., GitHub DGit/Spokes, GitLab’s Gitaly plans). Both GitHub and GitLab started by replicating at a lower level (drbd and NFS, respectively), but moved on. I can say from experience that GlusterFS doesn’t provide whatever Git requires, either.

                                                                                  1. 1

                                                                                    There is at least some pressure to provide metadata atomicity on crash (you cannot work around needing that, and journalled filesystems have a natural way to provide it), but obviously data consistency is often the same as without rename. And indeed there are no persistency order guarantees.

                                                                                    Systems like Monotone or Fossil outsource the consistency problem to SQLite — which uses carefully ordered fsync calls to make sure things are consistent — but Git prioritised speed over portability outside ext3. (Mercurial is also not doing fsync, though)

                                                                                    And if ext4 doesn’t provide what Git needs, of course GlusterFS won’t…
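
                                                                                    To make the distinction between rename atomicity and crash safety concrete, here is a minimal Python sketch of the usual "write, fsync, rename, fsync the directory" pattern. It is an illustration under stated assumptions, not how Git, Mercurial or SQLite are implemented: `durable_write` is a hypothetical helper, and the directory fsync assumes a POSIX system.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Replace `path` with `data` so that a crash leaves the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory so the rename
    # never crosses a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # contents reach disk before the rename
        os.replace(tmp_path, path)  # atomic with respect to visibility
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Persist the rename itself by syncing the containing directory (POSIX only).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

                                                                                    Skipping the two fsync calls is faster, but then the result after a power loss depends on how the filesystem orders writes, which is the behaviour being discussed here.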

                                                                    3. 10

                                                                      Eating the world considered harmful.

                                                                      Mercurial seems to have a lot of sentimental support — being the saner and more intuitive DVCS

                                                                      This is one of those things that’s just repeated over and over and over and assumed to be true. Kind of like “Vim HEAD runs on Windows 95 (and OS/2)!” was assumed to be true in 2013 (spoiler: no one actually tried).

                                                                      1. 7

                                                                        This is one of those things that’s just repeated over and over and over and assumed to be true.

                                                                        For the most part it is definitely true. The git command line is notoriously inconsistent, whereas hg is not. It’s also harder to shoot yourself in the foot with hg.

                                                                        The one thing that (IMO) is a mess with hg is branching, which day to day has far more impact than remembering an inconsistent command interface. At a previous employer (who I’d convinced to move to hg from svn) we ended up with a horribly complex bookmarking strategy to deal with short-lived feature branches.

                                                                        I’m marginally sad that hg is losing, because it got some things very right. But git on the whole gets more of the things that matter right, despite the warts.

                                                                        1. 6

                                                                          The command line is only one part of it, though.

                                                                          Conceptually, I find git much easier to understand. I “get it” that it’s a DAG and I can create branches, name them, push them, switch between them, etc. Maybe the command line is obtuse, but nowadays I just use magit anyway.

                                                                          I spent almost 6 years using Mercurial at a previous job, and never really understood stuff like branching. To be fair, Kiln’s weird forking/branching didn’t help, but it was only part of a bigger problem.

                                                                          1. 4

                                                                            It’s also harder to shoot yourself in the foot with hg.

                                                                            My intention is not to put words in your mouth, but IME the Git footguns are somewhat exaggerated.

                                                                            Yes, you can squash into a merge commit and mess up the parentage vis a vis the remote, and make other messes, but Git also comes with great recovery tools.

                                                                            If you manage to make a mess, you can clean it up by cherry-picking into a temp branch and resetting your mess to that --hard. In other instances the reflog can be useful.

                                                                            Also not saying a rebase pulling in commits from reflog is Git 101, and many people may just rebuild the commits from their editor’s undo buffer, or from memory.

                                                                            I am saying that every argument wrt footguns should come with the note that you should tag backups of your state before a risky endeavor, and that recovery is possible.

                                                                            Personally I’ve had only a fistful of real messes (mainly accidents made so tired I should not have been working) during my maybe 10-12 years of Git usage, and remembering the recovery strategies, I can’t remember once rebuilding the commits.

                                                                            1. 2

                                                                              Just because git gives you the tools to fix the issue, doesn’t mean that the initial mistake is less likely.

                                                                              It’s been a while since I used hg in anger, but from memory there’s no real way to accidentally merge the wrong remote branch into your local branch (because in general you push/pull the whole repo, so the concept of ‘remote branch’ doesn’t really exist). I’ve done this a ton of times with git, simply through muscle memory.

                                                                              Sure, it’s not a catastrophic mistake, but it’s still one that git allows you to make.

                                                                              As I said, I’m in the git camp. Recently I had to merge multiple repos into one, taking only one branch from each, and retaining the history. It was surprisingly painless in git (thanks to merge --allow-unrelated-histories). I’m not sure how it would have gone with hg, but I assume it would have been a lot more painful.

                                                                              1. 1

                                                                                I’ve had one moment of panic with git. That was when I did git reset --hard on a file after editing it and then remembered I needed those changes. But then I remembered that I had added the changes to the index, and was able to recover them by looking at the dangling objects there.

                                                                            2. 4

                                                                              Author here: this is the main argument I got from Mercurial users (scrolling through the comments). I use Git myself (and only glanced at Mercurial and have no opinion on it). So maybe my sentence (wrongly) implies that I think this, but I mean to say that Mercurial users tend to give this argument.

                                                                              1. 4

                                                                                I’ve used a variety of VCS over the years, including both git & hg, the latter of which has always felt easier to use than the former - not so much because hg is particularly easy but because git is particularly confusing. It’s kinda like how Linus’ other major project was for many years until Ubuntu decided to make it easier to use.

                                                                            3. 5

                                                                              As much as I prefer git to mercurial, I think this is very unfortunate news. More competition in the space is always better for users, and I suspect this will consign mercurial to nigh-CVS levels of relevance ;).

                                                                              1. 8

                                                                                To me Mercurial was nothing but a barrier to collaboration. Maybe it’d be fine in a parallel universe where it won, but in this one it was yet another odd tool to learn.

                                                                                I’ve been told it’s much better than git, but when I tried it I found it merely had a different set of quirks. I’ve created a branch and couldn’t find a way to delete it (apparently there’s closing vs stripping, and all my experimental garbage branches called asdfsadlkjl were supposed to keep permanent tombstones? No thanks). I’ve tried to clean it up by rebasing, but it required adding a plugin in 3 places.

                                                                                1. 7

                                                                                  Mercurial “branches” are very different from git branches. Yeah, the naming is confusing now because git won, and everybody thinks in terms of git naming conventions. In my experience, mercurial is less inherently quirky than git. As other people have said, the UI is much much cleaner for hg. git and hg are almost identical internally, hence why it’s possible to use git-hg and hg-git bridges. The only real differences are in UI, culture around usage, and the tooling ecosystem. On both UI and culture, mercurial wins imho, but the ecosystem is much much bigger for git. Today, the ecosystem for mercurial just got that much worse sadly…

                                                                                  1. 3

                                                                                    I’m happy to believe hg is better. That’s not a big ask.

                                                                                    I already have to know git in order to function in my part of the industry. HG is another, also-imperfect thing I’d have to learn.

                                                                                  2. 1

                                                                                    Just to add some more detail to the discussion about branches: hg’s branches are indeed more heavyweight and permanent than git’s. If you want git style branches, hg’s bookmarks are what you’re looking for.

                                                                                    The last time I looked Bitbucket still didn’t support creating PRs from hg bookmarks, which was annoying for a feature branch/PR style workflow. Their focus had been away from hg support for several years.

                                                                                  3. 4

                                                                                    That’s unexpected.

                                                                                    1. 4

                                                                                      Only if you haven’t been paying attention, the writing has been on the wall for a while now.

                                                                                    2. 6

                                                                                      I don’t really see this as being that big of a problem.

                                                                                      All things considered I think Git is the better of the two. From the other thread I discovered Mercurial doesn’t use an index, so that’s a dealbreaker for me. Git certainly has its problems, I am not sure if you’d call it feature bloat, but Git is big. This is likely from it being written by Linus, combined with the fact that Junio essentially worked on it full time for many years.

                                                                                      Also it’s not ideal that Git is written in C, Python, Perl and Shell. I would prefer something like Fossil that has an easy single executable.

                                                                                      Both Git and Mercurial are open source, so if anyone doesn’t like Git enough they can make their own. Granted without funding from Microsoft or similar it likely won’t enjoy the wide adoption of Git, but people can use what works for them. Just like Linus made Git, people can create a new version control, or fork Git to what works for them.

                                                                                      1. 14

                                                                                        From the other thread I discovered Mercurial doesn’t use an index, so that’s a dealbreaker for me.

                                                                                        You don’t need an index. A commit is the same thing. Rewriting the index with git add -p or similar is the same as rewriting a (possibly secret) commit with hg amend --interactive.

                                                                                        If you really want an extra step to make you feel safe, you can change your default phase to secret so all of your newly-created commits are secret and unshareable until you manually move them to the draft phase so you can share them.

                                                                                        The index is really just that: an unshareable intermediate location to store work-in-progress changes. With commits in the secret phase, you can even have multiple indices if you wish.
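
                                                                                        For anyone who wants to try the secret-by-default setup described above, a minimal sketch of the configuration follows. It assumes a per-user `~/.hgrc`; the same section should also work in a repository’s own `.hg/hgrc`.

                                                                                        ```ini
                                                                                        # ~/.hgrc (assumed location; a per-repository .hg/hgrc is an alternative)
                                                                                        [phases]
                                                                                        # New commits start in the secret phase, so they are not pushed or pulled.
                                                                                        new-commit = secret
                                                                                        ```

                                                                                        When a commit is ready to share, something like `hg phase --draft -r .` moves the current commit to the draft phase.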

                                                                                        1. 2

                                                                                          You don’t need a secret commit. An index is the same thing. Rewriting a (possibly secret) commit with hg amend --interactive or similar is the same as rewriting the index with git add -p.

                                                                                          etc. etc. Git stashes are a way to save multiple indices, too.

                                                                                          1. 6

                                                                                            The UI problem with the index is that it’s a separate location with a separate UI that confuses users. Where is my code, they ask. Well, your code could be in the working directory, in the staging area, or in the commit. The commands to move it in and out of each of those locations are all different, and the various commands to show you the status of your code and your repo are also different depending on which of those three locations your code could be in.

                                                                                            The index/staging area/cache introduces a new concept and new commands. The point is you don’t need any of those. You can do just fine with commits, reducing from 3 to 2 the number of possible locations of your code, as well as getting rid of all of the commands to manipulate this extra third intermediate location.

                                                                                            The index is multiplying entities beyond necessity.

                                                                                        2. 2

                                                                                          Despite the headline, the article wasn’t at all critical of git. It was more like a quick overview of how git won in the end despite some arguably better features in other DVCSs.

                                                                                          I agree it’s not a problem; git is a good enough system, and it really rewards users for taking the time to learn its quirks. At the same time I hope pijul gets a better name and some followers. It looks interesting.

                                                                                        3. 3

                                                                                          Mercurial support going away in Bitbucket is not surprising. It’s not exactly the first time a technology has beaten arguably technically better tools, and it won’t be the last.

                                                                                          1. 3

                                                                                            This is why you should really consider hosting your own, if at all possible. In a post from a few years ago I listed several code hosting sites which shut down. Even though it might seem inconceivable now, it is possible that, for example, GitHub might go down (all that requires is for a better competitor to show up, or for them to start doing user-hostile things like bundling adware, as SourceForge did). Remember, hosting all the open source software projects for free actively costs them money.

                                                                                            1. 2

                                                                                              Any suggestion for a Mercurial hosting service?

                                                                                              1. 1

                                                                                                So let me get this straight: Atlassian treat Mercurial as the proverbial ginger child in the family for years, and then act all fucking surprised when fewer people use it on their service?

                                                                                                COLOUR ME FUCKING SHOCKED.

                                                                                                1. 1

                                                                                                  While this is sad, I suppose it’s good news for Fossil.

                                                                                                  1. 1

                                                                                                    BTW, nginx is still using Mercurial for version control. Not exactly sure why; I think they moved from svn to hg when Git had already won the war.

                                                                                                    Mozilla is another big user of hg; they’ve decommissioned cvs, bzr and git.

                                                                                                    I think this will be kinda the end of Bitbucket.

                                                                                                    1. 0

                                                                                                      Can someone explain to me why there are so many “blockchain” startups when instead these should just be using git under the hood?