1. 65
  1.  

  2. 27

    Ah, there it is. I was wondering when this would happen.

    Facebook used to be involved with the Mercurial community, but it was difficult to work with them. They always wanted to do things their way, had their own intentions, and started to demand that the Mercurial project work the way that Facebook wanted. For example, they demanded that we start using Phabricator and started slowly removing sequential revisions from Mercurial in favour of always using node hashes everywhere, arguing that for their gigantic repos, sequential revisions were so big as to be useless.

    Eventually the disagreements were too great, and Facebook just stopped publicly talking about Mercurial.

    I figured they would emerge a few years later with their fork of it. They love doing this. HipHop VM for PHP, Apache Hive, MyRocks; these are examples of Facebook forking off their development in private and then later emerging with something they built on top of it.

    The Mercurial project is surprisingly still chugging along, and there are still those of us who actually use Mercurial. I doubt I’ll switch over to Sapling, because I disagreed with the things that made Facebook fork off in the first place. But if others like Sapling and this manages to put the slightest dent into the git monoculture, I’m happy for the change and innovation. I really hope that git is not the final word in version control. I want to see more ideas be spread and that people can see that there can be a world beyond git.

    P. S. Absorb is fantastic and one of Jun Wu’s best contributions to Mercurial (and therefore to Sapling). I want everyone to know about this tool, it’s amazing. I had fun trying to come up with a name for the feature, thanks for the help.

    https://lobste.rs/s/nws1uj/help_us_name_new_mercurial_featur

    1. 35

      As someone who was very active in the Mercurial community at the same time as you, I have a more charitable take on matters.

      The Facebook people were generally pleasant to work with and seemed to be excited to contribute to open source. They were also in an uncomfortable position of having to navigate contributing to a project not under their full control and hitting deadlines to prevent VCS at Facebook from hitting a figurative wall and undermining org-wide productivity. They were answering to two demanding bosses.

      In hindsight, I’m surprised how long Facebook was able to contribute to Mercurial before they felt they needed to go their own way. They really made an effort and a lot of people at Facebook stuck out their necks to try to make open source Mercurial work. Mercurial is a far better project now because of a lot of work contributed by Facebook.

      Rather than upset that Facebook is publishing a logical fork of Mercurial, I feel validated. I see Mercurial’s DNA all over Sapling. They made breaking changes and went into directions that Mercurial wasn’t comfortable with. And I think the result is generally a better polished VCS out-of-the-box.

      Totally agree with you on wanting to disrupt Git. The UI of the tool is too mediocre for its popularity. The world deserves better. But the path to disruption will need to go through the Git wire protocol in order to compete with the Git{Hub,Lab}s social effects. If you can build a superior client that quacks the same to the server, you can slowly disrupt Git’s stronghold on the average developer. And Sapling is a giant step in that direction.

      1. 3

        I kinda agree with you on the second part, but I guess for different reasons. (Can’t say anything about the first part, I’ve contributed to neither).

        But as someone who slightly preferred git over mercurial (back when the winner wasn’t decided yet, I’d say I was 60% git, 40% hg) the amount of bickering between the 2 parties I’ve seen wasn’t fun. Most of us just rolled with the one that won.

        1. 1

          What a great comment, and like you I think that Git’s UI is at odds with its popularity. Yet Git is now ubiquitous, and I believe we’re at a point where compatibility with its storage and protocol needs to be a requirement for anything that hopes to replace it with the same level of popularity and adoption. Hg-git sort of did it for me, up until I could not clone repos without using a very specific version of the module, due to a change/bug in the remapping of commit hashes between Git and Hg. A Mercurial UI with a Git-compatible backend seems like the best pragmatic solution.

        2. 4

          Fun fact: there’s a Git port of hg absorb: https://github.com/tummychow/git-absorb

          1. 6

            It’s worth noting that Sapling’s absorb will work a little better around hunk boundaries due to its use of interleaved deltas. See https://sapling-scm.com/docs/internals/linelog and https://github.com/martinvonz/jj/issues/170
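To make the basic idea behind absorb concrete, here is a minimal Python sketch. It assumes a toy model in which every line of a file simply remembers which commit in the stack last touched it. The names `absorb`, `blame`, and `edits` are hypothetical, and this is not how Sapling's linelog or interleaved deltas work; it only illustrates routing each edited line back to the commit that introduced it.

```python
from collections import defaultdict


def absorb(blame, edits):
    """Route each edited line to the commit that last touched it.

    blame: list of commit ids; blame[i] is the commit that last changed line i.
    edits: dict mapping line index -> new text for that line.
    Returns a dict mapping commit id -> {line index: new text}.
    """
    fixups = defaultdict(dict)
    for line_no, new_text in sorted(edits.items()):
        commit = blame[line_no]
        fixups[commit][line_no] = new_text
    return dict(fixups)


# Toy stack of three commits (A -> B -> C) that built up a four-line file.
blame = ["A", "A", "B", "C"]
# Uncommitted working-copy edits to lines 1 and 3.
edits = {1: "fixed typo", 3: "renamed variable"}

result = absorb(blame, edits)
print(result)
```

A per-line lookup like this has no good answer for edits that sit on the boundary between two commits' lines, which is the case the comment above says interleaved deltas handle better.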

            1. 1

              The first link is broken but this sounds very cool!

              1. 2

                Thanks, fixed.

          2. 1

            Git is definitely not the last word any more than Unix is the last word in operating systems. However, I suspect it’s quite good on a lot of dimensions for a lot of people, so alternatives are going to seem marginal for a very long time.

          3. 18

            This client is so good internally. I’m happy to see it opened up so I can use it after I quit :D

            1. 2

              Any idea how it compares to Mercurial Evolution? Evolution allows some very useful workflows.

              1. 2

                The Sapling front-end is derived from Mercurial and it implements commit evolution directly. You can use sl restack instead of hg evolve, for example. I suspect that distributed evolution isn’t supported in Sapling, since obsmarkers, etc. scale poorly.

                1. 0

                  Certainly warring evolutions are an excellent way of screwing up your life…. It’s in my “Don’t Do That” list…

                  However, it happens, and any assistance in cleaning up or preventing the result is needed.

                2. 1

                  Sorry, I’ve not used Mercurial Evolution.

              2. 18

                Too bad the binary name conflicts with another essential Unix tool

                1. 2

                  Which also probably isn’t fun if you actually do mistype.

                  1. 2

                    Yeah, the nerve on these FB people. Seriously, it should have been “sg”, because “sl” is also used by PowerShell in Windows.

                    1. 16

                      No, it should obviously be “sap”, which is short enough and also the root (!) of “sapling”.

                      1. 3

                        It also speaks volumes about their culture, completely absent of Unix hacking tradition.

                        One never knows, I might be wrong here, but I don’t see this sapling ever becoming a full grown tree. Successful, universally accepted projects have succeeded because they nail a specific solution to a well understood problem and offer a clear, direct, obvious benefit to the user from day one. I don’t think this is the case here.

                    2. 15

                      My current understanding of the VCS-I-might-consider-using landscape.

                      • git is the current default / dominant tool. It was developed by Linus Torvalds and exploded in popularity thanks to the proprietary hosting platform Github. Everyone uses it. It has a fairly bad user interface, but a good internal model. Microsoft and Google have spent effort making git scale to their humongous repositories, with mixed success.
                      • Mercurial is another tool of the same age as git, it lost the popularity contest despite having sensibly better usability. Few people are using it today due to the dominance of public git forges. Facebook/Meta has spent effort making Mercurial scale to their humongous repositories, with mixed success.
                      • It is easy to extend git with externally-implemented subcommands, so there are a lot of third-party subcommands that are trying to solve some of git’s usability problems (rerere, absorb, etc.). Most of those commands have a very small userbase, because most people stick to the standard commands or use a non-command-line UI that exposes operations differently (for example magit).
                      • Sapling is an in-house tool at Meta, that they finally decided to release as open source. It started as a fork of Mercurial after the Facebook developers failed to convince Mercurial people of some of their changes. It improves performance on massive monorepos, but also provides many usability-oriented features. It claims to be compatible with git – presumably it is possible to use sapling on a standard git repository?
                      • git-branchless, developed by u/arxanas, is inspired by Sapling (it has existed since before the public release of Sapling, so presumably the author worked at Facebook at the time). It provides both performance improvements, sharing some code with Sapling, and similar usability improvements presented as git subcommands. (I don’t understand why/how a single developer remains motivated to implement monorepo-style features that mostly make sense for large, wealthy companies. Possibly just for the fun of the cool algorithms?)
                      • jj / Jujutsu is an alternate version-control system that also plays nicely with git repositories, also developed by one person. It provides usability enhancements that are related to but different from the sapling/branchless ones, and in particular a cool way to handle merge conflicts. It does not seem particularly focused on performance for big repositories, but comments indicate that the jj people are looking at the sapling/branchless features to import them.
                      • pijul is a completely separate version-control system that is, basically, a better darcs. (darcs was a cool version system that lost the popularity contest much more quickly than mercurial, written in Haskell by a physicist and with a nice patch-oriented concept and an exponential merge algorithm that was the horror in practice). In theory pijul is great and has the best theory. The main developer uses Rust, is obsessed with both performance and good designs. Probably the most different from all others in this list (patch-based is fundamentally different from state-based as git and Mercurial are), with fairly different core principles and UI.

                      I see three different use-cases for myself:

                      • If I want to use the standard thing to be able to share my workflow with others easily, I will stick to git. I might want to try some advanced subcommands for complex cases (I’ve used eg. git filter-repo a couple times, and once I also temporarily used rerere, or another tool for rebasing, probably absorb.)

                      • Suppose I want to work on a git repository, but use the best tool available in terms of user experience / user interface. Should I use Sapling, branchless or jj in git mode? I’m not sure, feedback welcome.

                      • If I want to just use the best, most interesting/promising tool, and I don’t care about compatibility with other people, pijul is probably worth a try. In practice I don’t spend much time radically changing my tools these days, so I would probably not do it. (At least it competes with time experimenting with: keyboards, text editors, operating systems, etc.)

                      1. 7
                        Why git-branchless?

                        I don’t understand why/how a single developer remains motivated to implement monorepo-style features that mostly make sense for large, wealthy companies.

                        I posted about the future of git-branchless here: https://github.com/arxanas/git-branchless/discussions/654

                        I’ll also include some comparisons to Jujutsu while I’m listing features, since I’m familiar.

                        Reason #1: a lot of the client-side features are simply more effective for a patch-stack workflow, irrespective of scaling to monorepos. For example:

                        • Git has poor/no support for anonymous branching. (Jujutsu ✅)
                        • Git has poor/no support for in-memory operations. (Jujutsu ✅)
                        • Git has no support for “sparse” graph logs. If you render git log --graph, it must include a complete topology starting from some set of branches. (Jujutsu ✅)
                        • Git has no way to rebase all descendant branches if you amend an ancestor commit. (Jujutsu ✅)
                          • Most recently, in Git v2.38, you can use git rebase --update-refs, but this only works when you’re at the tip/descendant-most commit of a branch, and it simply doesn’t handle non-linear descendant structures.
                        • git reflog is not a complete solution to undo, and can’t undo many kinds of operations. (Jujutsu ✅)
                        • There are some complicated rebases which can be expressed in git-branchless, but not other tools. (Jujutsu ❌, but improvements here are planned.)
                          • git move --exact is more flexible than any other tool I’m aware of.
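To illustrate why amending an ancestor forces every descendant branch to move, here is a minimal Python sketch. It assumes a toy model in which a commit id is derived from its parent's id and its content; this is only loosely analogous to Git's real object format, and the helpers `commit_id` and `build` and the branch names are hypothetical.

```python
import hashlib


def commit_id(parent_id, content):
    """Derive an id from the parent id and content, loosely like a VCS hash."""
    data = f"{parent_id}:{content}".encode()
    return hashlib.sha1(data).hexdigest()[:8]


def build(tree, parent_id="root"):
    """Assign ids to a nested (content, children) tree; return {content: id}."""
    content, children = tree
    cid = commit_id(parent_id, content)
    ids = {content: cid}
    for child in children:
        ids.update(build(child, cid))
    return ids


# Non-linear stack: a -> b -> c, plus a sibling a -> d.
before = build(("a", [("b", [("c", [])]), ("d", [])]))
# Amending "a" changes its id, so every descendant is rewritten too.
after = build(("a (amended)", [("b", [("c", [])]), ("d", [])]))

# Branches are just names pointing at commits.
branch_targets = {"feature-c": "c", "feature-d": "d"}
old_refs = {name: before[c] for name, c in branch_targets.items()}
new_refs = {name: after[c] for name, c in branch_targets.items()}

# Every branch below the amended commit now points at a stale id.
stale = [name for name in branch_targets if old_refs[name] != new_refs[name]]
print(stale)
```

Both branches end up stale even though neither was checked out, which is the non-linear case described in the list above.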

                        Reason #2: to improve on some of the Mercurial/Sapling workflows. For example:

                        • No git-branchless operations start merge conflict resolution unless you explicitly pass --merge. I was constantly getting disoriented when I unexpectedly got launched into merge conflict resolution. (Jujutsu ✅: advances here by simply storing the conflicts in commits, so rebases always succeed!)
                        • You can interactively/fuzzy-find and switch to commits. (Jujutsu ❌: no equivalent for now, but it’s not that hard to implement.)
                        • git sync will let you opportunistically rebase local work that merges cleanly, but leave local work in place which doesn’t. (Jujutsu ❓: rebases always succeed — whether you would like the rebases to succeed if they don’t merge cleanly is a matter of preference. I personally don’t!)
                        • Many operations will take working copy snapshots, which lets you undo/restore even uncommitted changes. (Jujutsu ✅: elevates working copy commits to a first-class concept.)
                        • The smartlog can’t be rendered upside-down in Sapling 🙃. (Jujutsu ✅ — thanks to their implementation of my request!)

                        Reason #3: we use Git monorepos anyways.

                        After Facebook, I went to work at Twitter, which uses a Git monorepo. I have a lot of other users using large open-source Git monorepos, such as LLVM, Chromium, and Linux, or other company internal Git monorepos (Stripe, Uber, and many more).

                        Sapling vs Jujutsu vs git-branchless

                        I think Jujutsu has the cleanest set of VCS concepts overall. Unfortunately, it doesn’t scale to large repositories yet. For example, scanning the working copy can take a while. I started some work on fsmonitor support here: https://github.com/martinvonz/jj/pull/362. We can also expect the situation to improve in the future, especially since it can literally reuse the libraries published in Sapling, such as the segmented changelog. (git-branchless already uses that segmented changelog library!)

                        git-branchless will scale better than Jujutsu for now on the client-side if your repository is large. It’s probably the easiest to set up, since you’re expected to use Git commands and you don’t have to immediately learn anything new. It also has a fairly small number of features which Jujutsu doesn’t yet have (see above and the linked post.)

                        Sapling cannot co-locate with Git repositories for now, which is a problem for me in practice. (See the linked post.)

                        1. 2

                          Thanks! Great answer, great issue. I’m impressed by your decision actually, it must not be easy at all to decide to stop focusing on your very own toy and go contribute to another instead. My take-away is that for my needs (that do not include large mono-repos), jj is probably the tool I want to try for “a better-design tool to work with git repositories”.

                        2. 1

                          rerere is bundled with git, but I’m interested if there’s some way to find (the most popular/‘best’) external tools for git.

                          1. 1

                            I somehow assumed that it started as an external command and was upstreamed later, but you are absolutely right. Out of curiosity I looked for the rerere introduction in the git codebase: 8389b52b2a51d5b110b508cc67f0f41f99c30d3f by Junio C. Hamano, February 2006, first released in git 1.2.0. Pretty much a core tool indeed.

                        3. 5

                          The killer feature might even just be the smartlog web UI/VS Code extension. If it’s like the one I remember, it offers workflows which are way more effective than any Git UI implementation I’m aware of (drag-and-drop rebase; multi-way commit split; etc.).

                          1. 2

                            Have you tried Fork?

                            1. 3

                              Yep, I’ve tried most Git UI clients, including Fork. As far as I can tell, it doesn’t have commit splitting? It can stage partial changes, but it doesn’t look like it can split commits that already exist, especially ones that aren’t checked out.

                              That being said, it looks like the Interactive Smartlog offered in Sapling is a clean-room implementation of their internal one and is missing many of the features, including the commit split UI ☹️.

                              1. 1

                                Are we talking something like git-revise with a GUI? That’s a tool for rewriting history without checking it out (including splitting, squashing and reordering), but the interface is like interactive rebase. Its killer feature is no staging/index statefulness or touching the workspace. Which is fairly unique as far as I can tell, but I’ve wanted a quicker drag-and-drop interface.

                                1. 2

                                  Yep. I suspect git-revise is inspired by the same Mercurial workflows which supported in-memory rebases. The author works/ed at Mozilla, where they used a Mercurial monorepo. The git-revise split UI is the same as git add -p if I understand correctly, which I find so difficult to use for my splitting workflows as to be unusable in practice.

                                  1. 1

                                    Interesting analysis.

                                    Yes, what you call the unusable interface is the crux of the matter to me. It’s in dire need of replacement, yet is irreplaceable (at least, has been for me so far) for lack of alternatives. Despite its flaws and impracticality (making you edit large diff hunks that the tool refused to let you split is indeed a very impractical drag-and-drop interface, it makes you solve each conflict twice, and editing an already split hunk is a trap because it will fail to apply), it really lets you do everything and is invaluable for sufficiently hard untangling jobs.

                          2. 4

                            This is the triumph of the slowly rising tide of open source.

                            The lifecycle is clear. A freely available program (git) gets effectively captured by a company built to support the community (GitHub). Company purchased by a larger company (Microsoft) where updates are not in its best interest. Other large companies (e.g., Facebook), usually competitors, create a fork for internal use. For whatever reason, the fork is later released to the public. The cycle of life continues.

                            1. 13

                              The fork is actually of Hg, and it exists because companies of Facebook’s size, with a tech stack at that scale, end up needing a particular type of tooling. Google has something similar to this internally as well that hasn’t been open sourced.

                              This particular thing is not core to the company’s value proposition, despite being necessary tooling, so offering it as open source has almost no downside.

                              1. 9

                                Company purchased by a larger company (Microsoft) where updates are not in its best interest

                                GitHub has had lots of updates since the Microsoft acquisition, no?

                                1. 2

                                  Yes, such as adding achievements and a proprietary service made using open-source code.

                                  1. 1

                                    It has had many. For example, it provides more barriers to new contributors by refusing to run continuous integration scripts on pull requests. It provides better integration for VSCode with authentication methods that cause more work for new contributors. Everyday brings a new elaborate system to replace a previous step.

                                    Lots of updates.

                                2. 5

                                  A reminder that Facebook has tampered in elections, run psychological experiments on its users without their consent, and is being investigated by the U.N. for their role in genocide in Myanmar.

                                  Everything they release should be considered fruit of the poisonous tree.

                                  1. 1

                                    This seems eerily similar to git-branchless, though there’s not a single mention of it. I struggle to find this a coincidence, though I also don’t want to assume anything. The similarity I see is in:

                                    Maybe it’s just a model that’s so obvious the same things have been built… Or perhaps these all existed in Mercurial first, and branchless copied that, as did Sapling?

                                    Edit: Aha! I found the author on the Orange Site, and they mentioned that git-branchless is in fact inspired by Sapling: https://news.ycombinator.com/item?id=33614255

                                    1. 7

                                      Yep, there’s nothing nefarious here 🙂. I used to work at Facebook, and exfiltrated the workflow after I left and had to use Git for my next job. The README of git-branchless used to mention that it was similar to Mercurial workflows at Facebook and Google, although it looks like I’ve since removed that line in an attempt to appeal to developers who didn’t work at BigTech.

                                      Fun fact: I did not work on source control at Facebook; I just really liked the workflow. Prior to Facebook, I had naturally tried to do workflows like stacked branches in Git but found them difficult, and immediately became enamored with the Facebook+Mercurial workflows. Then I knew that I would miss them if I had to leave!

                                      There is a bit of complicated ancestry:

                                      • Mercurial was used by Facebook and Google, at least as a front-end. The interfaces are very similar, although there are a few differences, such as hg sl vs hg xl and hg restack vs hg evolve. I don’t know if there’s anything substantially different.
                                        • Note that stock Mercurial calls the command hg evolve, since the feature is called “commit evolution”, so maybe Google’s Mercurial is less diverged?
                                        • Google’s Mercurial front-end is called “Fig”. I’m not aware of a specific name for Facebook’s.
                                      • Facebook started reimplementing the Mercurial backend, a project called Mononoke. It’s source-available in the Sapling repo. In combination with other components, such as a virtual filesystem called EdenFS, the whole project eventually became a centralized SCM called EdenSCM.
                                      • Before the release of Sapling, they renamed it from Eden because it was confusing to have both EdenSCM and EdenFS.
                                      • I wasn’t aware at any point that they planned to have a Git front-end, so I don’t know how long that’s been in progress.

                                      Also see my post with respect to Sapling here: https://github.com/arxanas/git-branchless/discussions/654

                                      1. 2

                                        A lot of these efforts are probably more intertwined than it seems. The source control world is very small and people generally know each other, show each other ideas, etc. branchless is inspired by Sapling, a lot of what Sapling does is inspired by Mercurial, some of the more advanced work (virtual file system) is heavily inspired by Google’s internal source control system, and some is novel (absorb, segmented changelog), etc. Similarly, JJ is arguably inspired by Sapling. I think it’s more a big pile of ideas that a small set of people in the source control community exchange and adapt for their specific needs.

                                      2. 1

                                        Some of the ideas here are interesting, if not super relevant to most companies. The focus on commits vs branches is certainly a different mental model. I like some of the editing options for dealing with a bunch of commits with stack. absorb and fold make a lot of sense to me and I also like backout as a way of saying “I made a public commit and I need to undo it”, which feels easier to understand than the git model. The primary advantage over git really seems to be there, with a much easier to understand “please undo the thing I did”.

                                        The part I’m not really sold on is the rebase vs merge mentality. Obviously if I worked at a place with tens of thousands of developers I would likely feel differently and I have worked on repos with enough traffic that I’m constantly running git pull --rebase --autostash while working on my branch, but I guess I’m missing what is the massive advantage for an average developer. I write my branch, then make a PR/MR for review/approval with my test output. The branch gets merged in.

                                        I certainly understand if your org really gets value out of the cleaner logs (which smartlog seems to do but I need to experiment more, first pass was unimpressive). But it seems like a weird choice to keep the basic concept of PRs with ReviewStack but lose the branch model? I think I’m still missing “how do I make a PR on Sapling where the files in question aren’t local”?

                                        I guess you follow the mercurial strategy of push to a bookmark and then rebase on head.

                                        1. 2

                                          Merges are a perfectly fine model for most companies and projects.

                                          The problem with merges at the scale of FB & Co is that you introduce a much higher commit rate. Commit rate itself is already a scalability problem at that scale, so using rebase significantly reduces the number of commits and the amount of history traversal. In addition, it allows you to make assumptions about the shape of the history, which can facilitate release management, binary caches, etc. (e.g. serial release numbers for your trunk builds). On the server side, the internal Mercurial server can also do a server-side rebase in the absence of conflicts, which removes the need for pull + rebase flows. In the end, one way to look at it is as having the benefits of a centralized system like svn for an organisation, with a local flow like a DVCS, which is fundamentally what I believe Sapling is.
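The server-side rebase and serial trunk numbering described above can be sketched in Python as follows. This is a toy model under stated assumptions: the `Trunk` class, its `land` method, and the rule that a "conflict" means two changes touched the same file are all hypothetical simplifications, not how the internal Mercurial server actually decides.

```python
class Conflict(Exception):
    """Raised when a change cannot be rebased automatically."""


class Trunk:
    """Toy linear trunk where each landed change gets a serial number."""

    def __init__(self):
        self.landed = []  # list of (description, set of touched files)

    def land(self, description, files, base):
        """Land a change that was written on top of serial number `base`.

        The change is rebased onto the tip when no commit landed since
        `base` touches the same files; otherwise the author must rebase.
        """
        for other, other_files in self.landed[base:]:
            if other_files & files:
                raise Conflict(f"{description!r} overlaps with {other!r}")
        self.landed.append((description, set(files)))
        return len(self.landed)  # serial number of the new trunk commit


trunk = Trunk()
first = trunk.land("add parser", {"parser.py"}, base=0)
# Written against an empty trunk, but touches different files: lands cleanly.
second = trunk.land("add docs", {"README"}, base=0)
try:
    trunk.land("rewrite parser", {"parser.py"}, base=0)
    conflicted = False
except Conflict:
    conflicted = True

print(first, second, conflicted)
```

Because the history stays linear, the position of a commit on the trunk can double as a release number, and clients whose changes do not conflict never need to pull and rebase before landing.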

                                          1. 2

                                            But it seems like a weird choice to keep the basic concept of PRs with ReviewStack but lose the branch model?

                                            I think this is more of a concession that practically everyone uses GitHub publicly, and GitHub uses PRs, which don’t work well with stacked diffs, so they tried to wrap it with a tool to make it better. Other tools, like Phabricator, Gerrit, etc. handle stacked diffs better. There are also tools like Graphite which try to improve stacked diffs on GitHub.

                                          2. 1

                                            After discovering git-machete I don’t see the reason to use anything else (except Pijul when it has larger adoption)

                                            1. 1

                                              I tried to build it but ran out of disk after 30 minutes of building and > 2 GB of artifacts. I guess it’s not meant for individuals on old laptops maybe. :)

                                              Will try again when there’s a binary package.

                                              1. 1

                                                This is dang cool, but IMO the untapped win here is (a) a stacked-commit SCM (which this is, yay!) with a concurrently developed (b) self-hostable, repo-host-agnostic code review tool built for stacked commits. So open source ReviewStack and you’ve got a stew going!