I’ve come to the conclusion that I’ll not devote time to learning any other version control system, probably ever.
I just don’t have the same issues with git that some people seem to have. I learned the git model long ago and I can get out of any real trouble without really thinking about it. The command line experience for me is just muscle memory at this point, I don’t really care that the commands could be streamlined.
I gave Jujutsu a good once-over, read @steveklabnik’s tutorial and everything, played with it for a bit, but after some time I was just left wondering “why am I doing this?”. It’d just take me much longer to learn jj than any time it’d save me for the rest of my life.
Maybe I’ll be forced to eat my words one day. I applaud the idea of innovating in this space, however. It seems the newer generation has plenty of trouble with git, so something that helps them is useful.
I think git’s probably the local optimum for the next 10-20 years, so I think you’re making a safe bet.
All these experiments suggest to me that people are feeling genuine friction with git though. I still think a “theory of patches” (perhaps related to the one from Darcs) could motivate a new version control system that permits new uses, enables users to get further with less work and fear, and maybe avoids most of the pitfalls. I don’t think we’re there yet though.
These experiments also tell me that people agree that git has a very good internal model and a very clumsy interface. My own experience of git is like yours. It isn’t easy, but I’m used to it. The replacement will have to make a material improvement day-to-day, larger than a nicer interface. It is, today, really hard for me to imagine feeling the kind of leap that was subversion to git (or mercurial) emerging from a git-to-new-porcelain transition.
As I’ve said multiple times, here and elsewhere, I’m a git fanboy. I personally don’t understand any of these efforts to revamp git’s UI. But I’m a grumpy old man who thinks craftspeople should learn to master their tools when those tools are ubiquitous and good enough. So don’t listen to me :).
However, even though I agree that the git internals are amazing and well designed, I have to object to the sophism “These experiments also tell me that people agree that git has a very good internal model.”
With this type of logic, I could also state “The effort put into the Wine project tells me that people agree that Windows has a very good graphics API and very good binary format.”
I don’t think that’s the case, whether it is for Wine or for jj and xit. I think all of these projects are targeting compatibility, trying to prevent flamewars and bikeshedding over which VCS to use for a specific project. I believe the idea behind using the git protocol is: you use xit, they use jj, I use git, and everybody is happy.
Although I agree with your objection to the logic, I personally find Git’s object model to be rather intuitive. The only sound complaint I’ve encountered recently, imo, is that Git is unable to store empty directories, but as far as my understanding goes, that could be represented as an empty tree object if Git were patched to support that representation.
Notwithstanding that, I do believe that the reason jj/xit/got use the same plumbing as Git is about compatibility.
Also, it’s not necessarily an issue with the newer generation. Young-ish programmers that I know, including me, are generally comfortable with Git, while it seems to be people used to Mercurial/SVN/CVS who have a hard time grokking it.
As I said, I’m a git fanboy. I do think that the git object model is great, but I also see issues. One of my biggest issues is that git is incapable of tracking renames and copies.
If I rename a file and largely modify that file in the same commit, git will think that I deleted a file called foo.txt, and committed a new different file (written from scratch) called bar.txt. If I want to merge modifications of foo.txt on the lines I didn’t touch, I will have merge conflicts.
Copy is another issue. If I copy foo.txt to bar.txt and modify bar.txt a little bit in one commit, the git object model will think that you just created a new file bar.txt from scratch. If you want to merge work that modified foo.txt before you copied it, the changes will not be propagated to bar.txt.
This is a very intentional decision by git, compared to other systems that tracked renames/copies with metadata (svn and hg, I think).
git is content-based, and the metadata is derived, rather than stored.
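To make “derived, rather than stored” concrete, here’s a minimal shell sketch (repo and file names are hypothetical): the commit records only two snapshots, and the rename is recomputed whenever a command asks for it.

    git init demo && cd demo
    echo "one" > foo.txt
    git add foo.txt && git commit -m "add foo.txt"
    git mv foo.txt bar.txt
    echo "two" >> bar.txt
    git commit -am "rename and extend"

    git show --stat              # reports "foo.txt => bar.txt" via similarity detection
    git log --follow -- bar.txt  # follows history across the rename, derived at query time
    git diff -C HEAD~1 HEAD      # -C additionally detects copies, which are off by default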
It is counterintuitive, but I think git’s choice is better, and I thought most VCS experts basically came around to that conclusion (?)
This rename and branch metadata issue was a big argument around 2007 when git was gaining popularity, but it seems to have been mostly settled …
I think one main point is that if you have a branch-heavy workflow, which turned out to be extremely useful in practice, then the most general solution is for renames to be derived from content. It just does a better job in practice.
Another underrated thing is that git repositories are extremely backward compatible. A “branch” is almost nothing to the git model, just as a “rename” is literally nothing. This makes A LOT of things work more smoothly – the core is more elegant and compatible.
If I want to merge modifications of foo.txt on the lines I didn’t touch, I will have merge conflicts.
I don’t personally run into that problem, and I rename all the time … I am not familiar with the details, but my guess is it’s because I commit very often.
And also there are like a half a dozen merge strategies in git, and I guess the experts on big projects use those (but I don’t)
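For reference, a few of git’s built-in strategies and per-strategy options (branch names hypothetical):

    git merge -s ort feature               # the default strategy on modern git
    git merge -s octopus topic-a topic-b   # more than two heads; refuses to proceed on conflicts
    git merge -s ours obsolete-branch      # keep our tree, but still record the merge
    git merge -X find-renames=40% feature  # loosen the rename-similarity threshold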
IME the alternatives always run into problems too … branches are cheap in git, but expensive in other systems – both in terms of time and cognitive overhead.
i.e. it doesn’t make sense to criticize the object model on theoretical grounds, when the systems that did the theoretically better thing actually behave worse in practice
I am not sure but Linus might have mentioned this at his ~2007 talk on git at Google, which was famously a bit abrasive (I was there):
https://www.youtube.com/watch?v=idLyobOhtO4
I think it’s one of those things where he had the contrary opinion at the time, was abrasive about it, and turned out to be right
Here is some rant I googled – I didn’t read the whole thing – but it is obviously extremely intentional and purposeful
https://gist.github.com/borekb/3a548596ffd27ad6d948854751756a08
And again, to change my mind, I would have to see a system that does better in practice … IME it’s a non-issue with git, and git does significantly better than its predecessors.
Most of the time, using git mv instead of mv works for me, although there are still a bunch of caveats. IMO this is a porcelain issue rather than an object model one.
Pretty naturally, people who have used the feature of reliably commit-attached branches have more issues with Git not allowing the reflog to be synced than people who were told no-real-branches is the only possibility and formed their mental model around this lack of functionality!
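For anyone unfamiliar with that complaint: the reflog records where your refs used to point, but nothing in git’s transport syncs it. A quick sketch:

    git reflog show main   # every position main has had in *this* clone
    git push origin main   # transfers commits and the ref, never the reflog
    # a fresh clone therefore starts with an essentially empty reflog --
    # that is the "unsyncable" local state being lamented above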
Yeah same here. To me, git just gets a lot of work done. It was a good 20%-50% permanent improvement over svn and hg in my mind, and that’s a huge win compounded over a decade.
I think many issues can be solved by simply committing more often, which I always did with every VCS. And the other issues I hit long ago and have workarounds for. Stack Overflow is invaluable here.
It’s funny to me that I don’t think I would be happy with git without Google/Stack Overflow, which is a sign of deficiency, but it’s still the best thing. (And yes I treat the command names and flags as opaque crap to memorize.)
I already spend too much time faffing with various tools, so for me git and vim are “set and forget”. On the other hand, I seem to have more problems with distros (Ubuntu was sort of ruined IMO, Debian is OK but can use improvement), and browsers (Firefox and Chromium both have issues).
Interesting, I used to think the same way. I considered myself a git power user and had no problems with it at all. I was very productive, comfortable with editing my history and could fix all my coworkers’ problems. Then jj clicked for me and I couldn’t go back.
I was still slightly stuck with git because it’s more performant in huge repos like the kernel. So I frantically searched for ways to get jj’s productivity and power in git. I finally found the config option rebase.rebaseMerges, which is the crucial missing piece to mimic the megamerge workflow of jj. Other config options are obviously essential too, but rebaseMerges is never really talked about online as far as I can tell.
It’s not at all perfect. git is still more manual work and more friction than jj. And some things still annoyingly don’t work at all. (Why can’t git do an octopus merge when there is a conflict?? jj can.)
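For anyone who wants to try the option mentioned above: it preserves merge commits, rather than flattening them, when you rebase a stack.

    git rebase --rebase-merges main              # keep the merge topology for this rebase
    git config --global rebase.rebaseMerges true # or make it the default for plain `git rebase`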
So git can definitely be usable and productive. But if a power user like me, with a big config polished over years, cannot get the same power that jj gives you out of the box and with an intuitive CLI… I definitely see jj as the future “default” VCS people will be using and recommending.
While git is certainly cumbersome from time to time, I’ve learned to live with it. But it never ceases to amaze me how many people just put up with GitHub. Pushing a branch, going to the web, clicking a button, filling out a form? Why isn’t pushing a branch enough? And then reviewing online in a GUI so slow it takes a second to open a side panel with the list of affected files, at least on large PRs. This all feels so backwards.
It’s possible to check out GitHub PRs locally…
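For example, GitHub exposes every PR head as a read-only ref (PR number 123 is a placeholder):

    git fetch origin pull/123/head:pr-123
    git switch pr-123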
Except that if you merge that ref (or the actual source ref) directly with Git, GitHub won’t detect the PR as being merged
What a headache
I have to agree, I have no problems with git.
All these alternative systems seem to come from people who want simpler tools and do not understand git.
Do you really think authors of these alternative systems do not understand git? People like @martinvonz?
On the one hand, it’s neat to see people willing to experiment in this space!
On the other, it’s hard for me personally to imagine using anything other than jj now.
The name seems unfortunate to me, as well. It’s pretty unappealing to me to use the word “zit” (where X often gets pronounced like a Z).
On the plus side, the X is not meant to be pronounced like in Xiaomi :^)
The pronunciation hint is no help, X followed by a vowel is always pronounced as in Xiaomi in my native language, so I can only read it as shit, it’s beyond automatic =P
Yeah, I’d have thought pronouncing it like ‘exit’ would have made more sense.
I’d be cautiously curious about the idea, but it even says:
I was looking in xit’s docs for a comparison to Jujutsu – specifically, why a whole new version control system is needed instead of a contribution to Jujutsu. The only direct mentions I found were in xit/docs/compat.md, which is about compatibility with Git:
[Jujutsu’s] approach to compatibility, however, is very different than xit’s. Jujutsu attains git compatibility by using the same on-disk repo format as git. This gives it a few advantages: […]
The repo format used by xit is completely different. It creates a .xit directory at the root of your project, and its internals have nothing in common with the .git directory you are used to.
I believe that a new on-disk repo format is critical to fixing many of git’s limitations. In particular, I think better merging and better large file support can’t be attained without moving to a new repo format. […] With this in mind, xit achieves git compatibility at the network layer instead of the storage layer.
This comparison fails to acknowledge that Jujutsu already supports the concept of multiple backends. Jujutsu’s Git backend is its better-supported one; “the native backend is used for testing purposes only”. Neither of Jujutsu’s current backends provides xit’s better large-file support via CDC, but Jujutsu’s roadmap already mentions the idea of implementing that technique. So Jujutsu will probably eventually also support usage with Git network compatibility but a different on-disk repo format.
xit’s goal of “better merging” seems to be a bigger difference. That same xit document mentions that Jujutsu uses the same merge algorithm as Git: three-way merge. The document says that xit, on the other hand, wants to “reduce the number of merge conflicts that occur in the first place”. It links to xit/docs/patch.md, which has a comparison of snapshot-based and patch-based version control:
In xit, there is a history of commits that closely mirrors git. Additionally, it computes patches for all changes to text files, which it uses when merging or cherry-picking. In this way, xit gets the primary benefit of patch-based systems (better merges), while using snapshots for everything else.
Jujutsu, like Git, is purely snapshot-based. I don’t see any mention of patch-based merging in Jujutsu’s docs or roadmap.
Merge conflicts depend a lot on your workflow. A good workflow can eliminate them.
Git is not purely snapshot-based. It very frequently calculates patches (like when you run git log) and uses them for patch-based shuffling of changes, as in cherry-pick, rebase, format-patch, am, ….
The workflow I like is to have a short feature branch for a merge request that is based very close to the head of the main branch and typically rebased onto the head before it gets merged. Then if there are any conflicts, they appear when applying the relevant patch, not when creating the merge commit.
Merges should always be conflict-free, so the three-way merge algorithm is basically irrelevant.
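A sketch of that workflow in plain git (branch names hypothetical):

    git switch feature
    git fetch origin
    git rebase origin/main        # conflicts, if any, surface here, patch by patch
    git push --force-with-lease   # update the merge-request branch
    # after review, the merge itself is trivial:
    git switch main && git merge --ff-only feature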
Jujutsu, like Git, is purely snapshot-based. I don’t see any mention of patch-based merging in Jujutsu’s docs or roadmap.
The jj copy-tracing proposal involves adopting certain patch-like semantics at the repo/file-level, although not the line/byte level. It’s interesting because it also requires keeping additional copy-tracing metadata alongside what you would expect from the Git object model.
When it comes to better merge algorithms, I’m more excited about things like mergiraf. Whatever small benefits an improved textual approach might bring, I expect that to pale in comparison to a merge based on an AST. Mergiraf is in early development as far as I can tell, but the idea is solid. I think it’s just a matter of time until a textual approach is basically irrelevant for day-to-day work.
Nice summary, thanks!
Unrelated to the project itself, but is the practice of putting each character of the page in its own span a technique for foiling AIs trying to crawl that content?
Oh so that’s why reader mode doesn’t work even when I force-enable it…
I guess it’s to help keep all the letters in a regular grid to get that terminal look.
It confuses my screen reader.
Oh, I was wondering why there were spaces right of the text on all lines, this is even worse
Universal undo is good, but there is a use case where you need to not have it: secrets committed to the repo that need to be expunged.
Yes - you should rotate secrets if they are made public etc. However, if you have a private repo you are sweeping before making public, it can be convenient to be able to forcibly edit history.
(A reasonable person might say, “if you can’t rotate secrets, then that is your bigger problem and you should fix that”. I kind of agree, but I can also imagine secrets which aren’t under my control (e.g. API keys (or - heaven forfend - user+pw) for a 3rd party service my code consumes) which are more challenging to rotate. I think the world needs a “pretend this didn’t happen, but otherwise preserve history” feature in VCSs).
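For reference, the closest git gets today is wholesale history rewriting with git-filter-repo, followed by a force-push and fresh clones for everyone (the path below is hypothetical):

    git filter-repo --invert-paths --path config/secrets.yml
    # every commit touching that path is rewritten, so all downstream hashes change --
    # which is exactly why a surgical "pretend this didn't happen" feature is missing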
jj pretty much already works this way. Once you push to a (git) remote, only the reachable commits are preserved. In other words, universal undo is a local-only feature.
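Concretely, jj’s operation log is what makes the undo universal; a quick sketch (the operation ID is a placeholder):

    jj op log                      # every operation on the repo, newest first
    jj undo                       # roll back the most recent operation
    jj op restore <operation-id>  # or jump back to any earlier repo state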
On the one hand, if you want both universal undo and relaying changes via external git host, you probably already buy git’s unfortunate idea that some local things are unsyncable at all. Even without that, choosing not to export some of the undo history should be a reasonable thing to achieve, no need to follow the tradeoffs of old MS Word formats.
On the other hand, there are cases when you want to scrub even the local copy; but normally those are enough of a break-glass case to permit clunky and confirmation-requesting commands to do the exceptional dropping of undo history.
(Then again, the author doesn’t seem to promise unlimited undo explicitly, you might be able simply to exhaust the undo depth…)
I’ve worked quite closely with FastCDC for the past year. The problem with FastCDC is that the paper is not specific about how it should be implemented. That has caused different implementations in the wild to use different parameters, so one big blob gets chunked differently by different implementations. So let’s say we have a xit-zig implementation and a xit-rs implementation: it’s likely that each will chunk a tarball in different ways, reducing the effectiveness of deduplication between chunks.
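To illustrate the parameter problem, here’s a sketch using the Rust fastcdc crate (API as I understand it from the crate’s docs; file name hypothetical). The min/avg/max sizes are exactly the knobs the paper leaves open, so two implementations that pick different values will cut the same blob at different boundaries and their chunk hashes won’t line up for deduplication:

    // Cargo.toml: fastcdc = "3"
    use fastcdc::v2020::FastCDC;

    fn main() {
        let data = std::fs::read("big.tar").expect("read input");
        // min, average, and max chunk sizes in bytes -- unspecified by the paper:
        for chunk in FastCDC::new(&data, 16 * 1024, 64 * 1024, 256 * 1024) {
            println!("offset={} length={}", chunk.offset, chunk.length);
        }
    }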
Secondly, git-lfs is quite open about how to extend it. https://github.com/git-lfs/git-lfs/blob/main/docs/extensions.md You can implement your own client-side storage, your own transfer protocol as well as server-side storage on top of the existing implementation. So it’s not hard to apply a FastCDC layer on top of this.
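A rough sketch of the registration format that doc describes, with extension clean/smudge commands running in a pipeline around LFS’s own (the fastcdc-chunk tool name is hypothetical):

    [lfs "extension.fastcdc"]
        clean = fastcdc-chunk clean %f
        smudge = fastcdc-chunk smudge %f
        priority = 0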
Finally, as someone who has administered a handful of git servers for large enterprises over the last few years and had to support x00-x000s of users, I think only 2 of the 5 points listed are relevant to end users today (numbers 1 and 4, git compatibility and large blob support). Would love to be proven wrong though.
Kneejerk response: There’s already much nicer DVCSs, it’s just that…
Oh okay. Well I think e.g. Pijul is nicer than git becau…
Okay I’m interested!
As far as the name goes, a little tool called “git” did just fine.
So did the Nintendo Wii. But it would be nice to not have to override one’s cringe response.
I sent my thoughts to the author ( https://github.com/radarroark/xit/issues/9 ). I will reproduce them here:
Thanks for this project!
Some ideas (note that I have merely read the blog post and didn’t dig further):
It may be a good idea to fully replicate git’s CLI, at least as an option. This will help the project spread.
Migrate away from SHA1. It is broken, and it is one of git’s most unfortunate design mistakes. Also, you should change hashes regularly anyway: https://valerieaurora.org/hash.html . (Actually migrating from SHA1 will likely break github compatibility, so of course it makes sense to support SHA1 for now. But please support other hashes too. Don’t repeat git’s mistake: git originally hardcoded SHA1 everywhere. A quick illustration follows after this list.)
In the past I spent a lot of time researching CDC-and-deduplication. My findings are here: https://github.com/borgbackup/borg/issues/7674 . A short overview of FOSS solutions is here: https://lobste.rs/s/0itosu/look_at_rapidcdc_quickcdc#c_ygqxsl . In short, existing solutions are under-optimized, and there is a lot of low-hanging fruit here. I was easily able to create a very small program in Rust which beats existing deduplication solutions by a wide margin (though my program doesn’t use CDC). So I suggest reading my ideas and comparing the speed of your solution with others.
Patch-based merging seems to be the killer feature (assuming it works well), so I suggest making it the main selling point. Linux devs often maintain their patchsets as series of patch files, not as git branches, exactly because git merging doesn’t work well. So reach out to Linux devs and tell them about your tool. In particular, person number 2 in Linux, Greg KH, maintainer of the stable Linux trees, stores his stable trees as series of patch files in git (aaaah!). Here he describes his workflow: http://www.kroah.com/log/blog/2019/08/14/patch-workflow-with-mutt-2019/ . Key parts are these: “The stable kernel tree, while under development, is kept as a series of patches that need to be applied to the previous release. This series of patches is maintained by using a tool called (quilt)… Anyway, the stable patches are kept in a quilt series in a repository that is kept under version control in git (complex, yeah, sorry.) That queue can always be found (here)”. The same applies to a lot of Debian packages. For example, gcc (and lots of other Debian packages) is, again, maintained as patches-stored-in-git. See here: https://salsa.debian.org/toolchain-team/gcc/-/tree/gcc-14-debian/debian/patches . I think this is, again, because of git merge and git rebase problems. So spread xit as the tool that solves all these problems. Of course, it helps if you are CLI-compatible with git.
“If the first byte is 0, it is uncompressed; if it is 1, it is zlib-compressed”. I suggest moving to zstd; it is better in every way (faster and smaller). Also, zstd may be good at compressing binary files (at least I hope it doesn’t make them significantly larger). “While xit has compression support, it currently disables it even for text files”. Try zstd -0: it is fast enough while giving substantial compression for text files. If that is too slow, try lz4, which is even faster.
“Want to find the descendent(s) of a commit? Uhhh…well, you can’t”. As pointed out on lobsters, you can see descendants: https://lobste.rs/s/mltpfg/xit_is_coming#c_cnwsps . (But I understand your point, i.e. you argue that we need a separate data structure for this.)
Feel free to ask any questions.
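On the hash point above: modern git can already create SHA-256 repositories, though interop with SHA-1 remotes (and hence GitHub) remains limited, which is the compatibility caveat mentioned:

    git init --object-format=sha256 myrepo
    cd myrepo && git rev-parse --show-object-format   # prints: sha256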
Also: even if you implement all these, I still do not plan to use xit. (I’m not trying to insult you, I just am trying to be honest here about my motivations.)
Also, there is discussion of your project here https://lobste.rs/s/mltpfg/xit_is_coming . If you want, I can give you invite
I’m really excited about CDC for blobs! (git LFS doesn’t support this, and it makes using git for blender projects annoying.)
This made me think.
I guess what would make it stand out is supporting “format before diff” with user-configurable formatters (ruff, zig fmt, go fmt, …) before trying to diff code.
It would probably reduce the amount of merge conflicts drastically as well.
Also having something like difftastic would probably help, as it’s line agnostic diffing.
But I guess this also applies to git or any other VCS as well.
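The diff half of that idea can be approximated today with git’s existing textconv hook; a sketch with gofmt, which prints the formatted file to stdout (merging would still need a separate merge driver, which is the part no VCS gives you for free):

    # .gitattributes
    *.go diff=gofmt

    # .git/config (or ~/.gitconfig)
    [diff "gofmt"]
        textconv = gofmt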
From https://github.com/radarroark/xit/blob/master/docs/db.md :
But for me, the bigger problem is that git’s data structures are just not that great. The core data structure it maintains is the tree of commits starting at a given ref. In simple cases it is essentially a linked list, and much like the linked lists you may have used, it can’t efficiently look up an item by index. Want to view the first commit? Keep following the parent commits until you find one with no parent. Want to find the descendent(s) of a commit? Uhhh…well, you can’t.
Ha! This is very good selling point for xit!
Uhhh… well, you use the commit-graph file
(one of the results of Derrick Stolee’s performance improvement work)
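Concretely (the commit hash below is a placeholder):

    git commit-graph write --reachable     # build/refresh the derived graph index
    git rev-list --max-parents=0 HEAD      # the "first commit", without walking by hand
    git rev-list --all --children | grep ^<commit-sha>   # children, derived from known tips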
How do the internals compare with Pijul?
The fact that it’s git compatible makes it very interesting. Point 5 sounds impossible, but worthy of investigation.