So to summarize, this seems like the author wants source control suitable for absolutely huge codebases. There’s a lot more than just a VCS needed to make such codebases manageable (although proper VCS support for understanding sub-parts of the codebase would help with things like CI).
I agree that git does not in fact shine for these use cases, but I generally find it the most capable source control system for the things that matter to me: understanding history, safely merging code, and, exceptionally, manipulating the repo history, all with a minimum of foot guns.
I’ve found cvs and svn to be quite poor for collaboration. I’ve found mercurial to have plenty of time-sucking traps (it may be too powerful). While I don’t think git is the end of all source control, I think the reason it’s won is that it’s the best tool for most users.
While I do agree that CVS and SVN are not as good for collaboration, I do feel that Git has way more foot guns than most other version control systems. A few examples: its terminology uses the same words as other VCSs but for different things (e.g. hg revert and svn revert revert a file, while git revert undoes a commit), it does not protect you from rewriting public history through rebase (Mercurial has built-in mechanisms to prevent this), and it puts a ton of different behaviours into overloaded commands such as checkout. Git works great if you understand its underlying mechanics, but that’s already where Git is at fault. Why do I need to know the data structures of my version control system? I don’t need to know if Vim or Emacs uses a rope, or how they save their swap files; I don’t need to know what Subversion’s databases look like or how clang’s AST works internally. I use the tool. Somehow, to navigate Git safely, what I hear most often is: “learn it”. So I believe that in many ways, if you approach Git as you approach most software, it’s incredibly difficult and has tons of foot guns. It’s a bit like being given a nuclear reactor. It’s plenty powerful, but you also kind of need a nuclear physics degree to run it.
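To make the point about protecting public history concrete, here is a minimal sketch of the general idea: each commit carries a phase, and any history-rewriting operation is refused once a commit has been shared. Mercurial does have "public" and "draft" phases, but everything else here (the `Commit` class, `mark_pushed`, `check_rewrite_allowed`) is hypothetical and only illustrates the concept; it is not Mercurial's actual implementation.

```python
from dataclasses import dataclass

PUBLIC = "public"   # already shared with others; should not be rewritten
DRAFT = "draft"     # local only; safe to rewrite


@dataclass
class Commit:
    node: str            # hypothetical short commit id
    phase: str = DRAFT   # new commits start out as drafts


class PublicHistoryError(Exception):
    """Raised when a history-rewriting operation targets a public commit."""


def mark_pushed(commits):
    """Simulate a push: shared commits move from draft to public."""
    for commit in commits:
        commit.phase = PUBLIC


def check_rewrite_allowed(commits):
    """Refuse to rewrite (rebase, amend, strip) any public commit."""
    blocked = [c.node for c in commits if c.phase == PUBLIC]
    if blocked:
        raise PublicHistoryError(
            "cannot rewrite public commits: " + ", ".join(blocked)
        )


# Usage: "a1" has been pushed, "b2" is still local.
pushed, local = Commit("a1"), Commit("b2")
mark_pushed([pushed])
check_rewrite_allowed([local])            # passes silently
try:
    check_rewrite_allowed([pushed, local])
except PublicHistoryError as err:
    print(err)
```

The design choice worth noticing is that the safety check lives in the tool, not in the user's head: a rebase over shared history fails by default instead of relying on everyone having "learned it".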
I use SVN at work, and git for my own stuff, and I never once had an issue with svn revert vs git revert or svn checkout vs git checkout. In fact, I’ve gotten into bad situations with SVN that are easy to fix in git (mostly related to adding files by mistake). Then again, I find git more pleasant to work with than SVN.
Honestly, I think a Distributed Version Control System (DVCS) is not the correct way to move forward.
It’s the same as with email, docs, Excel sheets, and even your compute servers: all are moving to the cloud. What GitHub and GitLab are proving is that a well-oiled, well-managed centralized version control system is much more desirable:
You get more than just version control, you get code review, issue trackers
You get CI and CD well integrated
You get better scaling for infrastructure that you don’t have to manage.
It’s 2021, everybody who codes does it with an internet connection: Stack Overflow, Google, Hacker News, Reddit, lobste.rs, etc. A connected VCS UX is much more desirable.
The moment we recognize this fact, we would be much better off building something for an online experience.
And that comes with a lot of assumptions that you can make about storage, scalability, distribution, latency, etc.
I personally am keeping a close watch on https://github.com/facebookexperimental/eden/ as it’s well built based on that philosophy. I think this is the most advanced, best-invested open source VCS solution we have to date.
I’ve used CVS, SVN and git. Of the three, I find git the easiest for setting up a new repo. It’s just git init. With CVS, there was more work involved (on both ends) and as a result, I only ever had two personal projects in CVS. I recall that setting up SVN for my own use required an even more insane setup, and I never bothered with it after getting it installed.
And not everybody is comfortable with The Cloud(TM).
Using CVS and SVN as the basis of comparison won’t do centralized VCS justice.
In my mind, it would be something closer to Gitpod or GitHub Codespaces, where you get a cloud instance provisioned with all the dependencies needed for your development plus an IDE server. You can then connect to that IDE server using an IDE client (either the web or an actual IDE).
The idea is that IDE, VCS, CI, CD, monitoring, alerts and logs should/could all be well integrated when they are built together. From a user perspective, it should only be the IDE frontend which they interact with, not Git nor CVS nor Mercurial nor SVN.
For a VCS to scale, you need a server hosting solution that integrates with different components that enable scaling:
Object Storage for large files
Graph database (or something similar) to maintain the relationships between ‘branches/bookmarks’ and between change sets, …
You also want a more mature client solution, with easy to learn UX while having knobs that let power users excel.
Finally, you need a migration path for existing code repositories to move to this new solution, i.e. converting from git/svn/mercurial to my-ideal-vcs is a must-have.
I suppose I’ll chip in the obligatory mention of Fossil. While it isn’t perfect, it can be extended to accommodate most of the author’s criteria, as long as one is willing to write their own custom extensions. For large files, a different mindset is needed; plain git must be substituted with git-annex.
Maybe I’m misunderstanding the author, but something feels off about the “Push/pull bottleneck”. If you have conflicts with what is upstream, you must resolve them, regardless of what VCS you’re on. Comparing my experience with Git and SVN here, I much prefer Git; it has git fetch. So I’m able to easily see incoming conflicts without immediately incorporating them into my workdir.
As far as I know, SVN gives me checkout, which will force me to resolve conflicts right then and there. My experience is that this encourages the team to create larger–not smaller–patches, that are inevitably harder to integrate. Between Git and SVN, I would say SVN is the one with the bottleneck.
Having at varying times had primary responsibility for the maintenance of cvs, svn, hg and git for my teams, I don’t think I agree about what constitutes VCS nirvana.
For my teams, unlimited repo size wasn’t especially important. We maxed out around 200GB anyway, and all but CVS were fine with that.
We thought we wanted permissions like this in some instances, and the maintenance was almost always more pain than it was worth for smallish teams. At our size, repo-level permissions were more appropriate, except during our monorepo years where it just wouldn’t have worked.
I think this might be the thing I miss most from our svn days.
Was never on my radar
IMO the only sane way to remove this push/pull bottleneck is to branch or fork. The ability to commit back when not up-to-date with svn caused me considerable pain once or twice, and as a consequence I consider that an anti-feature.
I consider this an anti-feature of svn.
I think our teams’ preferences on such tools were so strongly opinionated that we would not likely use this feature and would continue using our external tools. I really like Upsource these days and the admin burden there is so slight that it’d take a lot to persuade me not to use it.
also feels like not a proper goal for me
always felt well-addressed by the default non-sparse checkouts of git and hg
feels like overreach to me
I had the same impression that you did, reading the list. Subversion is almost perfect for them. For me/my teams, hg was probably the best of the pack. I wasn’t entirely happy to migrate to git, though it was a sensible thing to do on balance. (I now use git all the time because there was no way for me to avoid git entirely, and hg and git are just similar enough that tracking where they diverge broke my muscle memory. I lost more from that breakage than I lost from giving up hg.)
This was a good overview because it included a lot of different VCS systems some of which I’ve never even heard of (PlasticSCM?) and summarized some of the strengths of them. Quirky format, and I agree that he’s probably best to go with Subversion, but an interesting overview nonetheless. Most VCS articles these days are: “How I worked around the obvious shortcomings of git”
It’s an interesting list, I personally see similar issues, but ask for different solutions:
Unlimited Repository Size: I want the power of a DVCS, that is, local branching and local commits, with an option for a centralized model (a hybrid). For open source, the decentralized model is the best; for companies, the trade-off of being able to support significantly larger repositories (e.g. monorepos) at the expense of centralized infrastructure is useful. You can implement on-demand fetching of data, back up commits automatically, etc.
For pure source control, repository size is usually not a concern. However, developing a full 3D game with artwork etc. requires you to interconnect the art and the current codebase in a way that is best solved by putting both in the same repository, easily exceeding hundreds of gigabytes.
Permissions on dir level: Yes, agreed, permissions are needed. I would add: a way to remove arbitrary commits from anyone’s checkout is required. For open source this is barely an issue, but for large enough companies, being able to remove commits is essential, as someone at some point will accidentally commit data that is legally not allowed in the repo (e.g. license issues, personally identifiable data, etc.).
Sparse Clones / Sparse Checkouts: I disagree here. This is a great solution, but I want a more user-friendly one. I want a file system that virtualizes my checkout no matter the size and fetches files on demand when I need them (see the first point, the hybrid VCS approach). Git FS is a step in the right direction.
Direct update / put: Yes, very useful, particularly for artists. Interestingly, I extend this to direct checkouts. Artist workflows also sometimes need to check out specific subfolders without touching other parts of the tree, as that can be very expensive: I don’t want to update the 5GB 3D model in this directory, but I need an up-to-date version of the texture here.
I disagree. The whole idea of having the repository state bound to a branch makes thinking about source code significantly easier. Directories as branches were a terrible idea in the sense that they are overly complicated, hard to follow, and intermix history lines within a repository checkout in a way that is difficult to reason about.
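The "virtualized checkout that fetches files on demand" idea from the list above can be sketched roughly as follows. This is a toy model only, under the assumption that the server can be represented as an in-memory mapping; `LazyCheckout` and its methods are hypothetical names, not the API of any real virtual file system.

```python
class LazyCheckout:
    """Toy working copy that downloads a file only on first access."""

    def __init__(self, remote_files):
        # remote_files stands in for the central server: path -> bytes
        self._remote = remote_files
        self._local = {}        # files that have already been fetched
        self.fetch_count = 0    # number of simulated network fetches

    def listdir(self):
        """The whole tree is visible even though nothing is downloaded yet."""
        return sorted(self._remote)

    def read(self, path):
        """Return the file content, fetching it from the server if needed."""
        if path not in self._local:
            self._local[path] = self._remote[path]   # simulated download
            self.fetch_count += 1
        return self._local[path]


# Usage: the large model is listed but never downloaded unless it is read.
checkout = LazyCheckout({
    "art/model.bin": b"\x00" * 1024,
    "art/texture.png": b"texture bytes",
    "src/main.c": b"int main(void) { return 0; }",
})
print(checkout.listdir())
checkout.read("art/texture.png")
print(checkout.fetch_count)
```

From the user's side nothing has to be configured up front, which is the difference from a sparse checkout: the cost is paid per file, at the moment the file is actually opened.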
The rest I don’t have much of an opinion on. I do want a few things myself, though:
Meta-History Tracking: Git tracks your history as the engineer wants you to see it, but it does not track the changes to the history itself, besides the reflog. I’d like the ability to track changes of history in a graph-like structure. Mercurial’s obsolescence markers do this, and they allow extensions such as evolution to automatically find the correct rebase targets, share rebases in a more meaningful way, and generally allow both the human and the system to think more about “how” a history came to be.
Better GUIs: Both the Mercurial and Git GUIs I’ve seen are centered around a programmer’s view into source control, but are horrible to use for non-technical people like artists or others who must sometimes use them too. A GUI that focuses on the primitives that people are used to (e.g. file trees) is missing; what exists is too centered around the history view at the moment.
Automatic commit backups: I want to be able to share much more simply, if I agree to it. Have my local commits automatically pushed to a central server, if I agree to it, and if someone knows the hash and the repo, they can get the commit. I want to put into Slack: “hey can you take a quick look at ad42ddb212” and be done with it.
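As a rough illustration of the meta-history idea above, the sketch below records "old commit was replaced by new commit" markers and follows them to find where work based on a rewritten commit should now go. It is a simplification that assumes every rewritten commit has exactly one successor; the function name and the commit ids are made up for the example and do not reflect Mercurial's internal format.

```python
def latest_successor(node, markers):
    """Follow rewrite markers from a commit to its newest version.

    markers maps an obsolete commit id to the id of the commit that
    replaced it (for example after an amend or a rebase).
    """
    seen = set()
    while node in markers:
        if node in seen:
            raise ValueError("cycle in rewrite markers at " + node)
        seen.add(node)
        node = markers[node]
    return node


# "a1" was amended into "a2", which was later rebased into "a3".
markers = {"a1": "a2", "a2": "a3"}

# A colleague's commit still sits on top of the old "a1"; the markers
# say it should be moved onto "a3".
print(latest_successor("a1", markers))
# A commit that was never rewritten is its own latest version.
print(latest_successor("b7", markers))
```

Because the markers are plain data, they can be exchanged between clones, which is what lets a tool work out rebase targets on its own instead of asking the user to reconstruct what happened from a reflog.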
Just a thought, but do you have something close to that in Git/Mercurial with push? While not fully automatic, both have the concept of pushing all refs to a remote, which someone else can see/pull/fetch via sha. That what you’re thinking of here?
(both those two also have the idea of hosting directly from your local clone, but I’m assuming you’re not talking to people on your same network).
Interesting. Eden is essentially the surrender to stay compatible with Mercurial.
I also keep an eye on Pijul and wonder how well it will scale.
Pijul is too … theoretical in its current state. Same with https://github.com/martinvonz/jj
It seems that Subversion nearly fulfills the goals? Only 8 and 9 are gaps to me, and 8 does not feel like a proper goal to me.