This seems like a whole lot of work in abstraction for little to no practical benefit. Granted, it got turned into a library, but how did this library benefit from this abstraction other than by increasing its size?
All this abstraction gives you nothing beyond the obvious facts that diffs can (sometimes) be applied one after the other and that they can be reversed. Knowing that a merge is a pushout brings you no closer to actually performing a merge, which is the central problem that a VCS must solve. This version of the theory also skirts the interesting question of when patches commute (i.e. no merge conflicts), although I suppose you could draw more abstract commutative diagrams to define commutation, without actually giving a method to determine when this happens.
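To make "applied one after the other" and "reversed" concrete, here is a minimal sketch in Python. The names `Insert`, `Delete`, `apply_all` and `commute_on` are made up for illustration and are not taken from the post or its library. The commutation check is deliberately naive: it tries both orders on one concrete document rather than deciding commutation in general.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    index: int  # position at which the line is inserted
    line: str

    def apply(self, doc):
        if not 0 <= self.index <= len(doc):
            raise ValueError("patch does not apply to this document")
        return doc[:self.index] + [self.line] + doc[self.index:]

    def invert(self):
        return Delete(self.index, self.line)


@dataclass(frozen=True)
class Delete:
    index: int  # position of the line being removed
    line: str   # the line we expect to find there

    def apply(self, doc):
        if not (0 <= self.index < len(doc) and doc[self.index] == self.line):
            raise ValueError("patch does not apply to this document")
        return doc[:self.index] + doc[self.index + 1:]

    def invert(self):
        return Insert(self.index, self.line)


def apply_all(patches, doc):
    # Composition: apply each patch to the result of the previous one.
    for patch in patches:
        doc = patch.apply(doc)
    return doc


def commute_on(p, q, doc):
    # Naive check on a single document: same result in either order?
    try:
        return apply_all([p, q], doc) == apply_all([q, p], doc)
    except ValueError:
        return False
```

With `doc = ["a", "b"]`, applying `Insert(0, "x")` and then its inverse returns the original document, but `commute_on(Insert(0, "x"), Insert(2, "y"), doc)` is false because the second index is not adjusted for the first insertion. Working out that adjustment is exactly the part the abstraction does not hand you.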
If you want to exercise your neurons, a much more practical use of your time would be to understand diff algorithms (e.g. Myers, patience) and 3-way merge algorithms (e.g. suremerge). Understanding and improving those two can lead to lovely results, such as a semantic diff/merge that works at the syntax level instead of the line-by-line level.
Traditionally the value of these abstractions is that they enable code reuse. Maybe there’s an existing library that offers some Category-based interface that was originally written for e.g. topology that could do something useful when called with this “patch category”?
The post struck me as more of a “here’s a workflow for workaday category theory” in the same vein as, say, “deploy your first rails app with docker” rather than a novel abstraction. If the former, I’m really happy to see more stuff like this: using real mathematics to solve small problems can be scary! The more resources like this that can be used to seduce developers into using well-defined (or, for that matter, defined at all) formalisms, the better.
I grant that this gave me a very good example for pushouts, which in turn makes me feel like I really know what they are. I always thought of pullbacks in terms of differential geometry, though, because that’s where I first encountered them (maths degree).
What I wanted to get across, though, is that all of this category theory, at least in this case, does not seem to be something that hackers need to know to any degree of detail any more than they need to know quantum mechanics. Sure, it’s fun to know, but it doesn’t seem to translate into a better way to write code.
Or maybe a Haskeller would love to prove me wrong.
I disagree!¹ And if you’ll bear with me through some particularly turgid imagery, I may be able to explain why.
If you get nothing else from this, I want to emphasize that the value of knowledge is not always in the direct application of that knowledge. In fact, the following statement alone makes this worthwhile as a pedagogical tool.
I grant that this gave me a very good example for pushouts, which in turn makes me feel like I really know what they are.
In particular, it’s nice to have good reassurance that, while the terminology may still be foreign, it’s not the arduous, tooth-and-nail clamber up the rain-slicked walls of a massive, throbbing ivory tower that category and computability theory often seem like to an outside observer. A painter does not, in general, need to know the finer points of difference between the tube colors quinacridone magenta and alizarin crimson, but knowing them expands the repertoire of tools they can use to effectively solve a problem. And, if you’ll permit me to be annoying for a moment:
[…] all of this category theory, […] does not seem to be something that hackers need to know to any degree of detail any more than they need to know quantum mechanics.
There is a surprisingly close relation between theoretical physics (such as is used in quantum mechanics) and the mighty buzzphrase Deep Learning. It’s not an immediate, blinding revelation that will help spawn the CMS you have to write, no. It is, however, a useful thing to know and keep in the back of your head until the time is appropriate.
Too, I’m dubious of the idea that this particular nugget of category theory isn’t useful in and of itself. Consider a distributed system where you can’t afford, don’t want, or don’t need a full-blown CRDT, and an 80% solution where you drop the commutativity requirement may work just as well for your data type: this patch specification is dead simple to implement and still well behaved. Where I could spend months trying to twist my data model to fit the ideas of a CvRDT, I could instead toss this simple (almost simplistic) and well-behaved idea at the problem and (maybe) patch it up by assuming that TCP is a reliable-enough transport. These are real bang-for-your-buck tradeoffs that are the lifeblood of engineering as a discipline, if not a height-weight proportional hack with a charming and bubbly personality.
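As a rough sketch of what that 80% solution might look like (my own guess, not the post's specification): treat a patch as a map from key to an (old, new) pair, refuse to apply it when the old values don't match, and lean on ordered delivery instead of commutativity. `apply_patch`, `invert`, `Replica` and the sequence-number scheme are all hypothetical names invented for this example.

```python
MISSING = object()  # sentinel meaning "this key is absent"


def apply_patch(state, patch):
    # A patch maps key -> (old, new) and applies only if every old value matches.
    for key, (old, _new) in patch.items():
        if state.get(key, MISSING) != old:
            raise ValueError(f"patch does not apply at key {key!r}")
    result = dict(state)
    for key, (_old, new) in patch.items():
        if new is MISSING:
            result.pop(key, None)
        else:
            result[key] = new
    return result


def invert(patch):
    # Reversing a patch just swaps old and new for every key.
    return {key: (new, old) for key, (old, new) in patch.items()}


class Replica:
    def __init__(self, state):
        self.state = dict(state)
        self.next_seq = 0

    def receive(self, seq, patch):
        # No commutativity: patches must arrive in order, so we rely on
        # the transport (e.g. a single TCP stream) to preserve ordering.
        if seq != self.next_seq:
            raise RuntimeError("out-of-order patch")
        self.state = apply_patch(self.state, patch)
        self.next_seq += 1
```

A replica that receives `{"x": (1, 2)}` followed by its inverse ends up back where it started, and a patch that arrives with the wrong sequence number is rejected rather than merged. Whether rejecting is acceptable is the tradeoff you'd be making against a real CRDT.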
My category theory is bogo, but a common problem in merge resolution is the following. Alice adds a line. Bob adds a line. (The same line.) Sometimes the correct resolution is to add that line. Sometimes the correct resolution is to add both lines. Sometimes it’s to add a single different line. How does the theory handle this?
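A small Python sketch of why this is awkward, assuming for simplicity that both edits only append lines to the end of the base. `added_lines`, `resolve` and the policy names are invented here for illustration.

```python
def added_lines(base, side):
    # Assumes the edit only appended lines to the end of base.
    if side[:len(base)] != base:
        raise ValueError("edit is not a pure append")
    return side[len(base):]


def resolve(base, alice, bob, policy, replacement=None):
    a = added_lines(base, alice)
    b = added_lines(base, bob)
    if policy == "once":
        # Treat the identical additions as the same change.
        if a != b:
            raise ValueError("additions differ")
        return base + a
    if policy == "both":
        # Treat them as two independent changes that happen to look alike.
        return base + a + b
    if policy == "replace":
        # A human decides that one different line subsumes both.
        return base + [replacement]
    raise ValueError(f"unknown policy {policy!r}")
```

With `base = ["total = 0"]` and both Alice and Bob appending `"total += 1"`, the three policies give three different files from exactly the same inputs. The choice of policy has to come from somewhere outside the two patches.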
it doesn’t; i guess this is the point jordi is making above (about exploring the actual patch-resolution mechanisms), this detail is lost in the bit about the author making it a monoid but not defining how patches should compose. which seems like a crazy hack to me.