Here’s an introduction and demo given at Strange Loop.
I found this page better than the main docs page for finding out what the language is, because most of the text doesn’t actually explain what Unison is. The idea is very cool, though I’m not 100% sure I understand it correctly (why couldn’t it be implemented on top of an existing language?). Will be interesting to see where it goes.
Mostly off-topic (sorry); does anyone else find the site hard to read/wrap one’s head around? The design and content feel claustrophobic. My screen is 1920 pixels wide, why is all the text squeezed into a ~900px wide column?…
I saw the Strange Loop demo, and the biggest unanswered question was “how do you refactor an existing function?” This wasn’t answered, and the edit docs don’t cover that case. I think the answer is supposed to be “the tooling takes care of it”, but that sounds risky to me.
My understanding from the demo is that the refactored function would get a new hash, and the name associated with the previous function’s hash would get updated. The speaker noted that the name association is stored separately, so a dependency/refactor update is trivial for this reason. Dependent functions can update that reference and that is it (assuming a true refactor that maintains the same type signature). Although I may not be understanding some subtlety of your question.
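If I’ve understood the model, the following toy sketch (Python, purely illustrative; not Unison’s actual data structures) captures the separation: definitions live in an append-only store keyed by content hash, and names are just a separate mapping onto those hashes, so an update creates a new hash and repoints the name.

```python
# Hypothetical sketch (not Unison's real implementation): definitions are
# stored by a hash of their content, and human-readable names map to hashes
# in a separate table, so an "update" only rewrites that table.
import hashlib

def content_hash(definition: str) -> str:
    """Hash the source text of a definition."""
    return hashlib.sha256(definition.encode("utf-8")).hexdigest()[:12]

code_store = {}   # hash -> definition (immutable, append-only)
names = {}        # name -> hash (the only thing an update repoints)

old = "factorial n = product (range 1 (n + 1))"
names["factorial"] = content_hash(old)
code_store[names["factorial"]] = old

# A refactor with the same type signature produces a *new* hash;
# the old definition stays in the store, and the name is repointed.
new = "factorial n = foldl (*) 1 (range 1 (n + 1))"
names["factorial"] = content_hash(new)
code_store[names["factorial"]] = new

print(names["factorial"], len(code_store))  # new hash; both versions kept
```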
It’s the “Dependent functions can update that reference and that is it” that I’m hung up on. One of the selling points is that you can have two versions of the same function, which eliminates dependency conflicts. Consider the following case:
A -> B -> C
A' -> B -> C
I discover a bug in C’s implementation and refactor it to C'. The tooling automatically updates B, which calls C, to B', which calls C'. Do we transitively update A? What about A'? What happens when the call chain is now 20 functions deep? Case two:
A'' -> B' -> C'
A' -> B -> C
Turns out there was a second bug in C, and I have not yet pulled C'. I release C'' off C. How do we merge the change with C'? What if there are merge conflicts? Do we end up with a fragmented ecosystem? Case three:
A -> B -> C -> D -> E -> F
C and F are in separate libraries. I see a bug in C and make C'; somebody else sees a bug in F at the same time and pushes F'. What happens to A?
Here is the applicable part of the StrangeLoop talk: https://youtu.be/gCWtkvDQ2ZI?t=1395
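To make the “two versions can coexist” point concrete, here’s how I picture the first case as a toy Python model (the store, the publish helper, and the hashes are all made up for illustration): because callers reference callees by hash rather than by name, the old chain and the refactored chain can both live in the codebase without conflicting.

```python
# Toy illustration (not real Unison): callers reference callees by hash,
# so an old chain and a refactored chain can coexist in one codebase.
import hashlib

def h(src: str) -> str:
    return hashlib.sha256(src.encode()).hexdigest()[:8]

store = {}

def publish(src: str) -> str:
    ref = h(src)
    store[ref] = src
    return ref

c_ref  = publish("C: original implementation (has a bug)")
c2_ref = publish("C': bug-fixed implementation")

# B embeds the hash of the C it was built against, so B and B' are distinct.
b_ref  = publish(f"B: calls {c_ref}")
b2_ref = publish(f"B': calls {c2_ref}")

a_ref  = publish(f"A: calls {b_ref}")     # still on the old chain
a2_ref = publish(f"A'': calls {b2_ref}")  # on the refactored chain

# Both chains resolve, with no ambiguity about which C each one gets.
print(store[a_ref], "->", store[b_ref], "->", store[c_ref])
print(store[a2_ref], "->", store[b2_ref], "->", store[c2_ref])
```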
The speaker’s example relies on/assumes different namespaces (at 26:16), but maybe the suggestion is that if you want to maintain two different versions, then they must ultimately be named differently. So a refactor of an existing type would not actually differentiate itself as a separate version unless you name it something different.
That said, since all types are content addressable, you can still give each type a different name. It may be a matter of whether you choose to do that in your source, or you simply keep the one name and therefore the new version implicitly replaces the previous versions (similar to git, but at type level rather than file).
Do we transitively update A?
Correct, this is not answered in the talk. I can only speculate that the IR of hashes is updated to reflect the change unless you give it a new name in the textual/source representation. My guess is that if a fix to C or F is pushed, references will be implicitly updated (from the name C or F to the new hash). The Merkle tree will update accordingly. Of course, if the names of C’ or F’ are changed and pushed, then the existing types will not implicitly update.
Again this is speculation, but I am enjoying the conversation.
Some details about propagation: https://twitter.com/unisonweb/status/1173942969726054401
The way update propagation works is this: first we visit immediate dependents of the old and update them to point to the new hash. This alters their hash. We repeat for their dependents, and so on…
…if the update isn’t type preserving, the ‘todo’ command walks you through a structured refactoring process where the programmer specifies how to update dependents of old hashes.
Dependency chains in codebases written by humans tend to be pretty small. If it were even 100 that would be a lot.
Once this manual process of updating reaches a “type preserving frontier”, the rest of the way can be propagated automatically.
Also, these mappings from old hash to new hash are recorded in a Unison “patch”. A patch can be applied to any Unison namespace.
Important asterisk: for changes to type declarations, right now all we support is manual propagation. Which can be quite tedious. We are working on fixing this!
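To make that propagation step concrete, here’s a rough Python sketch under my reading of it (the store, publish helper, and patch format are invented for illustration, and it skips the type-checking / ‘todo’ part entirely): apply the old-hash → new-hash mapping to anything that mentions an old hash, which gives that dependent a new hash too, and repeat until the patch stops growing.

```python
# Toy propagation of a patch (old hash -> new hash) through dependents.
# Invented data structures for illustration only; real Unison also tracks
# whether each step is type-preserving and asks the programmer when it isn't.
import hashlib

def h(src: str) -> str:
    return hashlib.sha256(src.encode()).hexdigest()[:8]

store = {}  # hash -> source (references appear inline as hashes)

def publish(src: str) -> str:
    ref = h(src)
    store[ref] = src
    return ref

c = publish("C v1")
b = publish(f"B calls {c}")
a = publish(f"A calls {b}")

c_new = publish("C v2 (bug fix)")
patch = {c: c_new}  # the recorded mapping, reusable in other namespaces

# Repeatedly rewrite any definition that mentions an old hash; each rewrite
# yields a new hash, which is added to the patch and propagated in turn.
changed = True
while changed:
    changed = False
    for ref, src in list(store.items()):
        new_src = src
        for old_ref, new_ref in patch.items():
            new_src = new_src.replace(old_ref, new_ref)
        if new_src != src and ref not in patch:
            patch[ref] = publish(new_src)
            changed = True

print(patch)  # C -> C', B -> B', A -> A'
```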
I’m a bit confused about this language. The core idea seems to be to take something cool—static linking—and apply it pervasively, cutting down notions of interface boundaries and ensuring that once you depend on something, regardless of whether you’re a whole program or module or individual function, you always get that thing, forever. Great.
Except static linking isn’t cool, and this idea raises it from an annoyance to the worst nightmare I can imagine. Static linking (and containers and full-stack installers) killed ABI-level abstraction: a possibly-transitive dependency changes an implementation detail, and you have to relink everything. Content-addressed references kill API-level abstraction too: a dependency somewhere changes an implementation detail, and your source code needs to change to reference it. Whether it’s a newly-discovered speed optimization or a third-party API change, every detail now demands a change in everything downstream of it. And if the tools make this find-and-replace operation as easy as it’ll need to be, it’s hard to see the point in the concept in the first place.
I really hope I’ve fundamentally misunderstood this.
It’s interesting that you say this, because most code written today is an unmediated collection of static and dynamic entities in an environment without a concrete opinion. By default, we’ve ended up in a place where some code is static by convention (e.g., glibc; unmaintained code; legacy historical code), and some code is highly dynamic (e.g. leftpad), but this area has been absent of mechanisms and processes.
It’s possible that this is the wrong mechanism and the wrong process, but let’s love the idea together for a moment that all code is absolutely immutably static.
Some of the guarantees that gives us are kind of interesting, in the same way that immutable data gives us some interesting qualities (e.g., GC gets real easy, spooky action at a distance is far more constrained, refactoring gets less scary). For example, you could imagine that a gigantic, internet-wide, parallelized perfect glibc could emergently form, because it’s possible that everyone is now building on everyone else’s fundamental algorithm work, rather than each person starting again from scratch. Obviously, deployment becomes trivial, just as it does today when you explicitly specify dependency versions.
There are some serious and interesting problems – like, how do you make sure an upgrade is an upgrade, etc. But even some ‘difficult’ problems like forced upgrades for security measures are orthogonal to this basic idea and can be implemented as tooling.
Overall an interesting direction.
I’m not sure how relevant it is that “all code is absolutely immutably static”.
The important point about the status quo is that we have symbolic references at our disposal. Whether it’s glibc adding a better allocator or left-pad discovering a faster way to pad a string, I shouldn’t need to care that these things are being modified all the time. And if I don’t care about left-pad getting faster under me, I care even less if a dependency a mile down the tree depends on left-pad too. You can name all sorts of problems with this idea in practice, and I won’t argue with you: the reality is kinda terrible. But despite all the terribleness, it works well enough often enough to hold considerable value—I can usually update dependencies without changing my source code at all. If source code contains all of its dependencies, this no longer makes sense.
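As a minimal sketch of the contrast I mean (Python; the registry and version tags are invented for illustration): a symbolic reference resolves through the name and silently picks up whatever the maintainer publishes, while a content-pinned reference keeps the exact implementation it was written against until the source itself is edited.

```python
# Minimal contrast between a symbolic and a content-pinned reference.
# Everything here is invented for illustration.
registry = {
    "left_pad@v1": lambda s, n: (" " * (n - len(s))) + s,  # original
    "left_pad@v2": lambda s, n: s.rjust(n),                # faster rewrite
}
latest = {"left_pad": "left_pad@v2"}  # the maintainer moved the name

# Symbolic reference: resolved through the name, picks up the improvement.
def pad_symbolic(s, n):
    return registry[latest["left_pad"]](s, n)

# Content-pinned reference: bound to one exact implementation; getting the
# improvement means editing this line (or running a tool that edits it).
def pad_pinned(s, n):
    return registry["left_pad@v1"](s, n)

print(pad_symbolic("hi", 5), "|", pad_pinned("hi", 5))
```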
For example, you could imagine that a gigantic, internet-wide, parallelized perfect glibc could emergently form, because it’s possible that everyone is now building on everyone else’s fundamental algorithm work, rather than each person starting again from scratch.
I’m not really sure what you’re getting at here. Could you elaborate?
There are some serious and interesting problems – like, how do you make sure an upgrade is an upgrade, etc. But even some ‘difficult’ problems like forced upgrades for security measures are orthogonal to this basic idea and can be implemented as tooling.
I absolutely disagree with this. The defining property of Unison removes the layer of indirection between depender and dependee. Upgrading one’s dependencies can, by design, be achieved only by modifying the source code. That’s not orthogonal to the basic idea—it’s as direct a consequence as you can get.
And if you make this easy by hiding it with tooling somehow, code starts not to look so absolutely immutably static after all.
You absolutely do turn out to care that, e.g., leftpad is being modified all the time, especially when it’s changed out from under you. In practice many changes, even well-intentioned “upgrade to performance only” changes, cause the sum of your program to stop working. What if it didn’t have to be terrible? What if we could find a way to get beyond this shitty local maximum we’re sitting on?
On the ‘internet-wide parallelized perfect glibc’ – right now, how do you contribute code? You might for example put up a pull request on GitHub and walk through a long dance with the maintainer to get it out there. Maybe your code will be included in a future library release. In a content-addressable space, imagine that you want to upgrade sort() from bubble to merge (or something). Now, you implement that function the way you want, and you publish it, and it’s available to everyone else’s editor immediately. Suddenly there are ten thousand sort()s when you try to autocomplete, each one with different tradeoffs and optionalities and performance conditions. Now imagine that your environment (or some curator) can weight the ones that are best for your current condition up to the top, and shadow off those which aren’t good or appropriate at the time.
Over the course of a few decades, you’re going to have an emergent glibc-alike that is essentially perfect. Every variant of every algorithm and data structure will be available, and there will be clear winners optimized for various cache architectures and instruction sets.
I think you’re taking a highly optimistic look at today’s dependency management situation and a highly pessimistic look at the opportunities for intentional dependency management. Try it the other way around! I agree that this is quite possibly not the right idea yet, but if the node.js ecosystem is the pinnacle of achievement in this space then we are all thoroughly fucked.
This is a really fascinating concept.
It seems to really test some of the ideas laid out by Rich Hickey several years ago in this talk. It’s been some years since I watched the talk, but I remember walking away thinking “wow, we really all do versioning of dependencies wrong”. A core argument seemed to be around this idea that once we depend on a function, it should NEVER change out from under us.
Taking immutability as a value (heh) and applying it to source code itself is really, really intriguing to me.
I was mildly surprised there doesn’t seem to be a blockchain involved anywhere.