1. 25

  2. 8

    “Unison is a language in which programs are not text. That is, the source of truth for a program is not its textual representation as source code, but its structured representation as an abstract syntax tree.”

    I keep debating this idea and it seems to meet with lots of resistance. I love it, because it means we can have non-text-based editors and multiple text-based syntaxes, meaning you can write in (within reason) a style you enjoy.

    You might write this code:

    function sum(numbers: enumerable of integer) returns integer =>
        numbers |> Enum.reduce(0) { |acc, n| acc + n }

    But when someone else goes to edit it, they see this.

    def sum(integer[] numbers) : integer {
        reduce(numbers, 0, lambda acc, n => { acc + n } )
    }

    And when you come back to it - it looks exactly as you wrote it.

    We’re already part way there with some languages having an agreed-upon style enforcing tool, e.g. ReSharper, IDEA-based editors, mix format for Elixir…

    Here, though, we’re doing the opposite - saying style is up to you and it won’t affect anyone else.

    Here’s another way someone might see - and edit - the code above. Similar in terms of syntax but different layout.

    def sum( integer[] numbers )
      : integer
    {
      reduce( numbers, 0,
            λ acc, n => {
               acc + n
            } )
    }

    This doesn’t change the higher level code structure, of course. If you implement the function using reduce(), it’s reduce() for everyone.
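To make the "one tree, many syntaxes" idea concrete, here is a minimal sketch in Python. It is not how Unison works internally: the node classes (`Function`, `Reduce`, `Lambda`) and both renderers are hypothetical, and the integer types are hard-coded to keep it short. It only shows that a single stored structure can be printed in roughly the first two styles above.

```python
from dataclasses import dataclass


# Hypothetical mini-AST: just enough nodes to describe sum() via reduce().
@dataclass(frozen=True)
class Lambda:
    params: tuple
    body: str  # body kept as a plain expression string for brevity


@dataclass(frozen=True)
class Reduce:
    source: str
    initial: int
    fn: Lambda


@dataclass(frozen=True)
class Function:
    name: str
    param: str
    body: Reduce


def render_pipeline(f: Function) -> str:
    """Print the tree in the pipeline-flavoured style."""
    r = f.body
    args = ", ".join(r.fn.params)
    return (
        f"function {f.name}({f.param}: enumerable of integer) returns integer =>\n"
        f"    {r.source} |> Enum.reduce({r.initial}) {{ |{args}| {r.fn.body} }}"
    )


def render_braces(f: Function) -> str:
    """Print the same tree in the brace-flavoured style."""
    r = f.body
    args = ", ".join(r.fn.params)
    return (
        f"def {f.name}(integer[] {f.param}) : integer {{\n"
        f"    reduce({r.source}, {r.initial}, lambda {args} => {{ {r.fn.body} }})\n"
        f"}}"
    )


# The single source of truth; both views are derived from it.
SUM = Function("sum", "numbers", Reduce("numbers", 0, Lambda(("acc", "n"), "acc + n")))

print(render_pipeline(SUM))
print(render_braces(SUM))
```

Whichever view you edit, the change lands in the tree, so it is still `Reduce` for everyone; only the printing differs.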

    What, though, about diffs? Don’t we break them?

    Actually no, I think we make them better. No more trying to see functional changes amidst a sea of formatting differences applied by a developer or an IDE/editor. I know people ‘shouldn’t’ make commits which combine both, but the reality is that they do.

    An AST-based storage system would mean that you’d never have formatting-only changes, and, crucially, diffs would be shown in the style you prefer!
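You can already get a taste of this with an ordinary text-based language. A small sketch using Python's standard `ast` module (the helper name `same_structure` and the sample snippets are mine, purely for illustration): two sources that differ only in layout and comments compare equal as trees, while a functional change does not.

```python
import ast


def same_structure(src_a: str, src_b: str) -> bool:
    """True when two sources parse to the same tree, ignoring layout."""
    # ast.dump leaves out line/column attributes by default, so whitespace,
    # line breaks and comments do not affect the comparison.
    return ast.dump(ast.parse(src_a)) == ast.dump(ast.parse(src_b))


compact = "def total(numbers):\n    return sum(n for n in numbers)\n"
spread = (
    "def total( numbers ):\n"
    "\n"
    "        return sum(\n"
    "            n for n in numbers\n"
    "        )  # reformatted, same code\n"
)
changed = "def total(numbers):\n    return sum(n * 2 for n in numbers)\n"

print(same_structure(compact, spread))   # True: formatting-only change
print(same_structure(compact, changed))  # False: functional change
```

A diff tool built on the tree would report nothing for the first pair and exactly one changed expression for the second.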

    Just one more thing: You can indent with tabs, spaces, anything you like!

    1. 2

      I may be missing some important pieces, but this looks like the coolest tech story I’ve read in years.

      Each coder could have their own code annotations if they wanted. Combine that with including the deployment environment inside the code? With the Erlang roots, you could write code and have it run any place from JavaScript in an HTML page to a massively-distributed cloud deployment, all either without changing anything (and having it scale automatically, presumably) or changing very little.

      The cross-language thing is also amazing. The biggest downside I can see is that it would be possible to look at code and think it should work when it doesn’t. But hell, that’s what we have already today. With this setup, if you plug part A into part B, it’s always going to work the same way. There would be no such thing as CI/CD/etc pipelines and testing. There’d be no need for them.

      You could work on a team internationally where each person had the code in their own native tongue and using the programming language of their choice, and it’s all integrated by default.


      1. 4

        I agree that the idea has a lot of potential. I’ve known people who tried to implement some of these ideas on top of existing languages (e.g. “what if we could save Ruby code in AST format and then display it with syntax of choice”).

        Another application I can think of is localizing code to be less English-centric. It sucks that in order to learn to program, you also need to learn a foreign language, or at least a subset of it. It makes programming less accessible to a lot of people.

        But I think that for this idea of coding with ASTs instead of text to have any substantial success, it would need a level of adoption across all sorts of media that is practically infeasible. Today source code is text, so it’s “compatible” with every medium that accepts text. If the source code were the AST, that would no longer be the case.

        Of course we could imagine our desktop environments and code editors knowing that source files for X programming language should not be displayed as text, but first converted to text using the user’s syntactic preferences and then displayed. And we could imagine teaching other tools such as git diff, less to do the same, or maybe implement replacements for those. But, could we also expect that messaging systems that allow for code snippets like reduce (+) 0 numbers to implement that logic such that I could see that snippet in a more familiar syntax here in Lobsters? Or what about code in articles, blog posts, ebooks, or even printed in physical books or on videos? Or what about programming classes, which syntax should they use?

        I think having such a level of syntactic freedom, but only on a limited range of media, would probably dilute the benefits of said syntactic freedom quite a lot. After all, code is for communication, and if we can’t communicate seamlessly, because at some point we realize we’re not “speaking the same language”, maybe it’s better to give up some of that personal syntactic freedom in exchange for greater freedom of communication between people using a common lingua franca.

        But IDK, the idea is still interesting! :D

        1. 2

          That’s a nice identification of the issues and some suggestions.

          I wonder if we’re still just not talking about a load/save filter. The input, whether it’s from a code snippet tool or a repo, has a certain hash. Whatever tool you’re using also has a certain hash. As long as you had the symbol information handy (also a hash), shouldn’t there be some sort of lookup back and forth?

          Maybe I missed something. Apologies if I did. I’m just spitballing. The insight here, I think, is that the hash and trees sit on top of whatever we’re doing anyway, and as long as we keep track of the user’s metadata at the same time, we should be able to treat the code separately from how the user thinks of the code.

          So at some point beyond the implementation details, this is all just a higher-level git, with matching, hashing, and so forth. The only difference is it’s at the meta level, and the underlying code must be pure FP. It fails otherwise.
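A toy sketch of that split, to show the shape of the idea rather than Unison's actual hashing scheme (which I haven't checked in detail). The nested-list tree encoding, the `$0`/`$1` positional markers and the `names` table are all assumptions made up for this example.

```python
import hashlib
import json


def normalise(tree, params):
    """Replace parameter names with positional markers so renaming keeps the hash."""
    if isinstance(tree, list):
        return [normalise(t, params) for t in tree]
    if tree in params:
        return f"${params.index(tree)}"
    return tree


def structural_hash(tree) -> str:
    """Hash a nested-list tree whose local names were already normalised."""
    canonical = json.dumps(tree, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two users' views of the same definition: different names, same shape.
english = (["acc", "n"], ["add", "acc", "n"])
spanish = (["suma", "x"], ["add", "suma", "x"])

h_en = structural_hash(normalise(english[1], english[0]))
h_es = structural_hash(normalise(spanish[1], spanish[0]))

# Per-user metadata lives beside the code, keyed by the hash.
names = {"alice": {h_en: "plus"}, "beto": {h_es: "sumar"}}

print(h_en == h_es)  # True: one definition, two sets of names
```

The lookup "back and forth" is then just a dictionary: hash to the user's preferred name on load, name to hash on save.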

          Definitely fun to think about! :)

          1. 2

            Oh, thanks for the reply. It made me realize that I wasn’t connecting all the dots. I had read about Unison before, and had also discussed the concept of saving code as ASTs instead of text. But I hadn’t fully made the connection of the latter with the idea of referencing by hashes.

            And yeah, the two concepts seem to mash up together very nicely. Referencing things by hashes would definitely help in not falling into that sort of “uncanny valley” I mentioned of tools not understanding this. Because, even if the tool didn’t know how to display the code related to that hash, at least you’d have the hash to look it up :)

            And, couple that with some URI protocol that browsers or other tools might be modified to understand and display properly, like unison:<THE_HASH>, and you get a pretty universal way of referencing these things (URIs are text, so yay: text again! :D).
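For what it's worth, the tool side of that could be tiny. A sketch, with the caveat that the `unison:` scheme is just my suggestion above and not something that exists, and that `STORE`, its key and `resolve` are made up for the example:

```python
# Hypothetical local store: hash -> code rendered in the reader's preferred syntax.
STORE = {"abc123": "reduce (+) 0 numbers"}


def resolve(uri: str) -> str:
    """Turn a unison:<hash> reference into displayable text, if we know the hash."""
    scheme, sep, digest = uri.partition(":")
    if scheme != "unison" or not sep or not digest:
        raise ValueError(f"not a unison reference: {uri!r}")
    # A tool that does not have the definition can still show the hash itself.
    return STORE.get(digest, f"<unknown definition {digest}>")


print(resolve("unison:abc123"))
print(resolve("unison:ffff00"))
```

The fallback branch is the important part: even a tool that can't render the code leaves you with something to look up.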

            “Definitely fun to think about! :)”

            Definitely, yes

      2. 1

        Question: I think this looks really cool, but how do you think Unison will fly in practice? You seem to have a lot more insight into this stuff than I do. I wonder if it’s just going to be a cool academic idea that never really grounds out to any adoption because it’s just a few steps too far for people (although I guess this was also true of Rust, but Rust had the full force of Mozilla and a need for Gecko to improve).

      3. 4

        The [language reference] is a good read. Syntactically, it seems very close to Haskell, but it’s not lazy by default, and it has algebraic effects called “abilities”.

        1. 1
        2. 3

          Ahh, Paul Chiusano is also an author of Functional Programming in Scala, one of the most helpful and eye-opening programming books for me. It really helped me begin to grasp some key functional programming concepts after I was kind of struggling to become productive in Scala.

          Some cool things that jump out to me about Unison:

          1. 3

            “We’re going to be able to cache test results so we don’t have to keep running the same tests over and over.”

            That’s a cool idea! I wonder if it’d be possible to backport this idea to e.g. Haskell. Though that said, for me all of my slow tests are ones that check against a database, so this wouldn’t help in that case. Still, I like the direction this is going in. I have strong feelings about the Blub Paradox so it’s nice to see a language that isn’t just another re-skin of C.
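A rough sketch of how such a cache could look, assuming the code under test is pure. This is my own illustration, not Unison's mechanism: the `TestCache` class is hypothetical, and it hashes source text where a real system would hash the definition's tree.

```python
import hashlib


class TestCache:
    """Remember test outcomes keyed by a hash of the code and the test."""

    def __init__(self):
        self._results = {}
        self.runs = 0  # how many times a test was actually executed

    def run(self, code_src: str, test_src: str) -> bool:
        key = hashlib.sha256((code_src + "\0" + test_src).encode("utf-8")).hexdigest()
        if key not in self._results:
            self.runs += 1
            scope = {}
            exec(code_src, scope)  # define the code under test
            try:
                exec(test_src, scope)  # the test is a block of assert statements
                self._results[key] = True
            except AssertionError:
                self._results[key] = False
        return self._results[key]


cache = TestCache()
code = "def total(xs):\n    return sum(xs)\n"
test = "assert total([1, 2, 3]) == 6\n"

cache.run(code, test)
cache.run(code, test)  # same hash: served from the cache
print(cache.runs)      # 1

cache.run(code.replace("sum(xs)", "sum(xs) + 0"), test)  # new hash: re-run
print(cache.runs)      # 2
```

This only stays sound when the result depends on nothing but the hashed code, which is exactly why it wouldn't help with database-backed tests.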

            1. 3

              “Chiusano said no one has really tried to build a programming language around this idea, though it appears OCaml stores MD5 hashes of the modules it depends on, to verify them at link time.”

              Indeed! OCaml does some pretty cool things with the way it engineers modules and their dependencies. I wrote a little about this: https://dev.to/yawaramin/ocaml-interface-files-hero-or-menace-2cib

              1. 2

                Great write-up. Using MD5 may be overkill. I’d have to think on it more. The best part, aside from type-checking during linking, is that the use of interfaces makes parallel builds easy. One of my biggest gripes about large software, especially in C++, is how builds take forever. Languages that give me fast builds make me much happier.