1. 16
  1. 7

    Stage 4: Composability is destroyed at program boundaries, therefore extend these boundaries outward, until all the computational resources of civilization are joined in a single planetary-scale computer… this is the idea of Unison and unison.cloud.

    I feel like a couple of reasoning steps were skipped between the assertion and the conclusion in this stage.

    1. 4

      That’s because this could be construed as clever marketing/advertising. :)

    2. 5

      For a long time I felt similarly with respect to Haskell being a poor fit for domains that require a lot of “I/O plumbing”. Certainly you lose a lot of the pleasantness given to you by an expressive static type system when most of your application logic consists of handling third-party input and performing simple transformations on said input.

      That said, languages like Haskell make it easy to write good abstractions around ugliness like that. If done right, it can be -very- easy to extend the input-handling functionality without making it feel like you’re playing the “read who knows what and get it into the form of an application-specific ADT, handling errors cleanly” game.

      Just a couple of days ago I added a feature to a Haskell project I hadn’t touched in years that allowed it to parse a completely new type of binary message over the wire. It was as easy as adding a value constructor to an ADT and updating some pattern matches that the compiler could detect were no longer exhaustive.
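
      A minimal sketch of that kind of change (Message and its constructors here are made-up names, not the actual project):

      -- The wire-message ADT, with the new constructor added:
      data Message
        = Heartbeat
        | Payload String
        | Ack Int            -- the newly added message type

      describe :: Message -> String
      describe Heartbeat   = "heartbeat"
      describe (Payload _) = "payload"
      describe (Ack _)     = "ack"   -- added after the compiler warned that
                                     -- this match was no longer exhaustive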

      1. 4

        The reason I’ll give is that Haskell’s otherwise excellent composability is destroyed at I/O boundaries, just as in every other language.

        Where we see Haskell (or more generally, typed FP) excel is for programs that have minimal surface area in contact with the outside world, but with a large amount of interesting computation happening internally.

        This is a very good point. Most type system researchers seem to work under the wrong assumption that the only thing worth statically verifying is the memory-safety of a single running process[0]. This is why we get ridiculous “expressiveness improvements” like type-level Peano arithmetic, HLists and the like, that allow you to enforce ever more sophisticated invariants - so long as your data structures don’t cross process boundaries.
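
        For instance, a minimal Haskell sketch of the kind of thing I mean, length-indexed vectors via type-level Peano naturals:

        {-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

        -- Type-level Peano naturals...
        data Nat = Z | S Nat

        -- ...used to index a vector by its length, so that taking the
        -- head of an empty vector becomes a compile-time error.
        data Vec (n :: Nat) a where
          Nil  :: Vec 'Z a
          Cons :: a -> Vec n a -> Vec ('S n) a

        safeHead :: Vec ('S n) a -> a
        safeHead (Cons x _) = x

        -- None of this survives serialization: the length index is erased
        -- the moment the data crosses a process boundary.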

        IMO, further research in this direction only offers diminishing returns. Far more promising results would be obtained (and probably for less effort) by using type technology to enforce that multiple processes, possibly running on different machines, exchange data safely. For example, as someone who plumbs data back and forth between database servers and GUI clients (read: writes CRUD apps) for a living, I would be delighted with a type system that checks whether the data model in my application code agrees with the data model in the database server. This is probably an utter triviality from the point of view of pure type system research; however, it would offer bigger productivity gains than yet another statically verified implementation of self-balancing search trees or whatever.

        [0] One notable exception is type providers in F# and Idris. Kudos to their designers!

        1. 9

          This is why we get ridiculous “expressiveness improvements” like type-level Peano arithmetic, HLists and the like, that allow you to enforce ever more sophisticated invariants - so long as your data structures don’t cross process boundaries.

          Far more promising results would be obtained (and probably for less effort)

          however, it would offer bigger productivity gains than yet another statically verified implementation of self-balancing search trees or whatever.

          I don’t know why you are bringing up type system expressiveness as if it is diametrically opposed to crossing process boundaries. Being able to enforce these properties on a single process is a prerequisite for being able to enforce them on multiple processes.

          type technology to enforce that multiple processes, possibly running on different machines, exchange data safely

          There is plenty of research in this area. I don’t have the time to prepare a survey, but Global abstraction-safe marshalling with hash types (ICFP 2003), its realization in a more complete PL, Acute (ICFP 2005), and Proof Carrying Code (POPL 1997) are good places to start.

          1. 6

            I don’t know why you are bringing up type system expressiveness as if it is diametrically opposed to crossing process boundaries.

            It isn’t. But all type systems have a complexity budget, and GADTs, type families and data kinds might not be the wisest way to spend it. The pros (mechanical verification of finer-grained specifications) have to be weighed against the cons (loss of global type inference, poor interaction with modularity).
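
            As a small illustration of the inference cost (a sketch, assuming GHC): the signature below cannot be inferred and has to be written by hand, because each branch’s type is refined by the constructor being matched.

            {-# LANGUAGE GADTs #-}

            data Expr a where
              IntLit  :: Int  -> Expr Int
              BoolLit :: Bool -> Expr Bool

            -- Without this annotation the definition is rejected:
            -- there is no principal type for GHC to infer.
            eval :: Expr a -> a
            eval (IntLit n)  = n
            eval (BoolLit b) = b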

            Acute (ICFP 2005)

            From a cursory read of the paper, it seems that Acute’s type system verifies the communication between Acute programs. However, in the scenario that I’m most interested in, it isn’t reasonable to rewrite an existing database server in a new programming language. I was expecting something more like a type checker that can connect to a database and verify that the client application’s data model agrees with the database schema.

            Proof Carrying Code (POPL 1997)

            From a cursory read of the paper, as well as what I remember from chapter 5 of ATTAPL, PCC is primarily geared towards verifying safety properties of low-level code. It seems like it would be useful primarily to systems programmers, not application programmers like me who take low-level safety for granted.

            1. 7

              It isn’t. But all type systems have a complexity budget, and GADTs, type families and data kinds might not be the wisest way to spend it. The pros (mechanical verification of finer-grained specifications) have to be weighed against the cons (loss of global type inference, poor interaction with modularity).

              I agree that one must balance expressiveness with type inference, but I think this is a different claim than you originally made. I’ll note that for GADTs at least, great progress is being made on type inference. My current research involves using a language with sum kinds and polymorphic recursion and I’ve been looking into making use of that GADT result.

              From a cursory read of the paper,

              Indeed, both Acute and PCC require buy-in on the PL side. You may be interested in the literature on contract systems, or the literature on gradual typing. Both of these subjects consider the case where you are in a mixed typed/untyped scenario, although neither considers distributed computing explicitly.

              For databases in particular, the literature on record systems might be interesting, since a major focus is being able to precisely model relations in the type system. There’s HList, which you (rightly, I think) denigrated, but outside the Haskell world there is good work being done on record systems. [Ur/Web](http://www.impredicative.com/ur/) in particular is a standout, although that language has limited type inference.

              1. 7

                I always have a hard time precisely articulating my real objections to GADTs and type families (which don’t apply to polymorphic recursion), but I’ll give it a try once again. It isn’t just that they complicate type inference.

                What I like the most about sums and products (rather than, say, unions and intersections as in Typed Racket and Ceylon) as datatype formers is that they are isomorphism-invariant, i.e., they avoid coupling the essence of what my program does to the particular encoding I choose for the data. As far as I can tell, reference cells are the only primitive type constructor in ML and Haskell that breaks isomorphism invariance, and that’s precisely what makes mutation a “hard effect” (unlike, say, exceptions and other algebraic effects, which are “soft”, i.e., syntactic sugar for pure computations).

                GADTs and type families emphasize physical type equality and break isomorphism invariance in exactly the same way as unions, intersections and reference cells: given a type function T and two isomorphic types Foo and Bar, you can’t tell whether T Foo and T Bar are isomorphic. This has several unfortunate consequences, of which the most damning IMO is that GADTs and type families don’t play nicely with abstract types. The former require “yes/no” answers to type equality questions. The latter require the ability to answer such questions with either “yes” or “I can neither confirm nor deny”. For example, consider the following OCaml program:

                module S = Set.Make(String)

                (* A GADT whose type index is fixed by each constructor. *)
                type _ foo =
                  | Int : int -> int foo
                  | Str : string -> string foo

                (* Same shape, except that one index, S.t, is an abstract type. *)
                type _ bar =
                  | One : string -> string bar
                  | Many : S.t -> S.t bar

                (* Exhaustive: int and string are known to be different types,
                   so the Str case is impossible at type int foo. *)
                let test_foo : int foo -> string = function
                  | Int _ -> "success"

                (* Flagged as non-exhaustive: S.t is abstract, so the checker
                   cannot rule out S.t = string, i.e. the One case. *)
                let test_bar : S.t bar -> string = function
                  | Many _ -> "success"

                The type checker correctly determines that test_foo’s match is exhaustive, because the types int and string are known to be different. On the other hand, the type checker can’t correctly determine that test_bar’s match is exhaustive, because S.t is an abstract type whose underlying representation might well be string. (We happen to know that this is not the case. But this requires global reasoning, and requiring whole-program analysis for type-checking purposes doesn’t scale.) This is a failure of abstraction.
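
                The same failure can be reproduced with Haskell type families; a minimal sketch (Foo and Bar are just illustrative names):

                {-# LANGUAGE TypeFamilies #-}

                -- Foo and Bar are isomorphic: both wrap an Int.
                newtype Foo = Foo Int
                newtype Bar = Bar Int

                -- Yet a type family can tell them apart, so T Foo and
                -- T Bar need not be isomorphic at all:
                type family T x where
                  T Foo = Int
                  T Bar = (Int, Int)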

                1. 3

                  Your thoughts mirror my own, although I would lay the blame on being able to answer type disjointness questions rather than on GADTs. GADTs sans type disjointness arise naturally in any language with higher kinds via an encoding of Leibniz equality. Haskell’s issue, then, is that it tries to get too much mileage out of the ⋆ kind: there are other kinds that have a meaningful notion of disjointness and play nicely with inference, such as sum kinds, integer kinds, name kinds, etc. bar can then be indexed by an integer kind representing length, or by the sum kind () + () with left as One and right as Many.
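
                  For concreteness, here is the Leibniz encoding as a Haskell sketch (the standard trick, nothing specific to my research):

                  {-# LANGUAGE RankNTypes #-}

                  -- a equals b iff a can be substituted for b in every type
                  -- context f; only higher kinds are required, and no type
                  -- disjointness question is ever asked.
                  newtype Equal a b = Equal { subst :: forall f. f a -> f b }

                  refl :: Equal a a
                  refl = Equal id

                  newtype Id a = Id { runId :: a }

                  -- GADT-style coercion along an equality proof.
                  cast :: Equal a b -> a -> b
                  cast eq = runId . subst eq . Id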

                  FWIW I would argue that test_foo should also flag an error. Type checking under the context Γ, int : ⋆, string : ⋆ doesn’t imply that int and string are disjoint types; for all we know, integers could be implemented as strings or vice versa. Of course we know that they are disjoint because they are hard-coded into the OCaml semantics, but from a type-theoretic perspective this is kind of a wart.

                  If this kind of thing interests you, you may be interested in what I’m currently working on. If you are I can give you a ping in a few months.

                  1. 2

                    I would lay the blame on being able to answer type disjointness questions rather than on GADTs.

                    Yep, exactly! If GADTs could be restricted to only case-analyze things that are meant to be case-analyzable (like booleans, integers, etc.), that would be perfectly fine with me. It’s case-analyzing the universe of types, which isn’t meant to be case-analyzable, that causes semantic trouble.

                    FWIW I would argue that test_foo should also flag an error.

                    Indeed. Depending on integers and strings being disjoint types is evil, for exactly the same reasons as mentioned in my previous comment.

                    If this kind of thing interests you, you may be interested in what I’m currently working on. If you are I can give you a ping in a few months.

                    I’m interested. Sign me up for updates!

          2. 3

            For example, as someone who plumbs data back and forth between database servers and GUI clients (read: writes CRUD apps) for a living, I would be delighted with a type system that checks whether the data model in my application code agrees with the data model in the database server.

            Ur/Web is an interesting experiment in that direction.

            1. 4

              Yep, exactly! That’s the direction I want things to go. However, I’m of the opinion that the database must be the ultimate source of truth, and client applications must adjust themselves to it. That’s why I asked for a type-checker that connects itself to the database and validates client application code against the real database schema. Ur/Web takes the opposite view: it generates DDL scripts so that you can adjust your database schema to your application’s data model.

              1. 4

                a type-checker that connects itself to the database and validates client application code against the real database schema

                Is this the kind of thing that type providers in F# are for?

                1. 3

                  If I understand correctly, type providers work differently from (and quite possibly better than) what I literally asked for. Rather than validating that your application’s data model agrees with the database schema, a type provider “imports” the database schema as a collection of types that your application can use, so that you never need to manually define the data model in your code in the first place.

                  1. 3

                    I think of type providers as just a simpler way to define your data structure and parsing code. E.g., instead of defining a type and parsing code for some JSON, just give the compiler the JSON itself and it’ll do it for you. But if at run time the JSON comes in the wrong shape…
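
                    For comparison, here is roughly the boilerplate a JSON type provider would generate for you, written by hand with aeson (the User shape is made up):

                    {-# LANGUAGE DeriveGeneric #-}

                    import Data.Aeson (FromJSON, decode)
                    import GHC.Generics (Generic)
                    import qualified Data.ByteString.Lazy.Char8 as BL

                    -- A type provider would derive the type and its parser from
                    -- a sample document; here we write them ourselves.
                    data User = User { name :: String, age :: Int }
                      deriving (Show, Generic)

                    instance FromJSON User

                    main :: IO ()
                    main =
                      -- Right shape: Just (User {name = "Ada", age = 36}).
                      -- Wrong shape at run time: Nothing, provider or not.
                      print (decode (BL.pack "{\"name\":\"Ada\",\"age\":36}") :: Maybe User)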

            2. 5

              I love Haskell, but it’s a difficult language to choose for any practical problem. I don’t think it’s particularly bad at the boundary; I’d love to have Aeson in my daily work. Instead, the most pressing problem is simply that for most other programmers, it represents something so different from the “allocate and munge” paradigm they’re used to that they categorically reject it. That matters! I don’t want to be the only person who can work on anything, and asking people to understand types and mathematical reasoning is not feasible given my experience with the median programmer.

              1. 3

                That matters! I don’t want to be the only person who can work on anything, and asking people to understand types and mathematical reasoning is not feasible given my experience with the median programmer.

                I don’t know how you can work on non-trivial programming problems without being able to understand mathematical reasoning and types. Maybe you mean “the Haskell model of programming”, which is nowhere close to being identical to “mathematical reasoning”.

                1. 3

                  No, you misunderstand me.

                  Most programmers learn to reason about things in a very step-by-step manner. “I do this, and then this, and then this.” They think in mutations. That’s the classic “procedural” style, and it’s certainly how I learned. Many people are not comfortable moving past that. Thinking mathematically, to my mind, means that you can identify abstract traits that pertain to classes of data: understanding why those traits are shared, and how to extend those shared traits to new classes. Interfaces, generics, and (yes) Haskell typeclasses are good examples.
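
                  As a toy illustration of that kind of shared trait (Serialize and Point are made-up names):

                  -- A typeclass names an abstract trait shared by otherwise
                  -- unrelated types, rather than a step-by-step procedure.
                  class Serialize a where
                    serialize :: a -> String

                  data Point = Point Int Int

                  instance Serialize Point where
                    serialize (Point x y) = show (x, y)

                  -- Code written against the trait extends automatically to
                  -- every type that gains an instance later.
                  dump :: Serialize a => [a] -> [String]
                  dump = map serialize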

                  Most developers I’ve interacted with have a hard time really understanding these concepts and putting them into use (or even understanding why they are useful and valuable).

                  1. 2

                    I don’t see the connection between “procedural” versus “functional” and the ability to identify abstract traits or to think mathematically. Certainly Knuth’s The Art of Computer Programming, which is illustrated with examples in a particularly ugly procedural assembly language, is full of examples of abstract mathematical thinking. And some of the Haskell examples I have been pointed to here are pretty damn pedestrian. Functional programming languages may have virtues, but people have been thinking mathematically about programming problems and finding and exploiting abstract properties for a long time. That’s not some innovation of functional programming or something that automatically follows from using a functional programming language.

                    1. 2

                      Are you intentionally misunderstanding me?

                      It is certainly possible to reason about these things in imperative languages. I have not claimed that it isn’t. However, that is not part of the learning process of the “median programmer.” The distinction between functional and procedural languages is that more abstractions are idiomatic in functional languages, and effectively using Haskell and its descendants is pretty much impossible without making the effort to learn how to think about what an abstraction means. The same cannot be said of most imperative languages.

                      1. 2

                        Not intentionally. I think I get it now. You are saying that the mathematical sophistication required to use functional languages is beyond most programmers. That seems reasonable. My point is that good programming requires a certain level of mathematical sophistication, or at least the ability to think abstractly, independently of the programming language, and I’m dubious that the average Haskell programmer is significantly better at those things than, say, the average VHDL programmer.

                        1. 2

                          Precisely. That informs my choice of language because, while I feel comfortable with and would like to use Haskell in some situations, I am also aware that many programmers who may be called upon to extend or modify my code will not be. Even in languages like Java, I’ve seen programmers struggle with simple uses of generics and interfaces. The difference is that in Java, you can essentially create a pattern that makes sense and say “write your procedural, data-munging code in this function,” whereas in Haskell it’s a lot harder to write a sophisticated shell around a creamy filling of 1973-style code.

                          Of course, in Java, it’s also a lot easier to have your abstractions utterly destroyed because someone decided that they’d just write the networking layer in a method on the CreditCardPayment class.

                2. [Comment removed by author]

                  1. 2

                    My experience is that the median programmer can be successful in continuing to evolve an application in the context of an established framework: adding features or tweaking existing ones, fixing bugs that are pointed out for them, and so on. The difficulty comes in understanding the logic behind functional or modular decomposition, or even well-factored object orientation.