1. 2

    Are there plans for a first-class Mix compiler? I see tooling commits that have been sitting at the top of the changelog for a while. I’d love to use Gleam inside my Elixir umbrella apps!

    1. 3

      The Gleam compile-package API is perfect for use by Mix etc., and it has been out for some time! Mostly we need someone to make a Mix plugin that uses it. I’m focusing on the Gleam tooling and I’m not an Elixir user at present, so I won’t be doing it myself any time soon.

    1. 2

      It’s interesting to see the resistance from the Go community itself against moving to v2. Perhaps this is an indication that explicit major versions (after v1) in the module system were a mistake.

      1. 3

        Maybe I’m a bit crazy but I feel like if you’re programming Elixir with message passing in mind, then you’re doing something wrong. 99% of your code should be purely functional and you should not be thinking about message passing (which is fundamentally stateful/effectful). Sure, it’s great to keep in mind that it’s happening behind the scenes, and that grokking the mechanism can give you confidence about the robustness of your code, but I do not architect the bulk of my code around message passing, except if I’m digging really deep and writing e.g. a low-level TCP protocol implementation (which I almost never am).

        Is it different in the erlang community?

        1. 8

          Elixir and Erlang do not have a “pure FP” philosophy. Evaluation is eager, typing is dynamic, and side effects are basically never pushed into a monad in practice. Some of the core libraries even proudly expose global/cluster-wide state.

          The parts of FP that made it in (first-class functions, immutable data structures, certainly some other aspects I am missing) are there because they are useful for building systems that are resilient, debuggable, repairable, upgradable, and evolvable with ~zero downtime.

          That is the primary goal of Erlang, and Elixir inherits this legacy. It’s a goal, not a philosophy, so you may find competing ideas next to each other, because they’ve been shown to work well together in this context.

          1. 7

            total nitpick, but pure FP requires neither static typing nor laziness.

            1. 2

              Also, effects in FP languages can be, and usually are, modeled using process calculi, which are exactly what Erlang offers!

              That being said, Erlang also has side effects apart from message passing.

            2. 2

              I never claimed it is pure FP. The VM is nice in that it gives you places to breach FP purity in exactly the right spots, and makes it very hard or ugly to breach purity where doing so is dangerous.

              1. 4

                My mistake, I thought you were surprised at message passing from a pure-FP point of view.

                Another reason to think about message passing, and more broadly genservers/processes, is that they can become bottlenecks if used carelessly. People talk about genservers as a building block of concurrency, which isn’t false, but from another point of view they are Erlang’s units of single-threaded computation. They process only one message at a time, which is a feature if you know how and when to use it, but a drawback at other times. Effective Elixir or Erlang development must keep the overall pattern of message passing in mind, largely to avoid this issue (and, in highly performance-sensitive cases, to avoid the overhead of message passing itself).

                1. 1

                  Love reminding people that genservers are a unit of single-threadedness.

                  1. 1

                    I still can’t find a good word for it! https://twitter.com/gamache/status/1390326847662137355

            3. 6

              I can only offer my own experience (5 years of Erlang programming professionally).

              Message passing, like FP, is a tool that can be (mis)used. Some of the best uses of multiple processes or message passing that I see in code are:

              • Enforcing a sequence on unordered events, or enforcing the single writer principle.
              • Bounded concurrency. Worker pools, queues, and more.
              • Implementing protocols. Protocols for passing messages between processes serve to standardize and abstract. Suppose you have a service on TCP that you want to extend to work over WebSockets. The well-architected solution for this has three kinds of processes: one process that receives Erlang terms, and two processes that receive data over some transport (TCP, WebSockets, etc.) and send Erlang terms. Structuring Erlang code in this way is an amazing aid in keeping code simple and organized.

              I’ll generally come across problems that are solved by processes/message passing when writing libraries. When writing application code that uses those libraries, it’s usually far less common.

              1. 4

                my advice is typically:

                • are you writing a library? You probably don’t need a genserver (except for odd things like “I need a ‘fire and forget’ genserver to wrap an ETS table”… well, yeah).
                • ok, so you still think you need a genserver? Did you try Task? (This is Elixir-specific.)
                • are you wrapping a stateful communication protocol? Then go ahead and use a genserver.
                • are you creating a ‘smart cache’ for something IRL or external to the VM? Then go ahead and use a genserver.
                • are you creating temporary shared state between users (like a chat room)? Then go ahead and use a genserver.

                I like the bounded concurrency one. Should probably add it to my list: are you creating a rate limiter or resource pool? Then use a genserver.

                1. 3

                  There is nothing wrong with using gen_server in a library. The thing is that in most cases it is not you who should start and manage that process; leave it up to the user. The “problem” is that there are 3 “kinds” of projects in the Erlang world:

                  • “libraries”, which should not start their own supervisor in general and should leave all process management up to the end user
                  • “library applications”, which should be mostly self-contained and are mostly independent from the main application, for example OpenTelemetry, systemd integration, Telemetry Poller, etc.
                  • “end-user applications”: your application, where you handle all the data and processes on your own

                  Each of these has different needs, and message passing will be used more or less depending on them.

                  1. 1

                    150% this. Dave Thomas got this trichotomy right in his EMPEX talk; it’s just infuriating to me that his choice of “names” for these things is unnecessarily confusing.

                    1. 1

                      Sorry, I mistyped… It should be “are you writing a library? If not, you should probably not be writing a genserver.”

                2. 2

                  In my experience (having done Elixir for 8 years), folks that don’t understand/think about messages and genservers in Elixir are at a severe disadvantage when debugging and grokking why their code is behaving some way.

                  It’s a lot like people who in the previous generation of webdevs learned Rails but not Ruby…which is fitting, since those are the folks driving adoption of Phoenix.

                  (There are also certainly people who reach for a genserver for everything, and that’s a whole additional annoying thing. Also the people who use Tasks when they really, really shouldn’t.)

                  1. 1

                    I’ve not used Elixir, but it sounds from your description as if it has some abstractions that give you Erlang (technically Beam, I guess) processes but abstract away the message-passing details? The last time I wrote a non-trivial amount of Erlang code was about 15 years ago, but at least back then Erlang didn’t have anything like this. If you wanted concurrency, you used processes and message passing. If you didn’t want concurrency, there were several functional programming languages with much faster sequential execution than Erlang, so you’d probably chosen the wrong tool.

                    1. 1

                      Processes are the core feature that provides both error isolation and fault tolerance. As you build more robust systems in Elixir, the number of times you use processes increases.

                      Often it is correct to abstract this away behind a module or API, but you’re still passing messages.

                    1. 3

                      A tangent, not to detract from the article: this reminded me that I am really interested in examples of using such things in production with positive effects.

                      I understand what monoids are and how to use them. I’ve literally never seen: “Look, we turned this into a monoid and as a result we uncovered bugs, had fewer bugs, solved another issue, made it easier to implement X, …”. If it were in any language other than Haskell, that would be even more awesome.

                      I need compelling examples as nucleation points and without those I just don’t ‘get’ things. You can call it a failure of imagination if you like: that doesn’t change that actually using category theory in my coding practices isn’t something I understand how to do, but I feel is something I could learn with the right nudge.

                      1. 5

                        You’re asking for soup from a stone, in some sense; category theory doesn’t give us libraries, it gives us an understanding of innate structure.

                        To give two examples from capability theory, let us consider an object graph in a heap. An object will merely be a collection of references to other objects, allowing duplicates and cycles. Also, some objects may be the same even though they have multiple copies on the heap, and some objects may be transparent forwarders who hold a single reference and should be considered the same as their referent. The first example is that, with a bit of reflection, this is all of the data of a hypergraph with path equivalence, and hypergraphs with those equivalences are categories! Every capability object graph has an underlying category, and if we can evolve the graph over time, then this gives us a sequence of functors.

                        The second example comes when we want to imagine that we have multiple objects in multiple heaps, and we want to send messages from one heap to another. We may want an operation, traditionally called join, which can take two references, assert that they refer to the same object, and unify them into one single reference. Note that, in any heap, we should not just be able to join references, but also create fresh new objects which, by definition, are (trivially) the same as themselves. This sounds like a monoid. Also, CRDTs are commutative monoids. This leads to an application of the Eckmann–Hilton argument which suggests that we can find high-level CRDT effects within existing inter-heap effects.

                        1. 1

                          hypergraphs with those equivalences are categories

                          And so you know that you can perform certain operations on them that conserve certain properties. So that’s good to know, but now what? You can define mempty and mappend functions and get rid of a bit of boilerplate in operations on the hypergraphs, but it seems to me the interesting thing is realizing they are a category in the first place and knowing what law-conserving operations you can perform. The consequences of that on how you implement your operations seem relatively trivial to me. Actually casting it into a Monoid and implementing the mempty and mappend functions doesn’t seem to be the important/necessary thing here?

                        2. 4

                          Consider the kinds of algorithms you write where you start with a collection of items and you want to perform some mapping on each item and then combine the results. For example, you want to add one to each integer and sum them, or convert integers to strings and concatenate them, or map integers to lists of integers and then flatten the resulting list.

                          Each of these examples can be implemented using the function foldMap in Haskell. Let’s look at the signature:

                          λ> :t foldMap
                          foldMap :: (Foldable t, Monoid m) => (a -> m) -> t a -> m
                          

                          foldMap asks for a function from any type a to some type m which is a Monoid. It also asks for some Foldable of that type a (we’ll just use lists). It returns a value of type m by applying your mapping function to each element and then “combining” the results. How does it know how to combine them? The Monoid instance for each type tells it how. Let’s try it out.

                          λ> foldMap show [1, 2, 3, 4]
                          "1234"
                          

                          Here we convert an integer to a string, and then concatenate the results to produce a single string.

                          λ> foldMap (\i -> [i, i + 1]) [1, 2, 3, 4]
                          [1,2,2,3,3,4,4,5]
                          

                          Now we effectively do a “flat map”, producing a list per initial item and then concatenating all of those lists together.

                          λ> import Data.Monoid (Sum (..))             
                          λ> foldMap Sum [1, 2, 3, 4]
                          Sum {getSum = 10}
                          

                          This time, we sum up the numbers in a list. Note that we had to explicitly wrap our integers in a new type Sum. Why? Because summing of integers is only one of several valid Monoid instances for integers. Here’s another:

                          λ> import Data.Monoid (Product (..))
                          λ> foldMap Product [1, 2, 3, 4]
                          Product {getProduct = 24}
                          

                          The power of the Monoid abstraction has allowed us to implement all these seemingly different algorithms with a single function, foldMap. This also means that any new data type I create that has a valid Monoid instance will be able to take advantage of this function. I hope this helps!
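
                          For instance, here’s what that looks like with a made-up Range type (my own toy example, not from the thread) that tracks the smallest and largest values seen; once it has a lawful Semigroup/Monoid instance, foldMap works on it for free:

                          -- A made-up type for illustration: tracks the min and max seen so far.
                          data Range = Empty | Range Int Int deriving Show

                          instance Semigroup Range where
                            Empty <> r = r
                            r <> Empty = r
                            Range lo1 hi1 <> Range lo2 hi2 = Range (min lo1 lo2) (max hi1 hi2)

                          instance Monoid Range where
                            mempty = Empty

                          λ> foldMap (\x -> Range x x) [3, 1, 4, 1, 5]
                          Range 1 5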

                          1. 1

                            This also means that any new data type I create that has a valid Monoid instance will be able to take advantage of this function. I hope this helps!

                            Thank you for your effort, but unfortunately it doesn’t. Your comment explains the abstraction and what you can achieve with it. What I’m trying to understand is when this occurs in practice.

                            As far as I am aware I don’t regularly find myself foldmapping collections of values. I don’t find myself applying the same function to different datastructures. I don’t find myself applying different functions to the same datastructure, where those functions contain boilerplate or other abstractable details. But I may be wrong and just not recognizing it.

                            Say you’re building Lobsters, Shopify, Stripe, Gmail, … Where would you expect to encounter a use case for defining a Monoid? How does doing that make the code better?

                            1. 2

                              Say you’re building Lobsters, Shopify, Stripe, Gmail, … Where would you expect to encounter a use case for defining a Monoid? How does doing that make the code better?

                              I think this same question could justifiably be asked about patterns like map and fold. The challenge in directly answering the question is that the scope of the Monoid (or map, or fold) pattern is quite granular, while the problem being solved is stated quite broadly. Having Monoid as one of your foundational tools changes the way you construct software to solve broad problems, but it is hard to say “in designing Lobsters, you’d use a Monoid here” just as it is hard to say “in designing Lobsters, you’d use a fold here”. Being specific about use cases for a fine-grained pattern in a very broad (and possibly vague) problem space is tricky. I can’t tell you exactly where you’d use a fold in building Lobsters, but I know I wouldn’t want to build it without the capabilities fold provides.

                              1. 1

                                The difference I see is that I, and I assume most others, have a good feel for when to use them, because I use them all the time, without thinking about monoids, categories, laws, etc. You just don’t (need to) identify and turn datastructures (explicitly) into monoids to use them and the benefit of doing so seems negligible in most cases.

                                Would it be wrong to say that these abstractions are useful if you find yourself applying the same function to multiple datastructures or applying a bunch of different functions with ‘reusable’ implementation details to the same datastructure, in addition to it being useful to recognize a datastructure as forming a certain category, meaning it satisfies certain conservation laws under certain operations, but that that is all there is to it?

                          2. 3
                            • If you know your operation is monoidal, you can distribute it across multiple nodes and combine the results with their neighbours as they arrive.

                            • If you know your operation is commutative, you can distribute across multiple nodes and combine the results as soon as they come in.

                            • If you know your operation forms a group, you can recover from mistakes without restarting from scratch (by applying enough inverses that you correct your mistake).

                            You can write all of these abstractions in terms of a monoid/abelian monoid/group interface and let the caller choose the precise structure that he or she wants.

                            An extremely useful example is key-value maps. Haskell’s monoidal-containers library provides Monoid instances for maps if the values have a Monoid instance:

                            instance (Ord k, Semigroup m) => Semigroup (MonoidalMap k m) where
                              MonoidalMap p <> MonoidalMap q = MonoidalMap (Map.unionWith (<>) p q)
                            instance (Ord k, Monoid m) => Monoid (MonoidalMap k m) where
                                mempty = MonoidalMap Map.empty
                            

                            From this, we can recover a huge array of useful map-combining operations:

                            • Keep the first: MonoidalMap k (First a)
                              • Keep the last, so that the left map acts as defaults: MonoidalMap k (Last a)
                            • Build a table of running counts, and sum them together: MonoidalMap k (Sum Int)

                            I imagine there’d be a way of expressing this polymorphism in imperative languages.
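
                              To make the running-counts bullet concrete, here’s a small usage sketch (assuming monoidal-containers’ Data.Map.Monoidal module for singleton and toList):

                              import qualified Data.Map.Monoidal as MMap
                              import Data.Monoid (Sum (..))

                              -- One singleton map per word; the MonoidalMap Monoid instance
                              -- sums the counts whenever two maps share a key.
                              wordCounts :: [String] -> MMap.MonoidalMap String (Sum Int)
                              wordCounts = foldMap (\w -> MMap.singleton w (Sum 1))

                              λ> MMap.toList (wordCounts (words "a b a c a b"))
                              [("a",Sum {getSum = 3}),("b",Sum {getSum = 2}),("c",Sum {getSum = 1})]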

                            1. 2

                              I feel a similar way. I’d like to apply category theory to produce better code but I don’t really get how to do it in general. Should I try to turn my abstractions into categories, and how does that result in better code? Does this question even make sense?

                              1. 1

                                There was this really awesome article about the benefits of maths that I would have linked as a response to your questions, but I couldn’t find it. The point was that maths is not so much about specific improvements of your practice, as it is about broadening your perception, so the benefits come secondary and are not directly related to your knowledge i.e. you might be already using what you learned without knowing it.

                                I have this joke that mathematics is a scam, and that it only works because people get smart while studying it.

                                1. 1

                                  Now assume someone has already studied it and has already gotten pretty smart, but doesn’t see how turning any of the datastructures they encounter in their working life into e.g. a monoid is going to help them.

                              2. 2

                                (Starting a new comment because I think the discussion will be different)

                                  Another interesting thing about Monoids is that the product of two monoids is also a monoid. This means that producing statistics over some data set can be broken into: split the data set into chunks; compute all the statistics you want on each chunk, say, the average of column A, the word count across column B, the hash of the values in column C; and then combine those results from all chunks trivially.

                                calculateStats :: 
                                  Document [Col "A" Double, Col "B" Text, Col "C" Base64Binary] -- made up, just to give some imaginary CSV a type
                                  -> (Average, WordCount, MonoidalHash MD5)
                                

                                  with Average, WordCount and MonoidalHash MD5 each having a Monoid instance themselves, the type (Average, WordCount, MonoidalHash MD5) is also a monoid. We can now take our thousands of CSVs, compute these statistics individually across a cluster of machines, and then combine them. (I’m not sure the hash of all Base64-encoded strings in some CSVs is particularly useful, but you can do it; perhaps a bloom filter would be a better example, since they are also monoidal, combining via OR.)
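
                                  Here’s a runnable toy version of that idea, with Sum standing in for the fancier statistics (Average etc.) so it stays self-contained:

                                  import Data.Monoid (Sum (..))

                                  -- Per-chunk statistics: (sum of column A, word count over column B).
                                  -- Each component is a Monoid, so the pair is automatically one too.
                                  stats :: [(Double, String)] -> (Sum Double, Sum Int)
                                  stats = foldMap (\(a, b) -> (Sum a, Sum (length (words b))))

                                  -- Combining per-chunk results equals computing over all the data,
                                  -- so the chunks could have been processed on different machines:
                                  λ> stats [(1.0, "hello world"), (2.0, "foo")] <> stats [(3.0, "bar baz qux")]
                                  (Sum {getSum = 6.0},Sum {getSum = 6})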

                                  1. 1

                                      One particularly interesting use case is parsing UTF-8 text in parallel: you can take an arbitrarily large byte string, split it into chunks, and parse each chunk into a monoidal type which represents a) whether that chunk contains any invalid bytes, b) whether the beginning of the chunk appears to be the tail of a split code point, and c) whether the end of the chunk appears to split a code point. If you now have chunk parsing results A and B, the monoidal operation will look at these to see:

                                      a) Are both chunks valid UTF-8? If so, we can say that the combination of both chunks is also valid.

                                      b) Do the end of A and the beginning of B say they think this is a split code point? If so, we try to combine the fragments and produce a valid combined parse for both chunks; otherwise we produce an invalid parse.

                                      c) Does either chunk say it contains an invalid parse? Then the combination is also an invalid parse (we could collect the locations of these if you wanted).

                                      Now imagine you have 20 TB of what you think is UTF-8 text: you can distribute arbitrary chunks among many machines, each of which can split its chunk into multiple fragments, and we can combine all the results together as long as we keep track of which chunk each result comes from.

                                      This same idea can be applied to many real-world applications (though some end up being extremely difficult or the state machine becomes insane; IIRC CSV is very difficult to do this with because it’s hard to tell whether you’re inside a string or not). I believe JSON parsing may also be amenable to this.
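
                                      The full UTF-8 state machine is a bit long to sketch here, but the same shape of idea fits in a few lines if balanced parentheses stand in for split code points (my substitution, not the actual UTF-8 design): each chunk reduces independently to a small summary, and summaries combine associatively, so the work can be split arbitrarily.

                                      import Data.List (foldl')

                                      -- A chunk summary: closers unmatched on the left, openers unmatched
                                      -- on the right. Anything that could match inside the chunk already has.
                                      data Balance = Balance Int Int deriving (Eq, Show)

                                      instance Semigroup Balance where
                                        Balance c1 o1 <> Balance c2 o2 =
                                          Balance (c1 + max 0 (c2 - o1)) (o2 + max 0 (o1 - c2))

                                      instance Monoid Balance where
                                        mempty = Balance 0 0

                                      -- Summarize one chunk on its own (in parallel, on any machine).
                                      parseChunk :: String -> Balance
                                      parseChunk = foldl' step mempty
                                        where
                                          step b '(' = b <> Balance 0 1
                                          step b ')' = b <> Balance 1 0
                                          step b _   = b

                                      -- The whole input is balanced iff the combined summary is mempty.
                                      balanced :: [String] -> Bool
                                      balanced chunks = mconcat (map parseChunk chunks) == mempty

                                      λ> balanced ["((foo)", "(bar))"]
                                      True
                                      λ> balanced ["((", ")"]
                                      False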

                                  1. 4

                                    So…. is this different from structural typing / static duck typing?

                                    1. 12

                                      Good question. I guess it adds compile-time quack-checking?

                                      1. 9

                                        structural typing

                                        You could have structurally typed records without row polymorphism. You might say that { x: int, y: float } is the same type as { y: float, x: int }, but it’s not the same type as { x: int, y: float, z: string }.

                                        That’s how SML works. You can have two different record types with some fields in common:

                                          val a = { x = 1, y = 2.0 };
                                          val b = { x = 3, y = 4.0, z = "hello" };
                                        

                                        and then you can use the same field selector on both:

                                          val ax = #x a;
                                          val bx = #x b;
                                        

                                        But you can’t take #x and package it up into a function, because the type system has no concept of “any record type with an x field”:

                                          fun getX r = #x r;  (* Error: unresolved flex record (can't tell what fields there are besides #x) *)
                                        

                                        static duck typing

                                        Are you thinking of Go interfaces, or C++ templates? (Or something else? :D) I think either one could plausibly be called compile-time duck typing.

                                        In Go you can use interfaces to have a function accept any struct that meets some requirements. But I don’t think you could express a function like this:

                                        // addColor has type { ...fields } -> { color: string, ...fields }
                                        let addColor (c : string) r = { color = c, ...r };
                                        let bluePoint2d = addColor "blue" { x = 1, y = 2 };
                                        let redPoint3d = addColor "red" { x = 3, y = 4, z = 5 };
                                        

                                        addColor can add a “color” field to any record type, without knowing which concrete type it was given. But the caller knows which record type it’s using, so it doesn’t need to cast the result.

                                        In C++ you can write getX as a templated function:

                                          template<class T>
                                          auto getX(T r) { return r.x; }
                                        

                                        But now the compiler doesn’t typecheck the definition of getX: it just waits for you to use it, and typechecks each use. That’s like duck typing (just pass it in, and see if it works), but at compile time. Just like with run-time duck typing, the problem is you can’t look at the signature of getX and understand what its requirements are.

                                        1. 1

                                          Are you thinking of Go interfaces, or C++ templates?

                                          Mostly TypeScript actually :) I’m not sure where the line is because TS calls what they do “structural typing” but I’m having a hard time seeing where the differences lie.

                                          1. 1

                                            Type inference for SML code using field selectors is hilariously bad, precisely because the natural and obvious principal types (having row variables in them) are inexpressible in SML.

                                            Out of curiosity, how would you expect your addColor function to work on records that already contain a color field? The purist’s answer would be to use something like Ur/Web’s disjointness witnesses, but, gawd, the syntax is super awful.

                                            1. 2

                                              Here’s an example in purescript that will only add color to records that don’t already have a color key:

                                              module Rows where
                                              
                                              import Prelude
                                              import Effect (Effect)
                                              import Effect.Class.Console as Console
                                              import Prim.Row (class Lacks)
                                              import Record as Record
                                              
                                              addColor ::
                                                forall r.
                                                Lacks "color" r =>
                                                String ->
                                                { | r } ->
                                                { color :: String | r }
                                              addColor color record = Record.union { color } record
                                              
                                              main :: Effect Unit
                                              main = do
                                                Console.logShow $ addColor "red" { foo: 1 }
                                               {- this won't compile
                                                  Console.logShow $ addColor "red" { color: "blue" }
                                               -}
                                              
                                              1. 1

                                                Fascinating. How is this Lacks type class implemented? Compiler magic?

                                                1. 2

                                                  It’s part of the Prim.Row builtin.

                                          2. 4

                                              It’s a kind of structural typing, so I suppose it’s a kind of static duck typing.

                                              In general, with all of these, the more flexible the system, the harder it is to work with and the less capable the inference is. Row typing is one way to weaken general structural subtyping by limiting the subtyping relationship to that which can be generated by the record/row type. So row types are a bit easier to infer.

                                            1. 4

                                              I wrote a post a long time ago about differences between row polymorphism and (structural) subtyping: https://brianmckenna.org/blog/row_polymorphism_isnt_subtyping

                                              1. 1

                                                Oh I just realised my post was mentioned at the bottom of the posted one. Awesome!

                                            1. 9

                                              This is funny because I’ve always thought the lack of Turing-completeness is a non-issue. That is, it’s fine (and often necessary) for a config language to be Turing complete. (This doesn’t mean it needs global variables or mutation, etc.)

                                              The real issue is side effects (I/O), which is not technically related to Turing-completeness. Configurations should be meaningfully versioned (e.g. in git) and side effects destroy reasoning about them (i.e. reading files outside of version control).

                                              So the author of Dhall actually agrees, but he promotes “non-Turing complete” anyway because it’s a “signaling mechanism”? Honestly that feels a bit like talking down to your audience.

                                              1. 7

                                                Totality and lack of side effects are not the same. A config language that lacks side effects but isn’t total could stall forever (think recursion).

                                                Dhall does have blessed side effects (reading from paths) but is total.

                                                1. 6

                                                  It sounds like you’re just restating the point my comment made and that the post made …

                                                    If the side effects are controlled, that’s a good middle ground. That part makes sense. What I find silly is that they’re advertising features that are non-issues because it’s a “signaling mechanism”, in their words.

                                                  1. 3

                                                    I think you might be misunderstanding what Gabriel intended by “signalling mechanism”. Camp A want to ensure their configuration is safe. They’d like the benefits of camp B that you get from Turing completeness, but without sacrificing the safety side of things. The “signalling mechanism” is a way to show camp A that they can get practically all the benefits of camp B while maintaining safety. This is quite the opposite of talking down to the audience.

                                                    I’m in camp A for the most part for a good reason. I used to work for a company that did online survey software, and there was a language that the software used for describing the surveys. They could get complicated, almost to the point of requiring a real programming language, and we needed a way to ensure that (a) the processing of any such surveys could never end up hogging resources on our servers and (b) that anybody filling out the survey wouldn’t end up trapped in an unending loop of questions. We had to treat our customers like hostile agents (even if unintentionally so) and anything they provided to us as inherently dangerous.

                                                      The solution was to guarantee that the language was primitive recursive by enforcing limits on loops and ensuring that the question graph had no cycles. That ensured that the configuration language we provided to our customers was safe in a similar, albeit more limited, way to Dhall.

                                                    Turing completeness is great when you can trust whoever is writing the code, but avoiding it means that you avoid lots of potentially complicated, error-prone sandboxing mechanisms you would need to introduce if you’re accepting it from parties you can’t necessarily trust.

                                                    1. 2

                                                      we needed a way to ensure that (a) the processing of any such surveys could never end up hogging resources on our servers

                                                      The article claims that you can have Dhall functions which take longer than the age of the universe to complete, so it doesn’t really seem like this concern is addressed, even though it’s not technically infinite recursion.

                                                      1. 1

                                                          Yes, it does, because Dhall allows total functions, whereas my emphasis was on ensuring everything was primitive recursive. I was using the project I worked on as a demonstration of why avoiding Turing completeness is a useful property, not specifically of the route Dhall itself takes.

                                                        We both were looking for different things: I was taking data from untrusted sources, so I needed to be able to enforce strict bounds, whereas Dhall loosens things up a bit while avoiding general recursive functions.

                                                          Dhall only allows total functional programming, which means that it guarantees that any function described in it will terminate. This substantially narrows down the possibility of something going wrong algorithmically; the blast radius for total functional programs is much smaller than that for general recursive ones.

                                                      2. 1

                                                        I’m not following. From the post:

                                                        The absence of Turing completeness per se does not provide many safety guarantees

                                                        If you want to ensure that evaluating configs from untrusted sources doesn’t take up server resources, that’s a valid goal.

                                                        Using a non-Turing-complete language doesn’t achieve that goal. Dhall doesn’t achieve that goal. He provides the Ackermann function to show that.

                                                        If you want to disagree that it’s “talking down”, then fine, we can agree to disagree. But I have been confused by Dhall for a long time precisely because of the messaging as “non-Turing complete”.

                                                        Some context (requires login):

                                                        https://oilshell.zulipchat.com/#narrow/stream/121540-oil-discuss/topic/Config.20Features.20(was.3A.20Small.20vs.2E.20Big.20Languages)

                                                        1. 1

                                                          Using a non-Turing-complete language doesn’t achieve that goal.

                                                          It can do, but you need stricter guarantees than Dhall gives, such as only allowing primitive recursion, whereas Dhall allows total functions.

                                                          Dhall doesn’t achieve that goal. He provides the Ackermann function to show that.

                                                              Yeah, because unlike my example, Dhall is trying to give you most of the power of a general Turing-complete language while reducing the blast radius. As Dhall allows only total functions, any function declared in Dhall will provably complete; it may take a long time, but it will at least complete.

                                                          1. 1

                                                            Quoting myself from 2 comments above:

                                                            It sounds like you’re just restating the point my comment made and that the post made

                                                    2. 2

                                                      Is infinite recursion an actual concern that happens to people enough to affect their language choice?

                                                      Not only has it never happened to me, I have never once even heard of it happening. Beyond that, if it ever did happen, it would be really obvious that it happened, and the stack trace would lead you directly to the function causing the recursion. (Unlike, say, a null reference, which can easily have the cause and effect separated by an immense distance, making debugging difficult.) So it seems like a very strange thing to design a whole language around avoiding.

                                                      1. 2

                                                        Right exactly – not only is the messaging focused on an irrelevant thing, but the language design is too! It would almost make more sense if the messaging were the only odd thing.

                                                        I’ve made a similar observation about statically typed config languages. Configs generally have a very limited space of inputs, a common one is:

                                                        (dev, staging, prod) x (foo, bar, baz)

                                                        where foo, bar and baz are data centers.

                                                        So if you want to make sure your config doesn’t crash, you could:

                                                        1. Design/implement a type system to prove something about all possible inputs
                                                        2. Simply evaluate the config on all possible inputs, which is 9 inputs here (a sketch follows below).

                                                        The config should take about 1 ms to evaluate, and so maybe your tests take 9 ms …

                                                        This is a toy example but it’s closer to what happens in real life IME, and I have dealt with systems with tens of thousands of servers, dozens of data centers, that are 10+ years old, millions of lines of code, thousands of lines of config, etc.
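
                                                        Concretely, option 2 might look something like this sketch, in Haskell for brevity; evalConfig is a made-up stand-in for whatever renders the real config:

                                                        -- Made-up stand-in for the real config evaluation.
                                                        evalConfig :: String -> String -> Either String [(String, String)]
                                                        evalConfig env dc
                                                          | env == "prod" && dc == "baz" = Left "no prod capacity in baz"
                                                          | otherwise = Right [("cluster", env ++ "-" ++ dc)]

                                                        -- “Test” the config by just evaluating it on all 9 inputs.
                                                        main :: IO ()
                                                        main = mapM_ check [ (e, d) | e <- ["dev", "staging", "prod"]
                                                                                    , d <- ["foo", "bar", "baz"] ]
                                                          where
                                                            check (e, d) = case evalConfig e d of
                                                              Left err -> putStrLn (e ++ "/" ++ d ++ " FAILED: " ++ err)
                                                              Right _  -> putStrLn (e ++ "/" ++ d ++ " ok")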

                                                        1. 1

                                                          Is infinite recursion an actual concern that happens to people enough to affect their language choice?

                                                          Not only has it never happened to me, I have never once even heard of it happening. Beyond that, if it ever did happen, it would be really obvious that it happened, and the stack trace would lead you directly to the function causing the recursion.

                                                          While I agree that accidental use of direct infinite recursion is rare, bugs related to infinite recursion seem quite common to me in functional programming (depending on your language and platform).

                                                          As an example, one class of bug would be accidental eager consumption of an infinite list. Here’s some elixir code where I intended to sum the first 10 items from an infinite list of integers but forgot to only take the first 10.

                                                          defmodule Main do
                                                            def main do
                                                              0
                                                              |> Stream.iterate(&(&1 + 1))
                                                              # |> Enum.take(10)
                                                              |> Enum.sum()
                                                              |> IO.puts()
                                                            end
                                                          end
                                                          
                                                          Main.main()
                                                          

                                                          This code will run forever without blowing the stack because of the way Elixir (Erlang) handles tail call optimization (TCO).

                                                          I’ve also seen bugs like this in mutually recursive code or recursive data types in languages that natively support TCO.

                                                          So it seems like a very strange thing to design a whole language around avoiding.

                                                          The concept of totality is deeper than “no recursion”. It means our code always “makes progress” toward an answer. I’m not an expert, nor do I mean to speak for the author, but I did find this[1] interview with him very fascinating. It seems to me that a lot of the motivation around totality in dhall was to safely support loading files from paths (URLs or local paths).

                                                          [1] - https://www.se-radio.net/2019/08/episode-375-gabriel-gonzalez-on-configuration/
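
                                                          A tiny Haskell illustration of the distinction (Haskell itself accepts both definitions below; a total language must reject the second, and Dhall does so by not offering general recursion at all):

                                                          -- Total in practice: the recursion is on a structurally smaller
                                                          -- list, so every call makes progress toward the base case.
                                                          sumList :: [Int] -> Int
                                                          sumList []       = 0
                                                          sumList (x : xs) = x + sumList xs

                                                          -- Not total: nothing shrinks, so evaluation can spin forever.
                                                          spin :: Int -> Int
                                                          spin n = spin n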

                                                    1. 15

                                                      … compared to C or C++.

                                                      Compiling the D compiler (dmd), runtime and standard library on my machine takes 18s (I just measured). Granted, that doesn’t include cross compilers but ldc can handle cross-compilation. Building the compiler does have a dependency on gcc or clang though since there’s still some C++ code in there but nearly all of it is in D.

                                                      Vendoring: the author mentions the practice of header-only libraries in C/C++. That’s a problem specific to those languages, and has everything to do with the lack of a package manager, nothing else. Hardly any other language has these problems; vendoring in Rust, D, etc. is just as easy as in Go. It’s usually not done because the versioning problems that plagued Go in the beginning never happened in the first place.

                                                      In a post with 3 points on how Go’s tooling is great, 2 are common in basically any language that isn’t C or C++ (package management and vendoring), and the other (build speed) isn’t something that only Go has.

                                                      1. 6

                                                        It’s not just “has package management”, it’s that it’s good. Which is actually a recent thing; before late 2018 vendoring was the only way to get reasonable behavior for production projects. There were a few different projects out there for doing version pinning, to save you from having to check in a vendor tree and manually manage everything, but they were all flawed in some way and all incompatible. But then go modules showed up and, surprisingly, did almost everything right, did it using only the core tools, and did it in a way that made pretty much every use-case easier, not just the Google-favored one. At this point I would recommend that other languages look to go get and go mod for inspiration, because compared to them, everyone else has warts.

                                                        1. 3

                                                          Apparently I need to look at this again; when I last used Go in earnest (mid-2017), everything was just a broken mess, and glide was the only thing that let us create a proper build at all. And yes, I did read about the “new” (has it been 1-2 years already?) discussion about mod and vendor, but I didn’t look into it.

                                                          If it’s really that good… it reminds me of early Java (1.0-1.3?) where the myth that it was slow persisted for 10 or more years…

                                                          1. 4

                                                            It’s very different, and much better.

                                                            1. 4

                                                              Yeah, we used Glide where I work. It was IMO the best available at the time, and it was almost tolerable, but had some huge annoyances, and was very slow. We switched everything to modules as soon as it was practical. All the weird workarounds in the build scripts are gone, builds are faster, it doesn’t flake out like glide sometimes did, and local dev instructions are now just “check out and go get”, no telling people to have glide installed, no explaining about GOPATH, etc.

                                                            2. 3

                                                              It’s not just “has package management”, it’s that it’s good.

                                                              What is better about it than, say, cargo? Other than “it’s decentralised”, since I really don’t see how that’s an advantage.

                                                              1. 4

                                                                Haven’t used cargo. I find Rust interesting, but even as a polyglot I don’t have the brainspace for Rust and Go at the same time, and Go pays the bills.

                                                                1. 3

                                                            Russ Cox has written rather extensively about the rationale behind the Go modules (formerly “vgo”) design. “The Principles of Versioning in Go” (the bottom one) gives a good summary and a comparison to tools like Cargo.

                                                            It’s hard to summarize that article in a paragraph since the topic is a bit complex, but the tl;dr is that Go modules use “minimal version selection” (MVS) instead of a SAT solver: modules declare “I need at least version X”, and Go uses the lowest version that satisfies every such constraint, instead of running a SAT solver to figure out the newest version we can (…probably) use. For example, if one dependency needs B ≥ v1.2.0 and another needs B ≥ v1.4.0, the build uses B v1.4.0 even if v1.5.0 has been released.

                                                                  I also think the UX of modules is pretty good, and it’s pretty fast (the amount of time I’ve spent waiting on bundler and dep to finish is not insignificant).

                                                                  1. 3

                                                                    I read the article discussing the minimal version selection when it came out. I’m still not convinced. Maybe it’s a better default even, but I don’t see how it’s better than other package managers in practice.

                                                                    1. 2

                                                                    I don’t know whether or not it’s “better”; I guess time will tell. It’s certainly different. Personally, I think it’s a good thing that Go is trying something different and at least tries to improve, instead of “just copy what everyone else is doing”. That’s useful even if it turns out to be a dismal failure in a few years’ time (not that I expect that it will).

                                                                  2. 2

                                                                    For me, the practical answer is that go modules are built directly into the language tooling and therefore “just work” as you use go. Because importing an external package requires using the full URL, you can simply add new imports to your go programs and run go build. This automatically finds your dependencies, downloads them, updates your go.mod file and then builds your program.

                                                              The entire experience is almost completely invisible and frictionless. This isn’t to say cargo is bad (it’s not), but the experience here does feel like it is on a new level.

                                                                    I was very skeptical of go mod and MVS when the initial posts were published, but using the tools in practice has been fantastic.

                                                                  3. -1

                                                                    Can you really consider it good? There’s no concept of a central repo and it doesn’t even handle versioning.

                                                                    1. 4

                                                                      What do you mean with “doesn’t even handle versioning”? go get example.com/foo@v1.5.3 will get exactly that version?

                                                                  The lack of a central repo is a feature, IMHO. It was addressed in this blog post as well, so I won’t repeat the points.

                                                                      1. 1

                                                                        There’s no concept of a central repo

                                                                        That’s very good.

                                                                        and it doesn’t even handle versioning.

                                                                        Sure it does, unless I misunderstand your meaning.

                                                                  1. 3

                                                                    This does seem like an ideal use case for dependent types. Here’s one solution in Idris (note I’m using Int instead of Rational for convenience):

                                                                    module Main
                                                                    
                                                                    data Currency
                                                                      = USD
                                                                      | EUR
                                                                    
                                                                    implementation Show Currency where
                                                                      show USD = "USD"
                                                                      show EUR = "EUR"
                                                                    
                                                                    data Money : Currency -> Type -> Type where
                                                                      MkMoney : (c : Currency) -> Int -> Money c Int
                                                                    
                                                                    implementation Show (Money c Int) where
                                                                      show (MkMoney c i) = (show i) ++ " " ++ (show c)
                                                                    
                                                                    plus : Money c Int -> Money c Int -> Money c Int
                                                                    plus (MkMoney _ x) (MkMoney _ y) = MkMoney _ (x + y)
                                                                    
                                                                    main : IO ()
                                                                    main = do
                                                                      putStrLn $ show $ plus (MkMoney USD 1) (MkMoney USD 2)
                                                                    
                                                                      -- this won't compile
                                                                      -- putStrLn $ show $ plus (MkMoney USD 1) (MkMoney EUR 2)
                                                                    
                                                                    1. 3

                                                                      One thing I always wonder with the dependent types is what happens when you find out you want to move something from compile time to runtime. E.g., suppose requirements changed and your program now has to run with a configuration file that lists allowable currencies. Could you bring those values to the type level, or would that change mean rewriting all the types?

                                                                      1. 3

                                                                        I’m still quite new to dependent types (an enthusiastic learner!) but my understanding here is that with untrusted input, you need to convince the compiler that it is an instance of a particular data type (similar to parsing in Haskell or other MLs). Once you do so, you get the same power described above.

                                                                        This does assume that you have the correct representation of the types inside the code as well. I’m not sure if there’s the ability to generate types from some outside configuration, though I do know that any type-level code in Idris must be total and pure. I’m guessing that things like “reading from the filesystem while compiling your program” are a no-no.

                                                                        1. 3

                                                                          Idris does indeed have type providers, which give you IO actions evaluated at compile-time to give a constant value available at run-time, and (of course) the return value can even be a type!

                                                                          1. 1

                                                                            Fascinating, thanks for sharing this.