1. 22
  1.  

  2. 9

    This is funny because I’ve always thought the lack of Turing-completeness is a non-issue. That is, it’s fine (and often necessary) for a config language to be Turing complete. (This doesn’t mean it needs global variables or mutation, etc.)

    The real issue is side effects (I/O), which is not technically related to Turing-completeness. Configurations should be meaningfully versioned (e.g. in git) and side effects destroy reasoning about them (i.e. reading files outside of version control).

    So the author of Dhall actually agrees, but he promotes “non-Turing complete” anyway because it’s a “signaling mechanism”? Honestly that feels a bit like talking down to your audience.

    1. 7

      Totality and lack of side effects are not the same. A config language that lacks side effects but isn’t total could stall forever (think recursion).

      Dhall does have blessed side effects (reading from paths) but is total.

      1. 6

        It sounds like you’re just restating the point my comment made and that the post made …

        If the side effects are controlled, that’s a good middle ground. That part makes sense. What I find silly is that they’re advertising features that are non issues because it’s a “signaling mechanism”, in their words.

        1. 3

          I think you might be misunderstanding what Gabriel intended by “signalling mechanism”. Camp A want to ensure their configuration is safe. They’d like the benefits of camp B that you get from Turing completeness, but without sacrificing the safety side of things. The “signalling mechanism” is a way to show camp A that they can get practically all the benefits of camp B while maintaining safety. This is quite the opposite of talking down to the audience.

          I’m in camp A for the most part for a good reason. I used to work for a company that did online survey software, and there was a language that the software used for describing the surveys. They could get complicated, almost to the point of requiring a real programming language, and we needed a way to ensure that (a) the processing of any such surveys could never end up hogging resources on our servers and (b) that anybody filling out the survey wouldn’t end up trapped in an unending loop of questions. We had to treat our customers like hostile agents (even if unintentionally so) and anything they provided to us as inherently dangerous.

          The solution was to guarantee that the language was primitive recursive by enforcing limits on loops and that the question graph had no cycles. That ensured that the configuration language we provided to our customers was safe in a similar, albeit more limited, way to Dhall.

          Turing completeness is great when you can trust whoever is writing the code, but avoiding it means that you avoid lots of potentially complicated, error-prone sandboxing mechanisms you would need to introduce if you’re accepting it from parties you can’t necessarily trust.

          1. 2

            we needed a way to ensure that (a) the processing of any such surveys could never end up hogging resources on our servers

            The article claims that you can have Dhall functions which take longer than the age of the universe to complete, so it doesn’t really seem like this concern is addressed, even though it’s not technically infinite recursion.

            1. 1

              Yes, it does, because Dhall allows total functions, whereas my emphasis was ensuring everything was primitive recursive. I was using the project I worked on as a demonstration of why avoiding Turing completeness is a useful property, not specifically why the route Dhall itself uses.

              We both were looking for different things: I was taking data from untrusted sources, so I needed to be able to enforce strict bounds, whereas Dhall loosens things up a bit while avoiding general recursive functions.

              Dhall only allows total functional programming, which means that it guarantees that any function described in it will terminate. This narrows down the possibility of something going wrong algorithmically substantially; the blast radius for total functional programs is much smaller than that for general recursive ones.

            2. 1

              I’m not following. From the post:

              The absence of Turing completeness per se does not provide many safety guarantees

              If you want to ensure that evaluating configs from untrusted sources doesn’t take up server resources, that’s a valid goal.

              Using a non-Turing-complete language doesn’t achieve that goal. Dhall doesn’t achieve that goal. He provides the Ackermann function to show that.

              If you want to disagree that it’s “talking down”, then fine, we can agree to disagree. But I have been confused by Dhall for a long time precisely because of the messaging as “non-Turing complete”.

              Some context (requires login):

              https://oilshell.zulipchat.com/#narrow/stream/121540-oil-discuss/topic/Config.20Features.20(was.3A.20Small.20vs.2E.20Big.20Languages)

              1. 1

                Using a non-Turing-complete language doesn’t achieve that goal.

                It can do, but you need stricter guarantees than Dhall gives, such as only allowing primitive recursion, whereas Dhall allows total functions.

                Dhall doesn’t achieve that goal. He provides the Ackermann function to show that.

                Yeah, because unlike my example, Dhall is trying to give you most of the power of a general Turing complete language while reducing the blast radius down. As Dhall allows only total functions, any function declared in Dhall will provably complete; it may take a long time, but they will at least complete.

                1. 1

                  Quoting myself from 2 comments above:

                  It sounds like you’re just restating the point my comment made and that the post made

          2. 2

            Is infinite recursion an actual concern that happens to people enough to affect their language choice?

            Not only has it never happened to me, I have never once even heard of it happening. Beyond that, if it ever did happen, it would be really obvious that it happened, and the stack trace would lead you directly to the function causing the recursion. (Unlike, say, a null reference, which can easily have the cause and effect separated by an immense distance, making debugging difficult.) So it seems like a very strange thing to design a whole language around avoiding.

            1. 2

              Right exactly – not only is the messaging focused on an irrelevant thing, but the language design is too! It would almost make more sense if the messaging were the only odd thing.

              I’ve made a similar observation about statically typed config languages. Configs generally have a very limited space of inputs, a common one is:

              (dev, staging, prod) x (foo, bar, baz)

              where foo, bar and baz are data centers.

              So if you want to make sure your config doesn’t crash, you could:

              1. Design/implement a type system to prove something about all possible inputs
              2. Simply evaluate the config on all possible inputs, which is 9 inputs here.

              The config should take about 1 ms to evaluate and so maybe your tests takes 9 ms …

              This is a toy example but it’s closer to what happens in real life IME, and I have dealt with systems with tens of thousands of servers, dozens of data centers, that are 10+ years old, millions of lines of code, thousands of lines of config, etc.

              1. 1

                Is infinite recursion an actual concern that happens to people enough to affect their language choice?

                Not only has it never happened to me, I have never once even heard of it happening. Beyond that, if it ever did happen, it would be really obvious that it happened, and the stack trace would lead you directly to the function causing the recursion.

                While I agree that accidental use of direct infinite recursion is rare, bugs related to infinite recursion seem quite common to me in functional programming (depending on your language and platform).

                As an example, one class of bug would be accidental eager consumption of an infinite list. Here’s some elixir code where I intended to sum the first 10 items from an infinite list of integers but forgot to only take the first 10.

                defmodule Main do
                  def main do
                    0
                    |> Stream.iterate(&(&1 + 1))
                    # |> Enum.take(10)
                    |> Enum.sum()
                    |> IO.puts()
                  end
                end
                
                Main.main()
                

                This code will run forever without blowing the stack because of the way elixir (erlang) handles tail call optimization (TCO).

                I’ve also seen bugs like this in mutually recursive code or recursive data types in languages that natively support TCO.

                So it seems like a very strange thing to design a whole language around avoiding.

                The concept of totality is deeper than “no recursion”. It means our code always “makes progress” toward an answer. I’m not an expert, nor do I mean to speak for the author, but I did find this[1] interview with him very fascinating. It seems to me that a lot of the motivation around totality in dhall was to safely support loading files from paths (URLs or local paths).

                [1] - https://www.se-radio.net/2019/08/episode-375-gabriel-gonzalez-on-configuration/

          3. 6

            This post reminds me of this article on pixel art. It starts with an intro of what constitutes good art (to counter the misperception that Pixel Art = Low Quality), before meta-commentating on how avoiding contemporary mediums (not pixel art) was actually a mistake, since their studio’s goal isn’t to endorse pixel art and fight “Pixel Art = Low Quality” like a purist, it’s to produce good art by “[speaking] in a language people can understand.”

            Meanwhile OP starts with an intro of what constitutes safety (to counter the misperception that Non-Turing Completeness = Security), before meta-commentating on how abandoning the term “non-Turing completeness” would be a mistake, since the goal isn’t to be pedantic on the implications of non-Turing completeness and fight “Non-Turing Completeness = Security” like a purist, it’s to fulfill desires of safety by “signaling to them that they “belong here” with the rest of the Dhall community.”

            Moral of the story: for the sake of staying relevant, don’t be a purist. (unless you wish to be a niche artist or an unknown config language.)

            1. 4

              I was initially interested due to the totality, but I didn’t look into whether some flavor of recursion was possible. The Ackermann example shows that the value proposition isn’t there. For a config language, surely a simply typed lambda calculus (no recursion) and a standard library of maps, folds, scans, etc. would be easier.

              I’ve debated having a two-mode language where the standard library and contrib is defined with primitive recursion allowed, and then the end user language disallows recursion. So the maps, scans, folds, zips, etc could be implemented in the language-1 and then used in language-2. This draws a clear line making it easier to sandbox and manage resource use.