1. 44
  1. 12

    I feel this way a lot. I don’t, as a rule, get ‘everything is terrible’ about nearly anything in programming — even JavaScript — but seeing how YAML is used so often, even in places that seem to care about good code and maintainability, gets me there.

    YAML itself is only half the story, of course. If we were configuring the same types of programs we had even five years ago in YAML, it wouldn’t raise an eyebrow. But we are absolutely awash—kubernetes being the obvious example—in enormously complex frameworks with enormously complicated control languages, and by default they come in YAML.

    It’s that software’s seductive quality that really makes me depressed. On the one hand things like kubernetes or complex build pipelines (theoretically the pipeline I’m describing could be for some really impressive data science jobs, but let’s face it: it’s your CircleCI config) strike the enthusiastic user as industrial strength and The Right Way To Do It. But on the other they are part of the nocode revolution! They can advertise themselves as inherently more maintainable and accessible because they don’t involve any programming; merely plugging together predefined parts.

    Anyone whose day job involves the phrase ‘serverless’ should know better, of course. These types of tools are astonishingly complex. And the fact that their programming languages tend to be expressed as layers over configuration languages makes the maintenance problems with that worse, not better. There are, as the author points out, no ACTUAL semantics (or even really syntax) to these languages. Nothing aside from what the current version understands. As a result the meaning of a YAML configuration file shifts constantly, in ways that can be difficult to see and even harder to control.

    1. 3

      I do disagree on the YAML example in Ansible, though. Yes, you could in theory write “programs”, but every single time someone wrote

      # main.yaml
      - include: foo.yml
        when: something == "foo"
      

      in a different configuration system this could’ve been written as

      # main.ini
      [something:foo]
      include 'foo.yml'
      

      You simply NEED to have conditionals of some sort and as long as you don’t put them 3 layers deep one syntax will probably be as good as another. I’m not defending Ansible syntax here (pretty sure it could be better than YAML) but I like it more than Puppet’s and shell provisioning scripts, so that’s already the winner out of 3.
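To make the "one syntax is as good as another" point concrete, here is a minimal Python sketch in which both spellings reduce to the same conditional include. Everything in it (`ConditionalInclude`, `from_yaml_style`, `from_ini_style`, `files_to_load`) is hypothetical and invented for illustration; it is not how Ansible or any INI loader actually works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConditionalInclude:
    """Hypothetical normalized form of 'include this file when variable == value'."""
    variable: str  # name to look up, e.g. "something"
    expected: str  # value that triggers the include, e.g. "foo"
    path: str      # file to include when the condition holds


def from_yaml_style(entry: dict) -> ConditionalInclude:
    # Accepts an already-parsed mapping such as
    # {"include": "foo.yml", "when": 'something == "foo"'}.
    variable, expected = (part.strip() for part in entry["when"].split("=="))
    return ConditionalInclude(variable, expected.strip('"'), entry["include"])


def from_ini_style(section: str, body: str) -> ConditionalInclude:
    # Accepts a section name such as "something:foo" and a body line
    # such as "include 'foo.yml'".
    variable, expected = section.split(":", 1)
    path = body.split(None, 1)[1].strip("'")
    return ConditionalInclude(variable, expected, path)


def files_to_load(includes, variables):
    # Keep only the includes whose condition matches the current variables.
    return [inc.path for inc in includes if variables.get(inc.variable) == inc.expected]
```

Either front end produces the same value, so the choice between them is about readability, not about what can be expressed.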

      1. 5

        The author seems to be advocating for restricted (total) programming languages like Dhall to be used for configuration.

        That allows you to express conditionals in a more standard (and extensible) way, which would probably be nice.

        1. 4

          Hi, I’m the author.

          Yeah, I am really excited about Dhall. I think this, or something like it, is the future. It supports the types of abstractions that we need without the mess of full templating or full Turing completeness.

          The one downside to Dhall is you really want to have an implementation for it in each common language. You can use it to generate YAML, but I think it would be better if tools understood Dhall and that is a bigger task because it is a more complicated implementation.

          Let’s build Dhall implementations for every major language, convince Gabe to format things in a way that makes it look more familiar to non-Haskell people, and consider this problem solved.

          (The formatting thing seems to come up whenever I try to convince someone to use Dhall. People HATE the comma-first style and never get past that.)

          1. 5

            I agree with the problem, but disagree with the solution. Total languages don’t give you any useful engineering properties, but a lack of side effects does (and Dhall also offers that).

            On the surface, it seems like HCL, jsonnet, and Cue are just as suitable as Dhall, and probably more familiar (to varying degrees). Tcl would be good too because it can be sandboxed, but dynamic types are better than its “stringly typed” nature for modern systems.

            My comment on the Turing complete vs. side effects issue here: https://news.ycombinator.com/item?id=26277963

            Previous thread: https://lobste.rs/s/gcfdnn/why_dhall_advertises_absence_turing

            Also related: https://lobste.rs/s/wrx53b/turing_incomplete_languages

            1. 2

              Wow, great response!

              At a cursory level, without knowing all of those well, I think you might be right. Certainly, it is possible to write some Dhall that couldn’t finish evaluating in the time left in the universe, so I agree termination is probably not the most important part.
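A rough illustration of why guaranteed termination says little about practical running time, written in Python rather than Dhall. The function name and the assumed evaluation speed are made up for the sketch; the only point is that a loop with a fixed bound always finishes, yet the amount of work it describes can be astronomically large.

```python
def doubled_size(depth: int) -> int:
    """Size of a list after doubling a one-element list `depth` times.

    The loop bound is fixed up front, so this always terminates: it is the
    kind of definition a total language would accept.
    """
    size = 1
    for _ in range(depth):
        size *= 2
    return size


# Assumption for illustration only: an evaluator that visits one billion
# elements per second.
ELEMENTS_PER_SECOND = 10 ** 9

# Seconds needed just to visit every element after 300 doublings.
seconds_needed = doubled_size(300) // ELEMENTS_PER_SECOND
```

Here `seconds_needed` is far larger than `10 ** 18`, which already exceeds the current age of the universe in seconds, even though every step of the program is guaranteed to finish.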

              I have used HOCON a lot and found it to work well. I didn’t know I was missing function calls and imports, though.

              I don’t know HCL well, but it does seem like it could fit the job. Does it have a semantic integrity check on remote imports?

              I would like to see the future format not be a YAML generator but the actual way things are configured. Whether that format is Dhall or HCL or whatever.

              To do this, parsers for each common language are needed. I think this is why YAML succeeded. If apps just took JSON as config, very few people would be writing YAML and generating JSON to configure things with, even though that is possible.

              1. 2

                I was thinking this about Total languages, too. I think what I really want for running untrusted code is a capability language and what I want for a config language is usually either something simple like TOML or, if it’s complicated, a programming language I’m already familiar with.

                1. 2

                  Right exactly, capabilities are a mechanism to control I/O, so it’s another way of saying the same thing. If your program can just compute something without observable side effects, then it’s no problem. Quite the contrary: it’s useful and natural.
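A minimal Python sketch of that idea, with hypothetical names (`render_config`, `load_with_capability`) and a made-up `STAGE` setting. Python does not enforce any of this: nothing stops these functions from importing `os` themselves. So read it as the discipline a capability-based or sandboxed language would enforce for you.

```python
from typing import Callable, Mapping


def render_config(env: Mapping[str, str]) -> dict:
    """Pure computation: the result depends only on the argument, no I/O."""
    replicas = 3 if env.get("STAGE") == "prod" else 1
    return {"service": "web", "replicas": replicas}


def load_with_capability(read_text: Callable[[str], str], path: str) -> dict:
    """By convention, the only route to the outside world is `read_text`.

    The caller decides what that capability can reach: a real file reader,
    a reader restricted to one directory, or a fake for tests.
    """
    stage = read_text(path).strip()
    return render_config({"STAGE": stage})


# Usage with a fake capability that never touches the filesystem.
config = load_with_capability(lambda _path: "prod\n", "stage.txt")
```

The config logic stays a plain function of its inputs, and whoever embeds it controls which side effects, if any, are reachable.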

                  Tcl and Lua both provide control over side effects (since they’re meant to be embedded). Lua was originally designed as a config language. It’s definitely usable that way but I would prefer something else (Oil, haha)

                  1. 1

                    Capabilities let you control access to other parts of the program, too, which is also relevant to me, but lots of programming languages let you do this if you’re careful, I suppose.

              2. 2

                Have you checked out PureScript? They have settled on a beautiful way to use Dhall: spago, the package manager.

                1. 1

                  Somebody mentioned spago on HN. I haven’t had a chance to look into it, but I assume it is not bridging through YAML but mapping in directly, including types? If so, sign me up :)

            2. 1

              I’m not defending Ansible syntax here (pretty sure it could be better than YAML) but I like it more than Puppet’s and shell provisioning scripts

              I like Puppet. It doesn’t try to pretend it’s not a programming language. You get proper variables, types, and control flow.

            3. 2

              Yeah I’m in the same boat, I can handle JS and PHP and appreciate a bunch of things about them [1]

              But I can’t really handle the YAML. My prediction at the beginning of the year was that we will see more shell scripts and programming languages embedded in YAML in the future, and this post shows another example I didn’t know about (Grafana? never used it).

              https://lobste.rs/s/v4crap/crustaceans_2021_will_be_year_technology#c_ker4jn

              Oil parses Ruby-like blocks so they can be the “missing declarative part” of shell [2]. And we will have sandboxed subinterpreters to eliminate side effects [3].

              So it’s about 30-50% there … When it is, I will probably start using it in Oil’s own continuous build.

              [1] I wrote my first PHP ever in December and largely liked it : https://lobste.rs/s/v4crap/crustaceans_2021_will_be_year_technology#c_lkj0ej

              [2] https://lobste.rs/s/6oxpe3/s_lot_yaml#c_mje209

              [3] https://github.com/oilshell/oil/issues/704

            4. 8

              I secretly really enjoy writing YAML because it’s fun to see how “human readable” I can get the file to look. Don’t get me wrong, YAML sucks and I never ever recommend it and I’m really hesitant to load YAML code written by others.

              It just… sucks in a fun way. That’s why this INTERCAL comparison is so spot on. I’ve been thinking about it in the exact same way, like other “challenge languages” like Brainfuck.

              A good language (I really love s-exps) is good at two things:

              • making it clear how the computer will eval the code, and
              • making it clear to humans roughly what is meant here

              YAML is really, really bad at the first of those two but curiously fun at the second. It’s not automatically good at the second, but, with work and luck and testing, it’s possible to write “human-readable pseudocode” in YAML that also happens to DTRT computationally.

              As in: if I write “human-readable pseudocode” in YAML, it might not DTRT. If I write YAML code that DTRT, it might not be very human-readable. But with effort and care it’s possible to write YAML code that does both. And that challenge is… really fun.♥ (Not for work purps obv but for something like maintaining a blog or other hobby server.)

              1. 6

                Ah, Greenspun’s 10th rule again: Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp.

                1. 5

                  At this point, let’s just use Scheme (or Lisp), but the main problem remains: custom semantics will always require context to interpret a file. I would love to see HJSON and JSON-Schema support in editors so that people can write config files that conform to a schema, with completion and inline help for context.

                  1. 7

                    Or we can use XML and take advantage of all the XML schema tools and XML editors. ;)

                    1. 2

                      This, of course, being exactly how we got to a place where YAML could seem like a good idea in the first place.

                      1. 4

                        From my experience, when people get past the stage of irrational hatred for XML, they tend to appreciate the advantages and the tooling ecosystem that has existed for twenty years. You need to be quite insistent to get them into it initially though.

                        1. 1

                          There are piles of use cases that XML suits fairly well, and I’m sure examples multiply where it’d be far more pleasant to deal with than YAML for representing the same structure (completely leaving aside whether that structure is a good abstraction in the first place). But I remember the years 1999-2012 too clearly to feel that my eventual distaste for the stuff in practice was anything but reasonable and well-founded.

                          I’ll grant you that an outsized portion of my feelings on the matter are down to 1) a long string of contract jobs dealing with irretrievably malformed markup by way of brute-force hack-and-slash regex methods, 2) the eventual results of the XHTML debacle, and 3) SOAP. But I think those experiences were all fairly typical of the practical outcomes of the XML years for working programmers. Especially people outside of Java & Microsoft shops.

                  2. 4

                    For me, calling yaml a programming language is the problem. Yaml.org says: YAML is a human friendly data serialization standard for all programming languages. A data serialization standard, for programming languages, not a programming language. Yet there is a determination to use it as a programming language. Some attempt it with things like jinja or ytt. Some reach a point where they can no longer shove the yaml square peg into the programming round hole, so they just have a do_the_rest: in_a_script.sh. Others have “fake.dotnotation”: foo - a string key masquerading as object.method. The whole thing is an abomination. Is it ops people having a go at dev? Devs having a go at ops? Inexperience? It’s now the first thing I look for when assessing software options - are they abusing yaml - red flag.
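The dotted-key case can be sketched in Python. To a YAML parser, `"fake.dotnotation"` is a single string key; any nesting is something each tool has to invent on top. The helper below is hypothetical, not taken from any particular tool, and shows the kind of extra interpretation layer that implies.

```python
def expand_dotted_keys(flat: dict) -> dict:
    """Hypothetical tool-side convention: split keys on '.' into nested mappings."""
    nested: dict = {}
    for key, value in flat.items():
        node = nested
        *parents, leaf = key.split(".")
        for part in parents:
            # Create intermediate mappings on demand.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


# What a YAML parser would hand back for `"fake.dotnotation": foo`.
as_parsed = {"fake.dotnotation": "foo"}

# What the tool pretends the author wrote.
as_interpreted = expand_dotted_keys(as_parsed)
```

Nothing in the file format says which reading is intended, so two tools (or two versions of one tool) can legitimately disagree.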

                    1. 1

                      While I’m happy to see INTERCAL getting some air time here, I’m a little disappointed to learn that the INTERCAL quine that was recently added to the C-INTERCAL distribution (as referenced in one of the footnotes) is an enormous 96k in size – several times larger than the 19k quine I originally contributed. Meanwhile, the under-4k quine I wrote over a decade ago (http://www.muppetlabs.com/~breadbox/intercal/quine.html) languishes in obscurity! How fickle is fame.

                      1. 1

                        Hey, I recommend emailing Eric Raymond. That is how the fizzbuzz and quine got into ‘the pit’

                      2. 1

                        I’ve been working with AWS CDK for the last few months, and oh, god, it’s so much better than plain CloudFormation templates. I cannot even imagine going back to writing yaml, because just using code for such configuration works so much better.