1. 53
    1. 22

      I generally agree: YAML is not always the best format. It’s also not my most hated format, though: it is the only one with a convenient notation for larger string blocks, and the 9 versions, while an annoying number, do make sense. I wouldn’t use it for configuration or as a transmission format, but it has a good track record for data documents, such as data files for single-page apps.

      I find two arguments a bit weird, though. The first is the length of the spec.

      For its 24000 words, the YAML spec comes with a motivation section, a comparison section, an index, and a list of standard schemata. YAML ain’t a small language, and it doesn’t want to be. The comparison to the JSON spec is weird. JSON’s spec is famous for being short to the point where it is misinterpreted. A lot of things were subsequently fixed, but many of the initial parsers disagreed, for example, on what the root element of a JSON document was (it’s an object, a fact that the description on json.org still doesn’t document). Most famously, 1 is a valid JSON document for PHP. The other comparison, TOML, also dropped the ball on quite a few things.

      The length of a document isn’t really an argument.

      Second, they complain about the behaviour that you can accidentally build object instantiation attacks in YAML. This isn’t a YAML feature. YAML has tags, and those tags have been used by library authors to build a convenience feature that automatically deserialises them into an object. The same attacks have been observed with all the other formats. Rails XML parser had the same issue, just the tagging format was different. The problem there was the lack of knowledge of the danger of accepting arbitrary class names from the outside on the side of library authors, not the format authors.

      In my opinion, the tagging features of YAML are an under-appreciated part of the language, actually making it a contender to XML much more than the other formats. Also, YAML adds things that not many formats have: it actually has the concept of a reference.
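
      The reference concept can be sketched with Ruby's standard-library parser (Psych, loaded via `require 'yaml'`). The document content and the names `SHARED` and `load_shared` are made up for illustration, and `aliases: true` assumes a Psych version whose `safe_load` accepts keyword arguments:

      ```ruby
      require 'yaml'

      # Illustrative document: `&base` defines an anchor, `*base` refers back to it.
      SHARED = <<~YAML
        base: &base
          retries: 3
        copy: *base
      YAML

      # safe_load rejects aliases unless they are explicitly enabled.
      def load_shared(source)
        YAML.safe_load(source, aliases: true)
      end

      data = load_shared(SHARED)
      p data['copy']                       # the mapping defined under `base`
      p data['copy'].equal?(data['base'])  # whether both keys point at one object
      ```

      The alias is resolved by the parser itself, so no application-level lookup step is involved.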

      There are mistakes in the language (e.g. making no a keyword, which leads to weird bugs, like the Norwegian (no) translation of Padrino introducing the language key “false” instead). But in general, I think this rant puts emphasis on weird minutiae instead of having a thorough look at YAML. I would love to see a YAML that forbids the odd parts of YAML and makes it an easier language to use.
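
      The `no` problem is easy to reproduce. Below is a minimal sketch with Ruby's standard-library parser (Psych), which resolves scalars by YAML 1.1 rules. The locale data and the names `UNQUOTED`, `QUOTED` and `language_keys` are invented for the illustration and are not taken from Padrino:

      ```ruby
      require 'yaml'

      # Illustrative locale index keyed by language code.
      UNQUOTED = <<~YAML
        en: English
        no: Norsk
      YAML

      # Same data, with the Norwegian code quoted to force a string.
      QUOTED = <<~YAML
        en: English
        "no": Norsk
      YAML

      def language_keys(source)
        YAML.safe_load(source).keys
      end

      p language_keys(UNQUOTED)  # second key is the boolean false, not "no"
      p language_keys(QUOTED)    # both keys are strings
      ```

      Quoting the key is the usual workaround, but it relies on every author of such a file knowing about the rule.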

      1. 3

        The length of a document isn’t really an argument.

        The length as such isn’t, but I think the complexity is, and the length of the specification is an (imperfect) measure of complexity. To give a simple example, how many people do you think will know the difference between:

        k: |
            multi-line
            string
        

        and:

        k: >
            multi-line
            string
        

        I would wager not many. I can understand the reasoning for having both, as I can understand the reasoning for having many of the lesser-used/understood features of YAML, but it does lead to a certain amount of difficulty when you combine all of them. I think there are very few people who really understand all of YAML. The string example above is the simplest example.
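
        For anyone who wants to check, here is a minimal sketch using Ruby's standard-library parser (Psych). The constant and method names are only for illustration:

        ```ruby
        require 'yaml'

        # Literal block scalar: line breaks inside the block are preserved.
        LITERAL = <<~YAML
          k: |
              multi-line
              string
        YAML

        # Folded block scalar: single line breaks inside the block become spaces.
        FOLDED = <<~YAML
          k: >
              multi-line
              string
        YAML

        def value_of_k(source)
          YAML.safe_load(source).fetch('k')
        end

        p value_of_k(LITERAL)  # => "multi-line\nstring\n"
        p value_of_k(FOLDED)   # => "multi-line string\n"
        ```

        Both forms keep one trailing newline by default. The `-` and `+` chomping indicators change that, which is where the larger number of variants comes from.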

        JSON’s spec is famous for being short to the point where it is misinterpreted

        Even if the JSON spec doubled in size, it would still be significantly shorter than the YAML spec.

        Second, they complain about the behaviour that you can accidentally build object instantiation attacks in YAML. This isn’t a YAML feature. YAML has tags, and those tags have been used by library authors to build a convenience feature that automatically deserialises them into an object. The same attacks have been observed with all the other formats. Rails XML parser had the same issue, just the tagging format was different. The problem there was the lack of knowledge of the danger of accepting arbitrary class names from the outside on the side of library authors, not the format authors.

        That is correct, and I addressed that:

        One might argue this is not really the fault of the YAML format as such, but rather the fault of the libraries implementing it wrong, but it seems to be the case that the majority of libraries are unsafe by default (especially the dynamic languages), so de-facto it is a problem with YAML.

        As a thought experiment, imagine you have the best possible specification that anyone could write. You absolutely love it. But every single implementation of this format is slow, full of bugs, has a clunky API, and is generally a pain to use. You probably wouldn’t want to use it, because even though the specification may be brilliant, using it in a practical way will be hard.

        I am merely concerned with the practical matter of it, not to point fingers at the spec authors.

        In my opinion, the tagging features of YAML are an under-appreciated part of the language

        I agree it’s probably a decent enough feature, but IMHO even the “safe load” API for most libraries is inadequate. Ideally, I’d like an API which explicitly whitelists allowed objects.
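
        As one data point, Ruby's Psych goes some way in this direction: `safe_load` takes a `permitted_classes` list. Here is a rough sketch of what an explicit allow-list looks like there. `Point`, `TAGGED` and `load_with_allow_list` are hypothetical names, and whether this is granular enough for a given use is a separate question:

        ```ruby
        require 'yaml'

        # Hypothetical application class, used only for this illustration.
        class Point
          attr_reader :x, :y
        end

        # A document that asks the loader to instantiate a Ruby object.
        TAGGED = <<~YAML
          --- !ruby/object:Point
          x: 1
          y: 2
        YAML

        # Returns the loaded object, or the error class name when the document
        # requests a class that is not on the allow-list.
        def load_with_allow_list(source, allowed)
          YAML.safe_load(source, permitted_classes: allowed)
        rescue Psych::DisallowedClass => e
          e.class.name
        end

        p load_with_allow_list(TAGGED, [])       # rejected: Point is not allowed
        p load_with_allow_list(TAGGED, [Point])  # accepted: a Point instance
        ```

        The caller has to opt in per class, so a tag naming any other class is refused instead of being instantiated.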

      2. 3

        I would love to see a YAML that forbids the odd parts of YAML and makes it an easier language to use.

        Have you seen StrictYAML? It might be overly cautious in what it cuts out, but it is a very nice reduction of YAML.

        EDIT: Foolish me, posting this before reading the article which shouts out StrictYAML at the end.

    2. 10

      Ruby’s load() just loads the first document, and as near as I can tell, doesn’t have a way to load multiple documents.

      Psych.load_stream for anyone who’s wondering.
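
      A short sketch of the difference, using the standard-library `yaml` (where `YAML` is an alias for `Psych`). The stream content and the method names are made up:

      ```ruby
      require 'yaml'

      # Two documents in one stream, separated by `---`.
      STREAM = <<~YAML
        ---
        name: first
        ---
        name: second
      YAML

      # Loads only the first document in the stream.
      def first_document(source)
        YAML.safe_load(source)
      end

      # Loads every document in the stream into an array.
      def all_documents(source)
        YAML.load_stream(source)
      end

      p first_document(STREAM)  # only the first mapping
      p all_documents(STREAM)   # an array with both mappings
      ```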

    3. 6

      Most of YAML’s bad reputation actually comes not from YAML itself but from projects (Ansible, Salt, and Helm, to name a few) that shoehorn a programming language into YAML by adding a template language on top, and then try to pretend that it’s declarative because YAML. YAML + templating is as declarative as any language that has branches and loops, except that YAML wasn’t designed to be a programming language and is rather poor at it.

      1. 2

        In the early days, Ant (Java build tool) made this mistake. And it keeps getting made. For simple configuration, YAML might be fine (though I don’t enjoy using it), but there comes a point where a programming language needs to be there. Support both: YAML (or TOML, or even JSON) and then a programming language (statically typed, please, don’t make the mistake that Gradle made in using Groovy – discovery is awful).

        1. 5

          I’m very intrigued by Dhall, though I’ve not actually used it. But it is, from the GitHub repo,

          a programmable configuration language that is not Turing-complete

          You can think of Dhall as: JSON + functions + types + imports

          it sounds neat

          1. 1

            There is also UCL (Universal Config Language?), which is like nginx config + json/yaml emitters + macros + imports. It does some things that bother me, so I stick to TOML, but it seems to be gaining some traction in the FreeBSD world. One thing I do like about it is that there is a CLI for getting/setting elements in a UCL file.

        2. [Comment removed by author]

      2. 1

        Yes! This is one of the reasons I’m somewhat scared of people who like Ansible.

      3. 1

        Yep! People haven’t learned from mistakes. Here’s a list of XML-based programming languages.

    4. 5

      I use HJSON in place of YAML and have found that it offers the same benefits (Pythonic, structured configuration) without the problems (no tabs, complex documentation). It’s a little format that should be better known!

    5. 3

      I feel the pain of using it every time I have to edit i18n strings in a Rails project. Even XML would be more convenient.

      Other markup formats with the same philosophy (multiple ways to represent things, indentation-based, concise) have similar problems.

      1. 7

        Can your editor not fold YAML? If not I highly recommend finding one that can.