An excellent but under-recognized alternative syntax for configuration files is NestedText, where everything is a string unless the ingesting code says otherwise, and there is no escaping needed ever.
I don’t write copious amounts of the stuff, but I don’t understand this repetitive criticism and alarm, because I can’t be alone in having created the simple rule: if you want a string value in YAML, you quote it.
The on-key is the only one I would have stepped on, as I didn’t know you could have non-string keys with the colon syntax.
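For anyone who hasn't hit this yet, here is a rough sketch of why the quoting rule works. It is a toy stand-in for YAML 1.1 implicit typing that handles booleans only, not a real parser; the function names are made up for illustration, and real parsers differ in which spellings they honor.

```python
import re

# Boolean spellings listed by the YAML 1.1 bool type. Real parsers may
# honor only a subset of these, so treat this list as an illustration.
_YAML11_BOOL = re.compile(
    r"^(?:y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF)$"
)
_TRUTHY = {"y", "yes", "true", "on"}


def resolve_plain_scalar(text):
    """Hypothetical helper: implicit typing of an unquoted scalar (booleans only)."""
    if _YAML11_BOOL.match(text):
        return text.lower() in _TRUTHY
    return text


def resolve_scalar(token):
    """Quoted tokens always stay strings; plain ones go through implicit typing."""
    if len(token) >= 2 and token[0] == token[-1] and token[0] in "'\"":
        return token[1:-1]
    return resolve_plain_scalar(token)


print(resolve_scalar("no"))    # unquoted country code turns into a boolean
print(resolve_scalar('"no"'))  # quoted, it stays the string no
print(resolve_scalar("on"))    # the same thing happens to a key named on
```

The same resolution applies to keys, which is how an unquoted on ends up as a boolean key rather than a string.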
Sure, but then why not just use a language where all strings are quoted? We can and should use other, better formats. There’s a lot of inertia around using YAML for tooling configuration, but it just seems like a waste to me.
Caddy gets it right: have JSON be the canonical configuration language but provide your own DSL that is nicer to work with that you transparently convert into JSON.
I tend to use TOML when it’s a choice, but that’s not perfect either. I run into YAML mostly when dealing with k8s admin and application development. I can just use -o json if I want, which I do if I’m piping it to jq for some manipulation or analysis, but for reading and writing I tend to prefer YAML because it’s indented and succinct.
I guess it helps that I have strong editor support in emacs these days. I remember the first time I tried to manage the whitespace-driven format: just like with Python, it was incredibly frustrating to get syntactically valid documents.
The problem is everyone has to learn this lesson by messing up first :)
Long discussion yesterday about what to do with Microsoft .NET OpenAPI/Swagger specs, which don’t quote strings correctly and so break downstream clients of multiple services we have that use country codes as enums.
Obviously we should raise a ticket or contribute a fix, but it remains a bunch of hours sunk that would have been spent on productive work if YAML hadn’t been the format of choice.
I looked at using YAML for something last year and the thing that really stood out is that there is basically only one implementation and it’s a library that is over a MiB in size. That’s far too big for anything that’s parsing untrusted data. The fact that it’s too big for independent implementations to exist is scary. You can write a JSON parser in a couple of hundred lines of code, including copious comments.
I wish things that used YAML would use JSON internally and just provide a YAML to JSON tool. At least then you’d be able to look at the JSON and see if it’s surprising.
I’d also add UCL to the list of alternatives. It has a JSON object model, so it is easy to convert to JSON for interchange (libucl can do this), but it also supports all of the things I want from config files, such as being able to include other files and merge different objects with different policies, so I can provide a set of defaults and allow the user to override them.
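To make the defaults-plus-overrides idea concrete, here is a minimal sketch on plain JSON objects. It shows one possible policy (nested objects merge recursively, everything else is replaced) and is not how libucl implements its merge policies; the merge function and the sample keys are hypothetical.

```python
import json


def merge(defaults, overrides):
    """Overlay overrides on defaults: nested dicts merge, other values replace."""
    result = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            result[key] = value
    return result


# Shipped defaults and a user file that overrides a single nested value.
defaults = json.loads('{"server": {"host": "localhost", "port": 8080}, "debug": false}')
user = json.loads('{"server": {"port": 9000}}')

merged = merge(defaults, user)
print(json.dumps(merged, indent=2))
```

A different policy, such as appending to arrays instead of replacing them, would be another branch in the same loop.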
I find the files that actually use Cue’s features to look foreign to me.
Indeed! It’s the only one that has a mental model that actually makes sense for configuration…but that mental model is quite different from templating. Once I got my head around it, though, it is one of those cases where you look back and say, “Why isn’t this how all such systems work?”
I personally find yes and no more readable than true and false in pretty much every situation, and find it sad that more programming/config languages don’t use them. For example, when you have something like {enable = true}, what’s “true” about enabling something?
JSON itself isn’t really suitable for configuration files IMO.
Of course there’s the problem of comments, that the author mentions, but there’s more:
you can’t have trailing commas, which means you often modify other lines when you add or remove one
wrapping every key in quotes is painful and unnatural
wrapping the whole file in braces adds an ugly and unnatural level of indentation
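The comment and trailing-comma points are easy to check with a strict parser. A small sketch using Python's standard json module (the parses helper is just for illustration):

```python
import json


def parses(text):
    """Return True if the strict standard-library parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


plain = '{"a": 1, "b": 2}'
with_trailing_comma = '{"a": 1, "b": 2,}'
with_comment = '{"a": 1 /* why is this 1? */}'

print(parses(plain))
print(parses(with_trailing_comma))
print(parses(with_comment))
```

Only the first document is accepted, so adding a key at the end of an object always means touching the previous line too.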
My preferred format for configuration files, which I use in several of my (Rust or JS) programs, is Hjson. Unlike when I was using YAML or TOML, I’ve never had a user confused about why their configuration file wasn’t working.
People have been complaining about YAML for years. I personally don’t shoot myself with the footguns enough to think about it much, but when articles like this come around, I wonder why there isn’t more movement toward TOML. Seems like it has some missing features from JSON without the footguns of YAML, but maybe the grass is just greener?
Caddy works like this: https://caddyserver.com/docs/config-adapters
Nice to see Nix being mentioned :)
Did you know about KDL? https://kdl.dev/ It’s basically a nicer XML but only for data (not markup).
Except JSON’s spec is horribly under-defined and wonky as well.
There’s a misunderstanding about YAML’s complexity that comes up often, and it’s in the first paragraph here. YAML aimed to be a friendly alternative to XML. It’s not 1:1 with XML, but it kinda wanted to support at least as much.
JSON didn’t quite exist yet, though I recognize that later versions of YAML have tried to do a “superset of JSON” thing, but I’ve basically only ever heard the creators mention that at all. YAML was aiming for highly expressive, readable, and round-trip native serialization, so it has this broad set of features and misfeatures. You can easily represent a memory cycle with YAML pointers; you can use event-based parsing and have several distinct documents in a stream and that’s part of the core spec.
JSON is pleasantly minimal in comparison but it never wanted to support all of that. But no one looked at the tiny JSON spec, decided it didn’t have enough multiline string options or obscure hash-in-array-or-was-it-array-in-hash whitespace quirks, and sketched up YAML over it. And also, outside of core JSON, it’s grown competing specs for chained documents (json-seq, json-lines), or standards to allow comments, or commas, or to tag native data types for round-trip serialization. We couldn’t leave well enough alone! It could’ve been so simple…
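As an illustration of how thin those bolt-on conventions are, here is a sketch of the JSON Lines approach to several documents in one stream: one complete JSON value per line. The read_json_lines helper and the sample records are made up, and json-seq uses a different framing that this does not cover.

```python
import json


def read_json_lines(text):
    """Parse one JSON document per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Two independent documents in a single stream.
stream = '{"kind": "Service", "name": "web"}\n{"kind": "Deployment", "name": "web"}\n'

docs = read_json_lines(stream)
print(len(docs))
print(docs[1]["kind"])
```

YAML gets the same effect from --- separators inside the core spec, which is the trade-off being described: JSON stays tiny, and the extras live in side conventions.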
Both are terrible for config files.
My observations on this:
I’d be happy if UCL became more popular, or a derivative of it like HCL.