1. 74
  1.  

  2. 36

    Our UI was showing a lot of activity in the tiny island nation of Niue. We didn’t expect to see a lot of information security incidents involving a country of fewer than 2000 people, so we suspected a bug.

    Turns out the UI was looking up locations by the two letter ISO country code. The first two characters of NULL, converted to a string, is “NU”…the ISO country code for Niue.
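
    A minimal Python sketch of how this kind of bug can arise, assuming the missing value is stringified as "NULL" before being truncated to two characters. The lookup table and both function names are made up for illustration and are not the actual UI code:

    ```python
    # Hypothetical lookup table; only a few ISO 3166-1 alpha-2 codes are listed.
    ISO_NAMES = {"NU": "Niue", "SE": "Sweden", "NL": "Netherlands"}


    def buggy_country(raw):
        # Assumption: a missing value is stringified as "NULL" before truncation.
        text = "NULL" if raw is None else str(raw)
        return ISO_NAMES.get(text[:2].upper(), "unknown")


    def safer_country(raw):
        # Reject missing values and anything that is not exactly two letters.
        if raw is None:
            return "unknown"
        code = str(raw).strip().upper()
        if len(code) != 2 or not code.isalpha():
            return "unknown"
        return ISO_NAMES.get(code, "unknown")


    print(buggy_country(None))  # Niue
    print(safer_country(None))  # unknown
    ```

    Checking for the missing value before any string conversion is what keeps the placeholder text from ever reaching the lookup.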

    1. 12

      Fun fact, the Niue TLD is very popular in Sweden, because “nu” means “now” in Swedish. So you get snappy domain names like vecka.nu, which is a single-serve website that displays the current ISO week number.

      1. 4

        In Yiddish, “Nu” means “so?”

        1. 2

          Interesting, “nu” also means “now” in Dutch, but the Niue TLD is not popular at all here. Apparently you need some domains to bootstrap the popularity.

          1. 3

            I think that is because it became known when a whole bunch of cheap second-rate brands and stores started to use .nu as a substitute for .nl, and now it feels like a TLD for fake and low-quality content.

          2. 1

            Serendipitously, I was looking for something that could give me the current ISO week number at a glance rather than searching “current week of year” in Google. Bookmarked this for just those cases.

            1. 1

              You can use date +"%GW%V" for ISO weeks (%G is the ISO week-numbering year, which can differ from %Y around New Year), date +"%YW%U" for US weeks (1st day of week is Sunday).

              BTW next week will be week 53 as NYE falls on a Thursday.

              I think I filed a bug at vecka.nu years and years ago where they missed that corner case…
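
              For anyone who wants to check the week-53 corner case without a shell, here is a small Python sketch using the standard library. The helper name is made up, and 2020 is only an assumed example of a year in which 31 December is a Thursday:

              ```python
              from datetime import date


              def iso_week_label(day):
                  # isocalendar() returns (ISO year, ISO week, ISO weekday).
                  iso_year, iso_week, _ = day.isocalendar()
                  return f"{iso_year}W{iso_week:02d}"


              # Assumption: 2020 stands in for a year where 31 December is a Thursday.
              print(iso_week_label(date(2020, 12, 31)))  # 2020W53
              # 1 January 2021 still belongs to ISO week 53 of 2020.
              print(iso_week_label(date(2021, 1, 1)))    # 2020W53
              ```

              The second line is the part that is easy to miss: the ISO year and the calendar year disagree for a few days around New Year.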

        2. 21

          Ah, YAML. Convenience shortcuts are seldom convenient, and the problems they lead to can be anything but short.

          1. 5

            Most of them were removed in YAML 1.2 (2009); but most YAML implementations recognize them anyway, probably for backward compatibility:

            Ruby 2.7:

            > YAML.load "{'norway?': no}"
            => {"norway?"=>false}
            

            Python 3.9

            > yaml.safe_load("{'norway?': no}")
            {'norway?': False}
            

            Perhaps there are options to turn this off – I didn’t check.
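
            To make the difference between the two spec versions concrete, here is a rough Python sketch of boolean resolution for plain scalars. It is not a YAML parser and not how any particular library is implemented; the two patterns follow the YAML 1.1 bool type and the YAML 1.2 core schema, and the function name is made up:

            ```python
            import re

            # Boolean spellings from the YAML 1.1 bool type (includes yes/no/on/off).
            YAML_1_1_BOOL = re.compile(
                r"y|Y|yes|Yes|YES|n|N|no|No|NO|true|True|TRUE|false|False|FALSE"
                r"|on|On|ON|off|Off|OFF"
            )
            # The YAML 1.2 core schema only resolves true/false spellings.
            YAML_1_2_BOOL = re.compile(r"true|True|TRUE|false|False|FALSE")

            TRUTHY = {"y", "yes", "true", "on"}


            def resolve_plain_scalar(text, pattern):
                # Simplified: only booleans are resolved; everything else stays a string.
                if pattern.fullmatch(text):
                    return text.lower() in TRUTHY
                return text


            print(resolve_plain_scalar("no", YAML_1_1_BOOL))  # False
            print(resolve_plain_scalar("no", YAML_1_2_BOOL))  # no
            ```

            Under the 1.2 rules the unquoted country code survives as a string, which is why the choice of schema matters more than the choice of library.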

          2. 9

            Now that both XML and YAML are frowned upon as configuration formats, are TOML or Starlark acceptable choices?

            1. 9

              TOML and HCL

              1. 10

                Just. Use. Json.

                1. 42

                  JSON doesn’t support comments though, and generally isn’t especially easy to deal with manually IMO. There are some variants which improve on this, but if you’re going to use a “kinda like JSON but not really”-format then you might as well use something explicitly designed to be a configuration format.

                  1. 7

                    JSON supports strings, therefore JSON supports comments.

                    I just use fieldName_ for a comment string. All field names in my config are camel case so if there’s an underscore it’s a comment. People are thinking too much about this. It’s standard to use camel case in js so if it’s not that, then it’s something else.
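
                    As a sketch of what a loader for that convention could look like in Python (the function name and the sample key are hypothetical, and the only rule assumed is "a key ending in an underscore is a comment"):

                    ```python
                    import json


                    def strip_comment_fields(node):
                        # Assumption: any key ending in "_" is a comment and can be dropped.
                        if isinstance(node, dict):
                            return {
                                key: strip_comment_fields(value)
                                for key, value in node.items()
                                if not key.endswith("_")
                            }
                        if isinstance(node, list):
                            return [strip_comment_fields(item) for item in node]
                        return node


                    raw = '{"retryCount_": "how many times to retry", "retryCount": 3}'
                    print(strip_comment_fields(json.loads(raw)))  # {'retryCount': 3}
                    ```

                    Dropping the comment keys right after parsing means the rest of the code never sees them.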

                    1. 16

                      “People are thinking too much about this. It’s standard to use camel case in js so if it’s not that, then it’s something else.”

                      JSON isn’t JavaScript. In other languages, which also consume JSON, it is generally idiomatic to use underscores and not camel case.

                      Additionally, having extra random “comment” fields in your objects is unsightly, and increasingly difficult to read for humans – and nigh on impossible to then do proper strict schema checks (to avoid misspelt properties, say) on the result.

                      1. 5

                        Are these not for configs? What does it matter if comment fields are in config objects? Do you not have a syntax highlighter/prettifier? Can you not just make validators for config objects?

                        If you’re dealing with JavaScript Object Notation I don’t see why you aren’t using camel case for that portion of your code and even if there were a good reason, pick a delimiter or suffix that works in your case.

                        This is starting to sound like people being disorganized rather than tool failure. If so much structure is needed for your projects then why not just use protobufs? They support comments in the traditional sense and go everywhere.

                        1. 5

                          Camelcase isn’t mandatory in JS and while many of the codebases/libraries are camelcased (especially now after prettier), I have worked on many codebases with underscore delimiters. Pretty common in organizations where existing products have been written in non-JS languages.

                      2. 10

                        Good heavens, that’s just ugly and unreadable.

                        1. 10

                          The JSON specification does not guarantee the order of keys, so generating or modifying configs with software can lead to annoying situations where those “docstrings” and values are not next to each other. Depends on the platform though, obviously.

                          1. 3

                            I do this too. But then if I have type checking on JSONs, I need to update the grammars there to support those comments. But it’s a decent workaround for the current situation.

                          2. 3

                            That’s true.

                            An approach I had been using is relying on a config language (e.g. HOCON [1]) plus an embeddable ‘interpreter’ for that config language (e.g. Lightbend/Config [2]).

                            When selecting the above combo, I was looking to minimize repetition of config values and to be able to use the same config file from multiple languages.

                            The above translated into needs for hierarchical inheritance, support for ‘truthy’ values, support for overrides (by environment variables or by extra ‘includes’), and support for arrays, plus ‘embeddability’ in a variety of common programming languages (e.g. C++, Java, Python, C#, JS).

                            HOCON also has a syntax highlighter for JetBrains, which makes working on larger configs easier.

                            It certainly lacks the sophistication of Dhall, with its incredible ability to show semantic diffs.

                            [1] https://en.wikipedia.org/wiki/HOCON [2] https://github.com/lightbend/config

                          3. 24

                            As a human readable configuration format? No thank you.

                            1. 13

                              JSON is not great for long strings that contain newlines, though, like PEM certificates.

                              1. 3

                                JSON supports an arbitrary set of data structures. It’s a good choice of ~8, but arbitrary.

                                It does not support enums. It does not support sets. It does not support maps….

                                So what ends up happening is you end up needing to define a grammar anyway (and if you don’t you’re setting yourself up for pain), so at that point, why use JSON?

                                Now of course, for pragmatic practical reasons I wouldn’t tell anyone to stop using JSON today; you’d be fired. But in 5-10 years I wouldn’t be caught dead using JSON.

                                1. 15

                                  It’s also, depending on the interpreter, good for silently mangling integers that exceed 53 bits in size.
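
                                  A quick Python sketch of the effect. Python's own json module keeps integers exact, so parse_int=float is used here only to imitate an interpreter that stores every number as a double:

                                  ```python
                                  import json

                                  big = "9007199254740993"  # 2**53 + 1

                                  # Python's json module keeps integers exact by default.
                                  exact = json.loads(big)

                                  # parse_int=float imitates an interpreter that stores every number as a double.
                                  mangled = json.loads(big, parse_int=float)

                                  print(exact)         # 9007199254740993
                                  print(int(mangled))  # 9007199254740992
                                  ```

                                  No error is raised in the second case, which is what makes the mangling silent.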

                                2. 3

                                  Use edn. Easier to write than json, no commas bogging you down, and I believe you can comment.

                                  1. 1

                                    Edn is nice. It does have comments. No multidimensional array syntax, but otherwise seems okay.

                                  2. 3

                                    JSON doesn’t encode numbers well. By specification they cannot be NaN or Infinity and in many implementations integers will be silently mangled if their magnitude is greater than about 2^53.
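
                                    The NaN and Infinity part can be seen with Python's standard json module, which by default emits tokens that the specification does not allow and only rejects them when asked to. A small sketch:

                                    ```python
                                    import json
                                    import math

                                    # Python's encoder emits the non-standard token NaN unless told otherwise.
                                    print(json.dumps(math.nan))  # NaN

                                    # allow_nan=False enforces the specification and raises ValueError instead.
                                    try:
                                        json.dumps(math.inf, allow_nan=False)
                                        strict_rejected = False
                                    except ValueError:
                                        strict_rejected = True

                                    print(strict_rejected)  # True
                                    ```

                                    Whether the other side accepts the lenient output depends on its parser, so the strict mode is the safer default for interchange.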

                                    1. 1

                                      Personally, I prefer Lua for configuration files.

                                      1. 1

                                        I’m with you. JSON’s simple pervasiveness and no-surprises syntax still outweigh the negatives others are pointing to.

                                      2. 3

                                        Amazon’s Ion is a superset of JSON (including comments) and has great library support. http://amzn.github.io/ion-docs

                                        1. 3

                                          I think it’s worthwhile to differentiate between configuration languages and other serialization formats or in other words to really look at what use cases they were designed for/how they were born.

                                          I think right now TOML and UCL (and HCL) are great, proven, well-designed formats for classical configuration needs.

                                          YAML was designed to be kinda close to something a human, even a non-technical one, would write (in an email). It absolutely makes sense for things like metadata for static sites. It’s like a natural language, so like them it has multiple ways of saying no, just like your chatbot would have.

                                          But as with everything, this comes with a price. I don’t think it’s so much about X vs Y for absolutely everything. I think with YAML it’s more a development of “hey I know YAML and X also uses YAML”. I think it’s in good faith of course, but I think we too rarely take a step back to really make a choice based on what the goal is. For example I am a fan of the development of using things like protobuf, but there are certainly a lot of cases where JSON is a way better option, and neither makes a good candidate for configuration. But if there’s always a UI in front of it (think a game), or you need maximum portability and have just a handful of options (name, color, icon or something), I can totally imagine JSON to be something to consider.

                                          I think YAML also suffers from the fact that it originally comes from a time when the largest player was XML and the scripting languages wanted to have an alternative to it. Once it existed, using it for simple metadata-style content made sense, and just like with JSON, people ended up using it to store configuration even though that did not always make sense.

                                          And now it certainly is overused. While I really hope that HCL/UCL and TOML pick up and become the standard way of configuring software, I very much hope we don’t end up trying to make them silver bullets and use them for stuff they were not designed for, or turn them into huge standards for every use case people can come up with, because that’s when people feel the need to create something new and simpler again, restarting the cycle.

                                          1. 2

                                            TOML is great as an init file but an XML replacement it does not make.

                                            JSON is really a betamax vs VHS kind of situation… we’ve had S-expressions for a long time now and they are the clear winner in every way.

                                            You can make convenient denotations for different datastructures by putting a symbol in front of the opening parens like #(this is a vector).

                                            Semicolon marks the start of a comment, you could do a commented block with ;(this is a block comment)

                                            Finally, and most importantly, there are canonical S-expressions that can have binary data and be streamed over networks. Furthermore it’s relatively simple to have a standard projection from any S-expression to a canonical one. Using the vector from before as input we get: (6:vector4:this2:is1:a6:vector) as the canonical representation. This we can sign and hash and work with in various ways.
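
                                            The projection to canonical form described above fits in a few lines. This is a Python sketch, not a full implementation: it assumes the #(...) vector is modelled as a list tagged with the atom "vector", and the function name is made up:

                                            ```python
                                            def canonical(expr):
                                                # Atoms are encoded as <byte length>:<bytes>; lists are wrapped in parentheses.
                                                if isinstance(expr, (list, tuple)):
                                                    return b"(" + b"".join(canonical(item) for item in expr) + b")"
                                                data = expr if isinstance(expr, bytes) else str(expr).encode("utf-8")
                                                return str(len(data)).encode("ascii") + b":" + data


                                            # Assumption: the #(...) vector is modelled as a list tagged with the atom "vector".
                                            print(canonical(["vector", "this", "is", "a", "vector"]))
                                            # b'(6:vector4:this2:is1:a6:vector)'
                                            ```

                                            Because every atom carries its own length prefix, arbitrary binary data needs no escaping and the output is byte-for-byte stable, which is what makes it suitable for hashing and signing.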

                                            What more do you want? Schemas? Sure we can do those.. just as was done for XML (which is just violence when you contrast it with sexps..)

                                            Edn is not good enough in my opinion. It’d be much better to have just pure sexps.

                                            1. 1

                                              I like the idea of S-expressions, they are very nice in theory. In practice, they have two problems:

                                              • Users need quite a bit more training to edit S-expressions than INI or TOML files.
                                              • It’s not manager-friendly to introduce “sex-pressions”.
                                            2. 2

                                              Use Tree Notation. You get zero types (just strings, which are arranged in a grid so editable by any text editor OR spreadsheet). Then you just define what types you need on top of that.

                                              Don’t need booleans? Don’t add them. Don’t need floats? Don’t add them. etc.

                                              Need sets? Add them. Need maps? Throw them in your cart.

                                              It’s the “a la carte” of notations.

                                              1. 4

                                                In your own words, tree notation is a “half-baked” idea. I like the idea, but whenever I stick it in the oven it sprouts just as much complexity as any other configuration / markup / data representation language.

                                                1. 1

                                                  Totally. And boy were there some bad sprouts that needed to be weeded out. Some of this year’s harvest is quite delicious though.

                                                  I hope someday soon it will “just work”.

                                                  The base notation seems mostly settled (at least in 2 dimensions, though there are still some interesting experiments with multi-headed parsers), but best practices in building higher-level Tree Languages are still evolving at a decent clip.

                                              2. 2

                                                Jsonnet?

                                                1. 2

                                                  I find Dhall a good balance between expressive power and generating simple structures. And you get types, too.

                                                  1. 2

                                                    I agree on toml because unlike json it is not only human readable but also writable.

                                                    I use the filesystem, at one file per key, and dirs for organizing, which makes it easier to edit even with basic tools, and also to diff against defaults.
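
                                                    A rough Python sketch of reading such a layout back into a nested dict, assuming every file holds a single text value and directory names act as sections (the function name is made up):

                                                    ```python
                                                    from pathlib import Path


                                                    def load_config_dir(root):
                                                        # Each file is one key; each subdirectory becomes a nested dict.
                                                        config = {}
                                                        for entry in sorted(Path(root).iterdir()):
                                                            if entry.is_dir():
                                                                config[entry.name] = load_config_dir(entry)
                                                            else:
                                                                config[entry.name] = entry.read_text().strip()
                                                        return config
                                                    ```

                                                    All values come back as strings in this sketch; any typing would have to be layered on top.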

                                                    1. 1

                                                      Has anyone played with Nickel?

                                                    2. 5

                                                      I also ran into the on == “true” problem in .travis.yml

                                                      https://news.ycombinator.com/item?id=25483562

                                                      Can’t believe we actually use this…

                                                      1. 4

                                                        What surprises me is that this wasn’t tested beforehand, instead of being discovered through lost revenue. That looks like an obvious case to catch during testing before rolling it out to the world.

                                                        1. 10

                                                          I don’t know how big the company is and how much revenue they lost, but in my opinion there always is a point where you have to call testing done, because more testing would not be reasonable anymore (regarding effort put in vs. risk minimized).

                                                          I assume that they tested the new feature “countries configured inside YAML” in general and it worked fine for all countries in the testing phase. Then they rolled it out to the first few countries, all fine again. And then they gradually extended it to all countries and at Norway it went bang.

                                                          If that would have meant the end of the company, they would have done something wrong. But if it’s a risk the company can handle, then with their limited knowledge before the bang they did everything right. They used their manpower to solve more important issues and had to pay a small fee for that (lost revenue in Norway).

                                                        2. 2

                                                          Makes sense to always use quotes for strings in YAML. After all, sometimes you want to represent the text “9.3” and sometimes the value 9.3 and sometimes the text “False” and sometimes the value False.

                                                          Lots of YAML formatters will strip the quotes from strings for you, which is a bit annoying, unfortunately.

                                                          1. 2

                                                            This is why I’m a big fan of using configuration formats with types. In practice for me, that’s meant protos, but anything where you can trivially typecheck in CI is a big win.

                                                            1. 1

                                                              I was bitten by this recently in our company’s product, and it led to a 5XX NullPointerException rather than the 4XX ConfigValidationException it should have been. It’s surprisingly difficult to fix too. The YAML parser we use supports turning off the implicit boolean conversions, but we rely on it in other cases to correctly handle our customers’ YAML. I wish it had an option to turn off implicit boolean conversion just for the keys in a map :-)
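
                                                              One possible workaround, sketched in Python rather than the product's actual language: leave the parser alone and walk the parsed result, rejecting any map key that is not a string. ConfigValidationError and check_string_keys are hypothetical names, standing in for whatever the 4XX path would be:

                                                              ```python
                                                              class ConfigValidationError(ValueError):
                                                                  """Hypothetical error type standing in for a 4XX validation failure."""


                                                              def check_string_keys(node, path="config"):
                                                                  # Walk the parsed document and reject any mapping key that is not a string,
                                                                  # e.g. a key like `no` or `on` that the parser turned into a boolean.
                                                                  if isinstance(node, dict):
                                                                      for key, value in node.items():
                                                                          if not isinstance(key, str):
                                                                              raise ConfigValidationError(f"{path}: non-string key {key!r}")
                                                                          check_string_keys(value, f"{path}.{key}")
                                                                  elif isinstance(node, list):
                                                                      for index, item in enumerate(node):
                                                                          check_string_keys(item, f"{path}[{index}]")
                                                              ```

                                                              This keeps implicit booleans for values while turning a converted key into an explicit validation error instead of a null lookup later on.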