1. 51
    1. 35

      My favorite approach for complex configs is to write them in a “real” programming language, but serialize them to something like yaml or json before importing into the running program. This gives you a lot of power when you’re writing your config, but it decouples the configuration language from your running application, and preserves the advantages of distributing a static config file. So, for example, config source in python to generated yaml which is imported by your Go binary (or whatever).
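      A minimal sketch of that pipeline (the keys and values here are invented for illustration): the config is ordinary Python, but only its serialized output ever reaches the running service.

      ```python
      import json

      # The "real language" side: loops, conditionals, no copy-paste.
      ENVIRONMENTS = ["prod", "staging"]

      config = {
          env: {
              "replicas": 10 if env == "prod" else 2,
              "log_level": "WARN" if env == "prod" else "DEBUG",
          }
          for env in ENVIRONMENTS
      }

      # The distribution side: a static artifact. The Go binary (or
      # whatever) only ever parses this JSON and never sees the Python above.
      generated = json.dumps(config, indent=2, sort_keys=True)
      print(generated)
      ```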

      (This is part of the approach used in Configerator at Facebook, but I’ve used it elsewhere with a lot of success as well.)

      Nothing is perfect, of course: this approach can lead to config schema management challenges, as well as just the extra friction of using a config generator. But I like it a lot better than either importing running code into a live app in production, or trying to hack “loops” in yaml or other nonsense.

      1. 6

        I agree. It also gives you a nice inspection point for debugging. You can always look at the static configs. It’s a very powerful pattern.

      2. 2

        Interesting! It seems like AWS is following the same pattern with their newest cloud-config efforts (after several other mechanisms).

        Running the code vs. the extra friction might be a trade-off one should be able to choose?

      3. 1

        In that vision, configuration languages are like a protocol for configuring servers: there is rarely anything dynamic in the protocol itself.

    2. 19

      I think the principle of least power might apply here. Do you really want a full language for this? Seems ripe for abuse.

      1. 11

        I think it depends on the domain. For ops work, it’s quite difficult to come up with a language that is simple enough while still being powerful enough.

        I’ve chosen Python as the config format for a Docker template product I sell. Here’s my reasoning:

        You can say, ok, we’ll just have them use JSON. But now they’re writing code to generate the JSON, so we’re no better off.

        I could do something more powerful than JSON, but less powerful than Python. But—users might want to set Docker labels based on Git. So, great, I can hard-code that as a flag, but—what if they use Hg? What if they use Perforce? And that’s just one detail out of a thousand.

        So it’s much easier to say “Here’s a language you can mostly just use to construct dictionaries, just like JSON, but if you need it, you can do whatever”. And in my case it’s a template for Python applications, so I can assume Python is there.
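        A hedged sketch of what that can look like in practice (the keys and the helper function are invented for illustration, not taken from my actual product):

        ```python
        # config.py -- mostly just dictionary construction, like JSON...
        import subprocess

        def current_branch() -> str:
            # Git here; an Hg or Perforce user could swap in their own command.
            try:
                out = subprocess.run(
                    ["git", "rev-parse", "--abbrev-ref", "HEAD"],
                    capture_output=True, text=True, check=True,
                ).stdout.strip()
                return out or "unknown"
            except (OSError, subprocess.CalledProcessError):
                return "unknown"

        # ...but with the escape hatch available when a value must be computed.
        CONFIG = {
            "image": "myapp",
            "labels": {"vcs.branch": current_branch()},
        }
        ```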

        A longer write-up on the pain of writing software for ops, and how I arrived at the decision to do what this article suggests: https://pythonspeed.com/articles/developing-tools-for-ops/

        1. 5

          You should check out Starlark. It looks like Python but it doesn’t let you do a lot of the crazy dynamic stuff Python can do and it doesn’t let you do I/O or have unbounded loops. It was built for exactly this purpose: https://go.starlark.net

          1. 2

            I can see places where that would be useful, but it doesn’t help with things like “call out to a git subprocess to get the current branch”. Sometimes you just need a full programming language.

            1. 1

              I think we’re talking about different use cases then. I have a particularly hard time imagining when calling out to git is appropriate for this sort of configuration. It seems like you would want to do that ahead of time and pass the result into your configuration script as a parameter. That said, this is a pretty broad space, so maybe there are valid use cases and I’m just not imagining them. Mostly I deal with CI/CD pipelines and infra-as-code.

              EDIT: By the way, I don’t think it’s inherently bad to use Python, but I do think you need to be disciplined and avoid doing I/O when you don’t need to.

              1. 1

                In that scenario the code that calls out to Git “ahead of time” is configuration from my perspective, because each organization writes its own variant.

      2. 6

        In my experience it’s nearly always a mistake, and only ever viable long-term if the configuration “program” is completely dead before the configured program starts to run. Once you can have a thread moving in and out of “configuration” code at run-time, you’re lost.

      3. 4

        Agreed. Turing complete configs and config generators have often been my personal hell. If you need that much power in your configs, something has probably gone very wrong in the design phase.

        1. 6

          I think Turing completeness is a red herring, since even with a language like Dhall you can write something that will finish evaluation after you’re dead.

          1. 3

            I think the real factor is I/O. People will try to call out to the network or do other stateful things that break in weird ways.

        2. 2

          The problem isn’t Turing complete configs - they’re inevitable as you try to capture all the levels of exceptions your environments have and how those interact.

          The problem is accidentally Turing complete configs, as a language that organically grows the power needed to do non-trivial configuration tends not to grow it in ways that are best designed in a formal sense.

          1. 5

            > The problem isn’t Turing complete configs - they’re inevitable as you try to capture all the levels of exceptions your environments have and how those interact.

            Yes, and the goal should be to start normalizing environments, rather than shoving more and more into an increasingly complex configuration. This is precisely what I meant by something going wrong in the design phase.

            1. 2

              It’s very rare that that’s possible. There’s production vs development vs test vs developer’s machine, each cluster has a different capacity in accordance with regional needs and redundancy requirements, some clusters are on the older hardware/OS/BIOS firmware, you’re canarying some features only in some clusters, you’re in the middle of rolling out a new software version which has different resource requirements, and some clusters don’t have some subcomponents as legal doesn’t want you running that subcomponent in that jurisdiction.

              Real life is messy, and configuration has to be able to capture it. No matter how much you work on making things consistent, there’s always going to be new variances coming up.

              1. 3

                Yes, and given the time spent on this, I regularly find the investment in simplifying the environment is not that different from going all the way on configuration, and pays dividends in system maintainability down the line.

                1. 2

                  The environment is already as simple as it reasonably can be, this is irreducible complexity. You can’t exactly tell people to stop buying new hardware, stop traffic growth, stop developing features, or stop doing releases.

                  The question then is how do you deal with managing that environment. That is the true problem of configuration/change management.

                  1. 1

                    I’m not sure how you can say that with a straight face.

                    However, if you really want to go that way, you can do even better than turing-complete configuration languages. Just hard code things. That means that you get full visibility into your setup with your monitoring tools, can step into it with your debugger, get well integrated logging, get full checking with static analyzers (which will even allow you to catch configuration bugs at compile time), and so on.

                    Because you’ve exposed your current configuration to your linter or static type checker, you’ve actually increased the confidence you can have in your changes.

                    And then, rolling out a new configuration is as simple as rolling out a new binary.
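                    For concreteness, a sketch of hard-coded config in Python (the dataclass and field names are invented): because it is ordinary code, the type checker sees it like everything else.

                    ```python
                    from dataclasses import dataclass

                    @dataclass(frozen=True)
                    class ServiceConfig:
                        threadpool_size: int
                        feature_x_enabled: bool

                    # A typo'd field name or a string where an int belongs is caught
                    # by mypy / at construction time, not at runtime in production.
                    PROD = ServiceConfig(threadpool_size=64, feature_x_enabled=True)
                    STAGING = ServiceConfig(threadpool_size=8, feature_x_enabled=False)
                    ```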

                    1. 2

                      I think we’re talking about vastly different systems here in terms of scale and complexity.

                      1. 1

                        I’ve worked on Google’s display ads infrastructure.

                        So, why is “write your config in a real programming language” reasonable, but “and do it in the same language, statically analyzed, tested with the code it’s changing the behavior of, deployed using the same continuous rollout pipeline, and integrated with the rest of your service’s infrastructure” a bridge too far?

                        If your objection is incremental rollouts – keep in mind that you can already have different behavior per node by filtering by cluster, hashing the node ID and checking a threshold - which you’d probably be doing in a turing complete config anyways. (Just, if it’s like most places I worked, with fewer unit tests).
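                        That hash-and-threshold trick can be sketched as (names are illustrative):

                        ```python
                        import hashlib

                        def feature_enabled(node_id: str, rollout_percent: int) -> bool:
                            # Hash the node ID into a stable bucket in [0, 65536) and
                            # compare against the rollout threshold; deterministic per
                            # node, so repeated checks on the same node always agree.
                            digest = hashlib.sha256(node_id.encode()).digest()
                            bucket = int.from_bytes(digest[:2], "big")
                            return bucket < rollout_percent * 65536 // 100
                        ```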

                        If your objection is observability, you’re already exporting all service state via /flagz or similar, right?

                        1. 1

                          > I’ve worked on Google’s display ads infrastructure.

                          I was an SRE for those systems.

                          > why is “write your config in a real programming language” reasonable

                          I never said it was reasonable, and my experience is actually that imperative languages don’t work too well once you get into the really complex use cases. You want declarative with inheritance, to capture all the nested levels of exceptions within a given configuration. Even then you need to factor it carefully, and do most of the other code health stuff you would with “real” programming.

                          > Because you’ve exposed your current configuration to your linter or static type checker, you’ve actually increased the confidence you can have in your changes.

                          I’m not seeing how any of this has anything to do with static analysis, nor how it makes a difference either way in terms of monitoring, debugging, or logging.

                          Confidence comes from seeing that the settings I want to change ended up changed in the dry run of the config change. Whether it actually has the desired ultimate effect is a different question entirely.

                          > And then, rolling out a new configuration is as simple as rolling out a new binary.

                          That’s assuming that the cost of rolling out a new binary is effectively free. Building and running the automated tests for the binary could take rather a long time. That’s even worse if you want to quickly iterate on a setting on a single process before doing a broader rollout. Baking settings into binaries also doesn’t cover settings that can’t be baked into binaries, such as resource and scheduling requests.

                          1. 1

                            > That’s assuming that the cost of rolling out a new binary is effectively free. Building and running the automated tests for the binary could take rather a long time.

                            Sure, but config should also be running integration tests. Especially if it’s in a complex, turing complete language. I’d be rather surprised if config changes weren’t the source of comparable numbers of outages as code changes. Either via tickling a bug in existing code that shows up when config options are changed a certain way, or simply via being wrong.

                            And the more complex your configuration, and the more power you put into the config language, the closer it gets to being “just more code”, but with worse quality control.

                            So, if you go 90% of the way and give up on fighting config complexity, you may as well go the last few percent, and give it proper code treatment.

                            1. 1

                              You’re conflating two things there.

                              Firstly, how should you safely deploy a config change, presuming that it’s changing exactly the config it’s meant to change? Strategies there vary, and a code-like change (e.g. enable user-visible feature X) versus an operational change (e.g. increase the size of a threadpool) are best served by different types of testing. Having to wait for multi-day integration tests that involve multiple other humans for a threadpool tweak isn’t reasonable; on the other hand, looking only at CPU usage when you’ve enabled a new feature isn’t right either.

                              Secondly, how do you ensure that you’re only changing the config that you expect to change? You can get pretty far by diffing the raw output configs, as unlike standard programming we have a complete set of possible inputs (i.e. all the config files). So it is practical to do a mass diff and have a human glance over it on every change. Unit tests would likely hinder more than help here, as they’d either be overly specific and just create busy work on every change - or not specific enough and never catch anything.
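                              The mass-diff step is only a few lines (a sketch; the file name is illustrative):

                              ```python
                              import difflib

                              def config_diff(before: str, after: str, name: str) -> str:
                                  # Unified diff of one generated output file,
                                  # for a human to glance over.
                                  return "".join(difflib.unified_diff(
                                      before.splitlines(keepends=True),
                                      after.splitlines(keepends=True),
                                      fromfile=f"a/{name}", tofile=f"b/{name}",
                                  ))

                              delta = config_diff("replicas: 2\n", "replicas: 3\n", "prod.yaml")
                              ```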

                              1. 1

                                > Secondly how do you ensure that you’re only changing the config that you expect to change. You can get pretty far by diffing the raw output configs, as unlike standard programming we have a complete set of possible inputs (i.e. all the config files).

                                The proposal in the article was ‘import config’ as pure python code, and just execute it. There’s no raw output to diff in the author’s suggestion. And it can behave differently at runtime, inspecting the node that it runs on or the time of day, and doing different things based off that. If you’re doing that, I strongly think you’re going the wrong way. But if you’re going down that wrong path, take it all the way and integrate your “config” with the application fully.

      4. 4

        I don’t know how well you know Lua, but it’s a prime language for this kind of use. There are built-in features to create sandboxes, which basically allow you to run untrusted code in a very restrictive environment. You can specify exactly which functions the untrusted code has access to, and using the Lua C API you can tie that environment into useful parts of your larger app.

      5. 2

        I think it REALLY depends upon your use case. Is this config for an internal tool that’s low impact? Maybe. Do we have to worry about hostile or under-informed users? That changes things.

      6. 1

        The problem is for many domains, more dynamism really is necessary. However, you can restrict unbounded looping and various I/O by using an embedded scripting language that supports those features, such as Starlark. https://go.starlark.net

    3. 16

      This reminds me of the complexity clock article. With anything sufficiently complicated, you will probably need the power of a programming language. Once you’re at that point, you might as well use an established language for your configuration, instead of rolling your own.

      1. 2

        Oooh, what a nice thought-provoking article! That’s going in my bookmarks. Thanks for the link.

        > Once you’re at that point, you might as well use an established language for your configuration, instead of rolling your own.

        I also like the second part of their advice: “so go back to coding the config in your project’s main language, and invest in your build/deploy cycle to make config-changing [via deployment] trivial.”

      2. 2

        Whereas folks using a language like Lisp could do every approach equally, with tool support and lightning-fast deployments. That, or any language that makes building DSLs within the general-purpose language easy. If it’s strongly typed, then you get the type checks on the final output. An example would be the Ivory language embedded in Haskell.

        This still supports the author’s claim that the developers should use a better “build-test-deploy” cycle.

        1. 1

          My thinking is this clock is basically “we keep picking things that don’t allow for each part of the clock simultaneously”.

          You can always add a “we only call READ not EVAL when loading config” and have users figure out how they want to generate s-expressions if limiting dynamism is important.

          1. 1

            Sounds like accurate thinking.

      3. 1

        I’ve seen this anti-pattern in a few places, and I completely agree. All programming languages are imperfect, but all of the established languages are worlds better at expressing logic than any DSL or mess of XML, JSON, etc. you could ever dream up.

    4. 10

      My configurations are too much copy and paste, I need to switch to a programming language

      ( 2 years pass )

      These configuration files are ridiculous and impossible to analyse, we should switch to a declarative config

      1. 4

        (2 months pass) Oh, this declarative config is inflexible, we should template it with jinja

      2. 3

        Also the person who wrote their special snowflake of a configurator probably left the organization and didn’t document any of it. And it’s the most critical thing in the pipeline.

    5. 8

      Someone said a while ago something that’s stuck with me:

      Programmers think of computers as pets. They spend a lot of effort on custom things for individuals. Sysadmins think of computers as cattle and want to have uniformity.

      This post feels like something from someone firmly in the programmer mindset. Executable configs are great when you are configuring a thing. You can write something very concise, maintainable, and so on. They really suck when you are trying to orchestrate a fleet of a hundred machines, each running a different service with its own executable config language, and you need to push out changes. Building high-level tooling that inspects an imperative program is orders of magnitude harder than building high-level tooling that inspects a declarative description.

      Worse, if you do as the post suggests, you are baking in a specific choice of language. Python is pretty popular at the moment, but Perl, TCL, and Ruby were all equally fashionable over the last few decades. When Python goes out of style and is replaced by ShinyNewThing, how do you deal with that? If your program is written in Python, you have leaked an implementation detail into your config files. If the next version is written in something else, now you need to ship a Python interpreter and Python bindings to a bunch of objects just to parse the config file. If the default language for admins stops being Python, are you going to embed an alternative scripting language with Python bindings to support another language for configuration? Are you going to provide a transpiler to transform an arbitrary Python program (which, because it’s Python, will likely use a bunch of random packages that bind C/C++ libraries) into ShinyNewThing? Sounds difficult.

      TL;DR: I do want something executable to manage configuration, but I don’t want that built into every single app and I don’t want each program vendor to pick the language that I have to use.

      For individual programs, I can’t recommend the Universal Configuration Library (UCL) highly enough. It is a very fast parser for a superset of JSON that adds (among other things):

      • Includes, with overrides and priority (so you can have a defaults.d and a config.d and have every file in defaults.d included and then everything in config.d included, overriding the defaults)
      • Comments (including nested comments)
      • Numbers with human-friendly suffixes
      • Macros

      It has native C and C++ APIs, along with Lua and Python bindings. If you have UCL configurations, it’s trivial for someone to write Lua or Python scripts to parse / generate / modify them, and administrators can easily incorporate that into their orchestration framework of choice.
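      To give a flavour of the format (a rough sketch from memory; check the libucl documentation for the exact macro syntax):

      ```
      .include(duplicate=merge) "defaults.d/base.conf"  # include with overrides

      server {
          timeout = 30s;            /* human-friendly suffixes */
          max_request_size = 10mb;
          workers = 4;
      }
      ```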

      1. 1

        > For individual programs, I can’t recommend Universal Configuration Library highly enough. It is a very fast parser for a superset of JSON that adds (among other things):

        Thanks for bringing that up, I hadn’t heard of UCL! I would add that UCL adds a significantly more beautiful language than JSON.

    6. 6

      While json certainly sucks due to not having comments, I find I’m fine with pretty much anything else. Apache’s format is a bit annoying if I had to name one, but generally I can’t think of any non-json config that really sucks.

      1. [Comment removed by author]

    7. 5

      Dhall is designed to solve this problem. I’m glad to see it called out in the article. I think one hurdle it has to get over is having a Dhall interpreter available for each major language. This is harder than having a JSON parser, because it is a more complex spec.

      1. 2

        An option that can work in such circumstances is to use dhall-to-json, and use that to process Dhall configuration into JSON that the target software actually uses.

    8. 4

      In the toy example, why is exec used? Why not just from config import PEOPLE as people?

      1. 7

        Good question! Mostly the same, but I discovered a few differences:

        • your config file might have a different name (or a name that can’t be a Python module name). That can be worked around with importlib though.

        • your config might not be in your PYTHONPATH, then you’d have to mess with sys.path to be able to import it

        • the biggest difference is that you can pass a context to exec.

          In my toy example, imagine that the config was supposed to modify PEOPLE (so it could remove people from the list as well). You can achieve that with:

          from pathlib import Path
          from typing import List

          all_people: List = ...  # some code that loads people from the database
          globs = {'PEOPLE': all_people}  # execution context for the config
          exec(Path('config.py').read_text(), globs)
          # all_people is now modified in place by the config

          If you were to use import, you’d have to define callbacks to modify PEOPLE, which may be less aesthetically pleasing (more nesting).

          As a real example: the Jupyter config lets you modify the configuration as you want, e.g. append preprocessors. In the config.py file, you call get_config and get the configuration object. In order to bring get_config into scope, I’d imagine Jupyter does something similar and passes globs = {'get_config': get_config} to exec.

        So differences are fairly minor, but exec is a bit more flexible, so I prefer it now. However, for lazy and quick configuration files import config will do 95% of the job.
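        For comparison, the importlib workaround for an arbitrary config path looks roughly like this (the snippet writes a throwaway config file so it’s self-contained):

        ```python
        import importlib.util
        from pathlib import Path

        # Stand-in for the user's config file, wherever it lives on disk.
        Path("example_config.py").write_text("PEOPLE = ['alice', 'bob']\n")

        spec = importlib.util.spec_from_file_location("example_config", "example_config.py")
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)

        people = module.PEOPLE  # no sys.path manipulation needed
        ```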

        1. 2

          I used import to load a config file for an old Python project years ago, and the biggest pain points were the sys.path frobbing as you mentioned, reloading after changes (there are ways to do it, reload() in Python 2, importlib in Python 3, but there were some caveats with those, although I’ve since forgotten the details), and that it cluttered the FS with .pyc and __pycache__ stuff. Maybe there were other issues as well; this was years ago. I eventually switched it to exec().

          Here’s the config for that by the way: https://github.com/arp242/battray/blob/master/data/battrayrc.py; I think it worked out rather well. It’s not hard to see how you could do this with a purely declarative config file, but basically that’s just wrapping a bunch of ifs.

        2. 1

          Passing a context to exec would be an anti-pattern to me. Explicit is better than implicit. The configuration should explicitly import whatever it needs.

          Likewise, the PEOPLE variable is magical. If you make a typo, have fun debugging that. Instead you can import a register_people function and use it. Following that pattern, there is no need to create the namedtuple Person yourself. Import the data structure from the application and you can also use the methods from it.

          I spent a while maintaining a buildbot configuration. We even had a few “unit” tests for the configuration.
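          A sketch of that explicit style (names invented): the application exposes a registration function that the config imports and calls, so a typo becomes a loud NameError instead of a silently ignored variable.

          ```python
          # In the application:
          _people = []

          def register_people(*names):
              _people.extend(names)

          # What the config file would contain (inlined here for the example;
          # it would normally start with an explicit import of register_people):
          register_people("alice", "bob")
          ```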

          1. 2

            I don’t disagree – one is less nested, but implicit, another is more verbose, but explicit. Both can be type checked and linted (e.g. with Protocol), so I would say the differences are pretty minor.

            Yep, you’re completely right about importing People, that’s what I would do in my tools! I just wanted to keep the example simple.

            Something like the Jupyter config I mentioned feels like a nice compromise? The only ‘implicit’ thing is get_config, which is hopefully not that magical (and yeah, it can also be imported from the package as you suggested)

      2. 1

        Having made the same engineering choice (Python script for config, using exec), exec means you can just figure out the path to the code and be done with it. Imports require manipulating sys.path, which has the potential for breaking things if e.g. there’s another Python file in there with the wrong name.

    9. 4

      I found it interesting to learn that Lua actually kinda evolved from a config language (and an earlier scripting language), through an identical sentiment and experiences. As such, when I feel a simple .ini/.json/whatever file is not enough for me, I try to seriously consider just going straight to Lua, to avoid falling victim to Greenspun’s tenth rule. Sure, any language can work here, but I like to think Lua has config files in its DNA.

      That said, I think CUE may be an interesting technology to keep an eye on, though as of now it has proved too early-stage to be useful to me.

      1. 2

        I do use Lua for configuration and it’s quite nice. It’s easy to embed, trivial to sandbox, and easy enough to pull values out when needed. When I switched to using Lua to configure my blogging engine I was able to delete a whole bunch of code and replace it with:

        process = require "org.conman.process"
        process.limits.hard.cpu = "10m" -- no more than 10 minutes of CPU time
        process.limits.hard.core = 0 -- no core file
        process.limits.hard.data = "20m" -- no more than 20Meg of memory usage

        in the config file. The blogging engine runs as a CGI script, and while I think there’s a way to set such limits in the web server, I’m not sure (I never bothered looking). This way, I don’t have to worry about it.

        We also have a few programs at work that use Lua for configuration and ops hasn’t had one problem with it.

    10. 4

      Anyone knows examples of conservative static analysis tools that check for termination in general purpose languages?

      Isn’t it exactly the Halting Problem?

      For trivial examples (e.g., while(true)) grep can find that. The problem is exactly the non-trivial cases, and that is what I think the author is asking advice for.

      I still think Dhall is the best philosophical road in configuration files.

      1. 1

        Hey, I’m the author!

        Yep, if it weren’t for “conservative”, it would be the halting problem. “Conservative” means it’s allowed to reject a valid terminating program (because it’s not smart enough to figure out why it’s guaranteed to terminate). But if your program does pass the check, it’s 100% guaranteed to terminate.

        Some examples would be:

        • a simple example of a conservative subset of a language would be setting a timeout for a program. So if your program takes a millisecond longer than the timeout, it would be rejected as invalid

        • a more sophisticated example is a language that doesn’t support loops, but uses something like structural recursion

          Or you could just not allow recursion or control flow in the first place, which is what Dhall does.

        So what I meant by a ‘conservative subset’ was – you can restrict your existing language (i.e. Python) to a subset that is provably terminating. For example if you forbid using for/while loops, eval/exec, imports and standard library you might get a terminating subset of the language (of course you have to analyse it thoroughly). That way:

        • your config is guaranteed to terminate in finite time
        • you don’t have to reinvent the syntax for your config – you benefit from the existing tools for your programming language

        Not sure, maybe “conservative” is not the right word for that, but I think it’s what is usually used when referring to static analysis tools, e.g. here
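        A toy version of such a conservative check for Python (very far from a termination proof; it just rejects the obvious escape hatches, and the forbidden-node list is illustrative):

        ```python
        import ast

        # Reject loops, function definitions, lambdas and imports outright;
        # what remains is (mostly) straight-line code over literals.
        FORBIDDEN = (ast.For, ast.While, ast.FunctionDef, ast.AsyncFunctionDef,
                     ast.Lambda, ast.Import, ast.ImportFrom)

        def looks_terminating(source: str) -> bool:
            return not any(isinstance(node, FORBIDDEN)
                           for node in ast.walk(ast.parse(source)))
        ```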

        1. 1

          Dhall has control flow and recursion. Your own post addresses this so your comment could probably just omit the Dhall mention.

          I’ve considered making a config language out of PureScript which would disable recursion and import a custom content-addressable package set, which would expose only those functions that terminate (map, fold, etc). Rather than creating anything new, it would restrict what is already there and maintained. PureScript is better than Haskell at talking natively to JSON, with its record system, which is why I’d choose it.

          You could perform “modifications” by replacing/updating top-level definitions of normal forms. But only normal forms can be modified directly without worry, i.e. pretty much JSON. Anything else and you would have to wrap things with setters and adders, which obscures the structure of the config. But one could explore normalization tools to “inline” code with its results to recover normal form structure.

          I think something like this deserves more research.

          1. 1

            > Dhall has control flow and recursion

            Ah, I meant it doesn’t support general recursion. In my understanding it’s got folds/maps, but no for/while analogues which makes it total, right?

            > I’ve considered making a config language out of PureScript which would disable recursion and import a custom content-addressable package set.

            > Rather than creating anything new, it would restrict what is already there and maintained.

            Yep, that’s exactly what I had in mind by a “subset of a language”! Sounds really cool, did you get far with this?

            1. 1

              Sorry, wrong assumption on my part. It doesn’t have recursion, not even primitive recursion or structural recursion.

              I assumed it had some kind of iteration to achieve Ackermann; it achieves it instead with a Natural/fold which takes an arbitrary Natural for how many iterations to perform. That’s sadly enough to define a pernicious Ackermann-like function. I’m not sure that Gabriel was aware of this possibility when adding Natural/fold to Dhall; I’m sure he wouldn’t have added it if he was, as the main promise of Dhall is termination, which really comes with an implicit “in a short time” qualification, at least it did in my mind; otherwise you always have to add an “in n steps” or “in n milliseconds” sandbox to your configurations, which is unacceptable: what do you do when you hit this limit?

              1. 3

                No problem!

                This is exactly why I’m somewhat sceptical about termination as a necessary property, because termination in theory (i.e. in 100 years) means non-termination for all practical purposes. Ackermann is a contrived example, but once you have anything resembling functions (even without recursion!), you’re doomed anyway, e.g.

                function f1(x) { return concat(x     , x); }
                function f2(x) { return concat(f1(x), f1(x)); }
                function f3(x) { return concat(f2(x), f2(x)); }
                function f4(x) { return concat(f3(x), f3(x)); }
                function fN(x) { return concat(...)

                (not Dhall, but I believe this is expressible in Dhall)

                As N grows, this will take exponential time to evaluate (and also exponential memory), while the code itself grows linearly with N. Linear growth means it’s something that can be written and managed by a human, so here you go, you have something that doesn’t terminate in practice. Again, this is something contrived and unlikely to occur unless you’re being malicious, but if we’re discussing language design, that’s something to keep in mind.

                You’d get this blowup behaviour from jinja templates, or YAML, or anything that supports string interpolation or helper functions.
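                The same blowup can be sketched in a few lines of Python (the make_chain helper is mine, for illustration): N layers of “call the previous function twice” is linear in code size but exponential in evaluation work:

```python
# Build f1..fN where each layer concatenates two copies of the previous
# layer's output: N "lines" of definition, 2**N units of work to evaluate.
def make_chain(n):
    f = lambda x: x + x                                    # f1
    for _ in range(n - 1):
        f = (lambda prev: lambda x: prev(x) + prev(x))(f)  # f2, f3, ...
    return f

result = make_chain(10)("a")
# len(result) == 1024: a 10-layer chain already produces a kilobyte
# from one character; a 60-layer chain would never finish.
```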

    11. 3

      I know what OP is getting at but I don’t like this. Config formats are usually a lib; installing Python is something else. They say “and then you’re done”, but you aren’t done. Interpreted languages make you install a dev environment every time, unless you are sharing on the web (no local files). Given that Python’s easy_install / pip / pipenv / poetry / virtualenv / asdf / .tool-versions / unresolved PEP specs … is a jungle of opinions and bit-rotting usability, how is this done? Let me ask you this: how do you install Python? A cartoon fight cloud appears ;)

      Ruby has no better answer (although a smaller jungle), and JS has many options for installing a Node runtime of a particular version and managing libraries. It’s because of a hidden cost: interpreted languages ask you to set up a dev environment, to pretend to be a developer, every time. It’s why so many READMEs don’t even want to get into it. Don’t say Docker. :) You could say PyPy. :) But then that’s not a config file? I liked HCL the last time I used it as a library in a program.

    12. 3

      setuptools seems a weird example - the ini-style setup.cfg came long after the Python-code setup.py, and it’s made the packaging tools a lot easier to use. It also doesn’t let people write complex code in the file, and it avoids a lot of problems caused by dynamic code running when the configuration is read.

      One of the really big advantages of “plain configs” is that they generally mean the same thing on every machine they run on; and can be easily moved around without loss of meaning (usually). With my devops hat on, programming languages as config files are really painful - you can’t validate them without running all or part of the program, it’s difficult and error prone to write them out from another program (having to use a templating language on top of either data-serialisation languages or programming languages often means no validation is done at all).

      Programs-as-configuration makes a lot of things easier, but the main thing I care about in an approach to configuration is how easy it is to get wrong and cause problems.

    13. 3

      I am surprised there is no mention of 9P (from Plan 9), a protocol that allows you to easily have a programmable config which is language agnostic.

      1. 1

        Educating myself about Plan 9 (and related stuff) is something on my todo list :) Anything particular you recommend to read? The wikipedia article about 9P doesn’t make it too clear.

    14. 3

      Every time I’ve used something smart it bit me in some way. Usually not from configs that I wrote, just trying to understand them.

      I’ve come back to be 100% in support of TOML and good ol’ INI. It works 100% of the time until it doesn’t scale anymore, and in the rare cases where you need something more powerful, fine.
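      For what it’s worth, the “good ol’ INI” path really is a one-call parse in Python’s stdlib, which is much of its appeal:

```python
import configparser

# Plain INI: no logic, no surprises, trivially inspectable.
cfg = configparser.ConfigParser()
cfg.read_string("""
[server]
host = 127.0.0.1
port = 8080
""")

port = cfg.getint("server", "port")  # typed accessor does the conversion
```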

    15. 2

      My broad thoughts on this post are “I agree; this is the preferable approach”. That said, I have a few additional thoughts:

      • My employer’s cdk project is a great example of this philosophy. A user writes TypeScript/Python/etc which get compiled down to JSON that CloudFormation can execute. It’s probably my favorite product that AWS has come out with in ages because it’s so nice to use.
      • In the case of GitHub Actions, https://github.com/actions/toolkit seems like it could reduce the need to write the YAML mentioned in the post.
      • I wonder how many network services really need a configuration file. How many of the values provided by a configuration file can be inferred at runtime, be determined through defaults, or be provided via CLI arguments? Moving all constants into a configuration file, like I’ve previously seen be done, seems wrong to me.
      1. 1

        It depends upon the network service I think. I know I have one at work that only accepts command line arguments since there’s really only three options to configure—the IP address, the port and a reporting interval. But other services might not be so simple. Yes, a simple gopher server configuration might be able to get by with some hard coded values, or even via the command line, but a much more complex gopher installation would be silly to configure completely via command line arguments.
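        A service with only those three options fits comfortably in CLI flags; a quick sketch (flag names are mine, not the actual service’s):

```python
import argparse

# Three options is about the comfortable ceiling for CLI-only configuration.
parser = argparse.ArgumentParser(description="small reporting service")
parser.add_argument("--address", default="0.0.0.0")            # IP to bind
parser.add_argument("--port", type=int, default=7070)
parser.add_argument("--interval", type=float, default=60.0)    # seconds

args = parser.parse_args(["--port", "7171", "--interval", "30"])
# args.address == "0.0.0.0", args.port == 7171, args.interval == 30.0
```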

    16. 2

      I deployed Python “configs” at work to great success. Contenders were things like XML, JSON, TOML, YAML, etc. The Python config files seamlessly integrate into the rest of the Python software and have been a source of great productivity and stability. Initially, it was a quick and dirty implementation to avoid defining and parsing config files, but it turned out that in this instance, keeping it all in Python was superior in every way. The configs I’m talking about are very complicated and benefit from being implemented in a general-purpose language. I dynamically load a .py file whose name is given by input parameters to the program. This .py file needs to implement a function that fills certain objects with configuration values. This function is then called to populate the configuration. It’s a very simple design, but effective.
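      A minimal sketch of that load-and-call design (module and hook names are hypothetical, not the commenter’s actual code):

```python
import importlib.util

def load_config(path, settings):
    # Import the named .py file as a throwaway module...
    spec = importlib.util.spec_from_file_location("user_config", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    # ...then call the function it must implement to fill the settings object.
    module.configure(settings)
    return settings
```

A config file then just defines `def configure(settings): ...` and sets whatever it likes, with the full language available.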

    17. 2

      I’ve been using a Python stack on my desktop in which everything is Python-configurable – it has been an amazing experience!

      • Qutebrowser for web browsing – Python configs and callable scripts.
      • Qtile for the window manager – super-powered Python configs.
      • Xonsh for the shell – the whole shell config is in Python, including aliases etc. It allows some really complex stuff to happen in the shell very accessibly.
      • Ranger – a file browser where every config is Python.

      Python is brilliant for configs. The only downside is that exceptions will break everything, where sometimes it would be preferable for them to pass silently, so I often have to include a short wrapper function for some exception-prone calls.

      The Kitty terminal emulator, while being mostly Python, doesn’t use a .py file as its config, and that really sucks.

    18. 2

      In my experience:

      • you absolutely need a Turing-complete programming language, so you can manage and abstract the complexity of all the system components
      • there should be only one programming language in the stack, because it’s a lot easier to write a program to generate JSON or INI or TOML than it is to generate code in a different programming language
      • eventually, a standalone system with Turing-complete orchestration becomes just a single component of a larger system, and suddenly all the benefits of programmatic orchestration become drawbacks, as the meta-system needs to orchestrate this one

      This is why dedicated config-generator systems like Dhall, Jsonnet and Cue are so interesting to me: you get all the advantage of programmatic orchestration, but when your system becomes absorbed into a meta-system, all the actual components of the system are using simple, generatable config files instead of each one having its own interpreter.
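      The “generate simple configs programmatically” idea in miniature – plain Python emitting plain JSON (service names and fields invented for illustration):

```python
import json

def service(name, port, replicas=2):
    # One definition reused with variation: the loop/abstraction that
    # hand-written YAML or INI can't express.
    return {"name": name, "port": port, "replicas": replicas}

config = {"services": [service("api", 8080),
                       service("worker", 9090, replicas=4)]}

# The running system only ever sees this static, diffable artifact:
rendered = json.dumps(config, indent=2)
```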

    19. 2

      I will agree that YAML sucks for config. Not because it doesn’t do enough, but because it can do too much, and because significant whitespace is the bane of my existence.

      If you need logic in your config, I’ll suggest that you’re putting the logic in the wrong place.

    20. 1

      I commented there, but suffice it to say, this was examined in what ISTM was a more elegant and comprehensive way by Steve Yegge in 2005:


    21. 1

      I see you haven’t brought up S-expressions or Lisps in general in the post – any reason why?

      1. 2

        Mostly because I haven’t seen any lisp-inspired configuration files apart from Emacs?

        I agree it can be powerful and pretty though. But there aren’t any widespread-enough Lisps to rely on (again, Elisp is probably the one most likely to be installed on a user’s computer).

        1. 2

          Guix has done it with Guile.

    22. 1
    23. 1

      Nice idea, but I’d prefer my configs to not have an interpreter startup penalty when parsing them.

      And if I need to recompile something so that I can read configs, or they’re starting to get so complicated that I need a language, I personally think that something more fundamental needs to be changed.

      1. 2

        Sure. I agree that depends on the tool in question. If it’s something like ripgrep, you really don’t want it to have this overhead. If it’s a heavy program that’s going to be running for minutes or hours, then it’s just tens of milliseconds of overhead.

    24. 1

      > I’ll have Python in mind here, but the same idea can be applied to any dynamic enough language (i.e. Javascript/Ruby/etc).

      > However, you can achieve a similarly pleasant experience in most modern programming languages (provided they are dynamic enough).

      I have no idea what you mean by “dynamic” in this context. I might guess you mean dynamically typed, but I don’t think that makes sense here. Notably, XMonad and various similar projects are configured using Haskell, which is statically typed (though as with many projects using a ‘real’ programming language for configuration, there’s a blurring of the line between a user writing a config file for the program to load, versus a user writing a program using a library or framework).

    25. 1

      The downside of using “real” (Turing-complete) programming languages for the configuration of short-running processes is the complexity of surgical changes, i.e. automatic migrations to new versions.

      For long-running processes, I think a better way is to treat configs as persistent state and provide a configuration API over system bus/HTTP/etc. Programs can keep the config structure and comments intact so that users are able to manually write and edit the file. But changing something at runtime as part of some complex logic in another process won’t necessarily require a restart.
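      A rough sketch of such a runtime configuration API over HTTP, using only the stdlib (endpoints and field names invented; persisting back to the user-editable file is left out):

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

config = {"log_level": "info"}   # in-memory config, treated as persistent state
config_lock = threading.Lock()

class ConfigHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Read the current configuration.
        with config_lock:
            body = json.dumps(config).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def do_PUT(self):
        # Apply a partial update at runtime -- no restart needed.
        length = int(self.headers.get("Content-Length", 0))
        updates = json.loads(self.rfile.read(length))
        with config_lock:
            config.update(updates)
        self.send_response(204)
        self.end_headers()

    def log_message(self, *args):  # keep the sketch quiet
        pass
```

Running `HTTPServer(("127.0.0.1", 0), ConfigHandler).serve_forever()` in a thread gives both the inspection point and the restart-free update path.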

    26. 1

      Ruby is quite good at implementing DSLs, and Ruby-based tools such as Danger or Cocoapods naturally use Ruby-based DSLs for their config files. This also carries the advantage that you get basic syntax highlighting in any editor made in the last 20 years.

    27. 1

      imapfilter is more a C implementation of IMAP bound as a Lua library than software configured in Lua. A good example of software as a high-level-language library, as the author shows off.

      opensmtpd does what the configuration language tells it to: if it is empty, nothing is done. Much like an imperative configuration language, with mappings between actions, events, and listeners feeding the events… yet no conditions. The last mile of the program is written in its configuration file, which only contains mappings between input (connections) and outputs (mailboxes).

      exim has a complex, borderline cryptic configuration language. It lets the user choose what to do with events at run time, but in a hard-to-read-and-write way. Restructuring the whole thing as a Python, Lua, Ruby, whatnot class would make it much easier.

      => Append code until what is left belongs to the user. If no run-time evaluation of the configuration is needed, .json/.toml/.ini/.hcl/csv is all I need. If there is evaluation at run time, maybe you want to write a library with high-level-language bindings.

    28. 1

      Another data point of using this successfully: SportStats v2, written about here: https://link.springer.com/chapter/10.1007%2F978-1-4614-9299-3_11

      Two aspects of note:

      1. As described in The Configuration Complexity Clock, sooner or later you want variants, composition and other abstraction capabilities.

      2. Configuration files are usually a way of changing aspects of the system more cheaply/quickly without having to touch the code. But in our case it was going to be the devs who would have had to manage the configs, so just as much or as little effort as a config file, and a config change was just as costly (redeploy) as a code change. So there were no real benefits, but significant costs.

    29. 1

      Maybe I am alone in this thinking, but I think we’ve gone the wrong way about configuration. JSON obviously wasn’t made for configuration, but to exchange data (no comments, etc.).

      YAML I think is better in that regard.

      Don’t get me wrong, I do like both of them. JSON is great and I think in many scenarios it’s replaced way too easily with other formats, when it doesn’t matter. For example when uncompressed text is exchanged over HTTP and gzip compression is used anyway, why bother stripping one or two bytes off at the cost of not being portable? I’ve seen that way too many times. But going off topic here.

      All of UCL, HCL and TOML feel so much better for humans, and all have a JSON representation (or converters) for whenever machines need to deal with it (or generate it).

    30. 1

      my favorite:

      (define (get-config file)
        (call-with-input-file file
          (lambda (p)
            (eval (read p) (environment '(scheme base))))))

      The advantage is that whatever is in the config file is only eval-ed under the read-only environment, so it will not interfere with the running process, while the config file can do everything that Scheme has to offer. It’s not for untrusted config files, but to prevent the programmer from shooting themselves in the foot.
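      Python has no equivalent of handing eval a sealed environment; the closest safe analogue I know is a narrower technique – restricting configs to data literals – via ast.literal_eval:

```python
import ast

def get_config(path):
    # literal_eval accepts only Python literals (dicts, lists, tuples,
    # numbers, strings, booleans, None) and never executes code from
    # the file -- read-only by construction, but no functions either.
    with open(path) as f:
        return ast.literal_eval(f.read())
```

You lose the “everything the language has to offer” part of the Scheme version, but keep the “cannot touch the running process” guarantee.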