1. 53
  1. 12

    To answer the first question I got about this: the relation between Hay / Oil is analogous to (but IMO better than) the relation between:

    • Go Templates / YAML, used in Helm/Kubernetes. (I inadvertently contributed to this, which I might write a blog post about.)
    • CMake / Ninja
    • Autotools / Make

    These are all what I call “70’s style macro programming” – stringly-typed languages generating another stringly-typed language, with all the associated problems.
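
    To make the "stringly-typed" problem concrete, here is a minimal Python sketch. It is not from the doc: the template, the `render_template` and `generate_json` helpers, and the sample value are all made up for illustration. It contrasts pasting a value into YAML-like text with keeping the value typed and letting a serializer emit JSON.

```python
import json

# Hypothetical template in the "macro" style: text with a hole in it.
TEMPLATE = 'name: {name}\nreplicas: 3\n'

def render_template(name):
    # The value is pasted in as raw text; nothing here knows the target syntax.
    return TEMPLATE.format(name=name)

def generate_json(name):
    # The value stays a typed string until the serializer escapes it.
    return json.dumps({'name': name, 'replicas': 3})

tricky = 'web\nreplicas: 99'  # a value that happens to contain a newline

templated = render_template(tricky)
generated = generate_json(tricky)

print(templated)
print(generated)
```

    In this sketch the templated text ends up with two `replicas` lines, while the JSON version round-trips the odd value unchanged.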

    In contrast, Hay and Oil are really the same language with the same syntax and the same Python- and JS-like dynamic types.

    (This is mentioned in the doc, but somewhat buried, so maybe the doc needs some revision)


    Also to repeat what’s mentioned in the doc: this is pretty early, so if you have a real system that can use this, we should talk :) This was prompted by Danilo Spinela playing with using Oil for a distro, but we need some more use cases.

    The bigger projects I tried prior to Oil (2010-2015) were a cluster manager and package manager, but I kept running into “language problems” (i.e. I didn’t want to use YAML to configure either one). Hay and Oil address these language problems.

    1. 4

      Excited to see this progressing. <3

      Sorry for being a little scarce. We’re in month 6 of a soul-crushing slog of a project (plus multiple leadership changes) and it’s left me feeling like a Looney-Toons-steamrolled version of myself. I’ve been hoping to find time/space to expand and understand Hay well enough to have helpful feedback, but it’s been out of reach. (And the floor has just dropped out from under us again; yesterday we discovered that someone was plagiarising and we’ll have to re-do a lot of work.)

      I’m not sure if I’ve mentioned https://ucg.marzhillstudios.com before, but I stumbled onto it last Summer. It struck me as fairly thoughtful and IIRC it is approaching some fraction of the same problems. The docker launch script tutorial (https://ucg.marzhillstudios.com/tutorials/script/) might be the literally-closest starting point? In lieu of big thoughts of my own for now, there may at least be some value in comparing/contrasting it with Hay to see if it’s solving for anything Hay isn’t yet? It may also make it easy to tease out some cases where being attached to a shell drives clear value?

      1. 2

        Ah no problem … hope things get better at work :-/

        Hm I haven’t seen UCG, but it is interesting, in that it identifies the same problem:

        Templates can be difficult to manage without introducing hard to see errors in the serialization format they are generating. Most templating engines aren’t aware of the format they are templating. They usually end up being an ad-hoc programming language in their own right but without any way to enforce invariants or protect the template writer from creating bad configs. UCG attempts to solve this problem by giving you a real programming language that also generates the config format you need natively and safely.

        Same argument: Why are we templating YAML, not generating JSON?

        So UCG and Oil are aligned on that point.


        But there IS a big difference that I neglected to mention in this doc or on the blog! External vs. internal DSL.

        • UCG is “external” to a programming language.
        • Hay is internal because it’s embedded in Oil, which is a powerful language you can use for many other things.

        Concretely, I don’t think it’s a good idea to have a separate config language that reinvents functions, loops, conditionals, expressions, and needs a standard library. I feel like it creates a weird “mirror world” where you can have more code/logic in configuration than in the app.

        This is perhaps a language design version of The Configuration Complexity Clock, or at least a very related argument.

        Another thing I would say is that I’m not that fond of the “JSON with map/filter/lambda” design, which UCG seems to share, and I have seen a lot of. I think it scales poorly … maybe a bit hard to explain, but it partly comes from experience with BCL at Google, and there does seem to be a lot of this coming from Google employees:

        For example:


        Another thing I would say is that in Oil/Hay I would simply embed the Docker command line literally. There is no real reason to have the indirection from attributes to flags in many cases. If they are flags, then I would write them as flags.
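
        As a rough illustration of that indirection (in Python rather than Oil, with a made-up `attrs_to_flags` helper and made-up attribute names), compare translating attributes into flags with simply writing the flags down. Nothing here runs Docker; it only builds argument lists.

```python
def attrs_to_flags(attrs):
    # Hypothetical translation layer: every attribute needs a mapping rule.
    flags = []
    for key, value in attrs.items():
        if value is True:
            flags.append('--' + key)               # boolean attribute -> bare flag
        else:
            flags.extend(['--' + key, str(value)])  # other attribute -> flag + value
    return flags

# Indirect: attributes that some layer must turn into flags.
indirect = ['docker', 'run'] + attrs_to_flags({'rm': True, 'name': 'web'}) + ['nginx']

# Direct: the command line, written as the flags it already is.
direct = ['docker', 'run', '--rm', '--name', 'web', 'nginx']

print(indirect == direct)
```

        Both lists come out the same here, but the indirect version adds a translation layer that has to be written, documented, and kept in sync with the tool's real flags.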


        Thanks for the link – it does appear that there’s a lot I didn’t explain about it and this example is useful! I guess there are a surprising number of design dimensions and the world hasn’t converged on one :) I’m working on a wiki page to compare them, and this analysis will contribute to that.

    2. 10

      It’s probably worth adding comparisons to Cue and TOML for context.

      1. 7

        Yes I’m working on a wiki page, but this sibling comment has some of my thoughts:

        https://lobste.rs/s/phqsxk/hay_ain_t_yaml_custom_languages_for_unix#c_cy8j2p

        Briefly I’d say:

        • TOML is a data-only language. There are no functions / loops / conditionals. That is totally fine, but the minute you want to start “templating” it (like YAML/Go templates), I would say that is a smell. I mention in the doc that Hay is for the cases where you outgrow “plain old data” (which IMO happens to every system when it gets big enough).
        • Cue is one of the more interesting config languages (i.e. it is NOT the “JSON with lambda/map/filter” design I dislike). As far as I understand, it does validation with a logic programming model. I think this could be useful for some things, but I do think “regular Python-like code” is more general – I feel like you will have to mix Cue with something else for most apps? But I’d definitely like to hear from people who have success with Cue.

        I wrote a comment about Cue at some point in the past, which I might dig up for the wiki page. I used BCL/GCL a lot which is mentioned on the Cue history page … i.e. both originated in Google’s Borg.

        There is actually a similar motivation for Hay. IMO Kubernetes is worse along many dimensions than Google’s Borg, which it was directly based on (kubelet is Borglet, etc.). And one of those dimensions is that YAML is worse than not only say Cue, but the predecessor GCL/BCL. So basically I think Kubernetes users are stuck with something worse than what we had ~15 years ago, which is perhaps why you see a lot of these kinds of projects from former Google employees (e.g. jsonnet too)

      2. 7

        What a fantastic surprise! I love to see innovations in this space of interleaving code and data in interesting new ways.

        You mention Nix, and Make’s support for data expressed in Guile, but the staged execution model of tasks reminds me specifically of Guix’s G-Expressions. I love the ease of templating (and of weaving through the layers of) nested G-Expressions, and ponder the (potentially unlikely and ill-advised) implications of nested Hay expressions, but appreciate that tasks can contain code or data to be consumed in any language. I think it would be great to see the output of myscript.oil under “Inline Hay Has No Restrictions”, and am looking forward to those examples of interop with eg. Hay Consumption in Python.

        1. 4

          Ah thanks for reminding me about G-expressions! That is indeed extremely related. I watched a talk about it a couple years ago. (Although probably the major motivation for Hay was the build/service variant problem, which I saw with Starlark/BCL, and which is explained a bit in the doc and in sibling comments.)

          I’m working on a wiki page and will make sure to include G-expressions (although I haven’t used them).

          I’ve also been in touch with the author of the Gash shell (related to Guix), and it’s definitely interesting how they are using Guile for both the “config stage” and the “shell stage”.

          Maybe that is the ONLY other system that unifies them under the same language? Because Oil is using Hay for the “config stage” and Oil for the “shell stage” :-)


          Yeah for Inline Hay, what happens is:

          • Hay nodes are accumulated in the _hay() register, which you can later serialize to JSON
          • If you put echo hi in the middle, that’s allowed. It just goes to stdout. That is probably not very useful, which is why it’s not recommended :)

          I will try to clarify that in the doc..

          The Hay consumption in Python should just be something like

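          import json
          import subprocess
          # run the evaluator and parse the JSON it writes to stdout
          p = subprocess.run(['myevaluator.oil', 'myconfig.hay'], stdout=subprocess.PIPE, check=True)
          config = json.loads(p.stdout)
          # and then do whatever logic in Python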
          

          So yes that needs to be filled in.

          Thanks for the feedback! Let me know if you have other questions/comments.

        2. 3

          Hay Ain’t YAML Ain’t Markup Language

          Doubly-recursive - beautiful.

          1. 3

            Haha yup, I was thinking of YAY as well, but that seemed too cheesy :-)

          2. 2

            This is really cool! Sorry if this is an obvious question, but how would I start using this for an existing project that already has a config format? Would I use Hay to be my templating system for some other string config language like TOML? Can I generate a JSON config using it? How? Thanks!

            1. 2

              The idea I have is that the app would only have a JSON parser, and it wouldn’t need a config parser like TOML.

              By using Hay, you’re providing the user with some “programmability”, e.g. expressing config variants naturally. For another example, I tried to factor logic out of my Github Actions config, though it’s still quite repetitive as you can see:

              https://github.com/oilshell/oil/blob/master/.github/workflows/all-builds.yml

              So if Github Actions had used Hay, then my config wouldn’t be so repetitive and ugly.

              I also think this simplifies the app, because a lot of platforms have to provide a lot of bells and whistles when the programmability could just be done on the user config side.


              I should probably draw a diagram, it would look like

              [ Oil Config Process with Hay Evaluation ] –> JSON –> [ Application Process with JSON parser ]

              A nice thing is that TOML also serializes to JSON, so in theory you could provide the user with the option of both. (Though you would need to massage the JSON in the app – they wouldn’t output identical JSON.)


              Hay is “against” templating :-) You’re not supposed to use it with something like Go templates. Instead the “variation” is configured with Python- and JS-like types, not with text.

              I linked this post in another comment here: Why are we templating YAML, not generating JSON?

              So Hay follows the same philosophy. It is closer to Starlark/Bazel and BCL and further from YAML/Go templates.

              I wrote this page this morning: https://github.com/oilshell/oil/wiki/Survey-of-Config-Languages

              and updated the doc a bit.


              Let me know if you have questions and feel free to post to https://oilshell.zulipchat.com/ (e.g. #oil-help or elsewhere)

              What kind of app are you considering it for? As mentioned it’s still early and it needs feedback, but I will fix bugs. One thing Hay users may miss is that Oil still needs Python-like functions: https://github.com/oilshell/oil/issues/1112

              Also thanks for sponsoring! I’m interested in what sort of PL project you have brewing :-)

              1. 1

                Thanks for all the detail! I think what I’m trying to figure out is the migration path for an existing project or system to move to Oil/Hay from what they’re doing today with YAML, TOML, etc. Many systems may use JSON as an internal/low-level representation, but still present external interfaces through another level of abstraction (such as these config formats). For example, how often does the documentation only mention YAML and nothing else? So it seems like having support for a “wrong way” of templating could help people use Hay now instead of having to wait for their next greenfield project. I hope that makes sense!

                1. 1

                  Hm I think if I saw how the existing app works I’d probably have a more concrete answer. What does the config look like?

                  Concretely both sourcehut and Github Actions only use YAML as their user interface, and I can imagine them both using Hay. Whether they will do that is a different story of course :-P

                  The way I would see a migration going is two different paths:

                  1. Templates/YAML -> Expanded YAML -> App
                  2. Hay/Oil -> JSON result of Hay Evaluation -> App

                  So if people really want to use templates, they can, in a separate code path. But I would not use templates with Hay and Oil! It would get ugly and kind of defeat the purpose. That would be like autoconf generating shell or Python.

                  It’s also true that Hay is not a superset of JSON; rather it evaluates to JSON. (YAML almost has that property; I read UCL does as well.)

                  So Hay/Oil are really so you can use one language instead of two. It takes the place of both a data-only language like YAML/TOML/JSON, and a template language with logic (to express variants).

                  Feel free to send me details here or on any other channel … it’s OK if it’s still early because Hay is still early too :)

              2. 1

                Also here is the more “executable” answer … But note this is the “Inline Hay” with no restrictions. I think platforms would want to use the “Separate file” with some restrictions, but I’m open to feedback on that.

                The difference is whether you can put shell commands in the middle of the config, which could be useful, but also breaks the “hermetic” property of configs.

                $ bin/oil -c 'hay define Rule; Rule foo { version = 42 }; json write (_hay())'
                {
                  "source": null,
                  "children": [
                    {
                      "type": "Rule",
                      "args": [
                        "foo"
                      ],
                      "children": [
                
                      ],
                      "attrs": {
                        "version": 42
                      }
                    }
                  ]
                }
                

                (this is just like the docs but I wrote it all on one line in the shell)

                And note you can’t generate arbitrary JSON; the output conforms to the schema in the doc. I guess this is because say a Go app will want to do that processing in Go, not necessarily in Oil. There could be ways to extend / relax this though, so again open to feedback.
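
                On the consuming side, a small Python sketch of walking that output might look like the following. The field names (`children`, `type`, `args`, `attrs`) are taken from the JSON printed above; the `walk` helper and the flattened rows it yields are just one illustrative way to process the tree, not part of Hay. It uses `.get()` on the assumption that some nodes may lack a field.

```python
import json

# The JSON printed by the command above, pasted in as a string.
HAY_JSON = '''
{
  "source": null,
  "children": [
    {
      "type": "Rule",
      "args": ["foo"],
      "children": [],
      "attrs": {"version": 42}
    }
  ]
}
'''

def walk(node, depth=0):
    """Yield (depth, type, args, attrs) for every node below `node`."""
    for child in node.get('children', []):
        yield depth, child['type'], child.get('args', []), child.get('attrs', {})
        # Recurse into nested nodes, one level deeper.
        yield from walk(child, depth + 1)

result = json.loads(HAY_JSON)
rows = list(walk(result))

for depth, node_type, args, attrs in rows:
    print('  ' * depth + node_type, ' '.join(args), attrs)
```

                An app written in another language would do the equivalent with its own JSON parser.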

              3. 2

                What do you think of Dhall? I know it’s not as nice when it exists outside of oil but it really covers a lot of the same space.

                1. 4

                  I responded here:

                  https://old.reddit.com/r/ProgrammingLanguages/comments/vki1z7/survey_of_config_languages/idr5xqm/

                  There are A LOT of languages in the same space :) https://github.com/oilshell/oil/wiki/Survey-of-Config-Languages

                  To me that looks like there are more languages similar to Dhall than similar to Hay / Oil! (external vs. internal DSL; expression-based vs. Python-like language)

                  1. 1

                    I suspected you had covered it! Thanks for the links.