1. 21
    1. 17

      Doesn’t supporting interpolation mean that an entire JavaScript engine is required?

      1. 5

        Yes, but we only need an engine that implements ES5 or ES6 (2015) at most, of which there are plenty of embeddable, light weight options without requiring full-blown V8 or Node. For example, QuickJS https://bellard.org/quickjs/ or GoJa https://github.com/dop251/goja (which we’re using for our go implementation).

        I’ll add that these engines are embeddable, so they would be part of the TySON library, and thus don’t require any external dependencies.

        1. 16

          You’re seriously embedding an entire JavaScript interpreter in your configuration language? I’m sorry, but that’s ludicrous.

          The rest of this format is basically JSON5, which I’ve been using since 2015.

          1. 4

            In fairness, the image format SVG almost got raw socket access :D

          2. 2

            You’re seriously embedding an entire JavaScript interpreter in your configuration language? I’m sorry, but that’s ludicrous.

            Is it? I tend to favour libucl for configuration files (it has a bunch of nice features for humans, such as include files with rules for composition, and I’ve written some tooling that lets me define a JSON Schema for my configuration and then have a nice C++ wrapper that exposes it) but the library is larger than either jsQuick or DukTape. If you have a configuration that is sufficiently complex that people might want to use code with it, embedding a JavaScript interpreter might not be such a bad idea.

            1. 3

              For sanity’s sake there needs to be a hard line between config languages and programming languages; that line is probably Turing-completeness. The attack surface of the latter is so much bigger … not just sandbox escapes but infinite loops, stack overflows, memory exhaustion, nondeterministic behavior.

              I’d be ok with a config language that allowed simple expression evaluation, and interpolating expressions in strings; that’d be useful.

              Beyond that, what you are creating is a program, and you should write it in something designed as a full PL, not a config language that’s metastized. (CMake is my poster child for this disease.) Lua was literally created for this purpose.

              1. 1

                I think it depends a lot on the kind of configuration. If you’re setting some options, you have one set of requirements. If your configuration is really a DSL for defining how a complex set of components are assembled, then you have very different requirements.

                I’m less convinced by the security argument because I rarely think of configuration state as untrusted. I assume an attacker who can write to a configuration file for a program is allowed to do anything that that program is allowed to do. There are some cases where this is not true (for example, per-user configurations for a shared server) but that’s often better handled by dropping privileges to match those of the user than by assuming that the config file parser is bug free.

                Beyond that, what you are creating is a program, and you should write it in something designed as a full PL, not a config language that’s metastized. (CMake is my poster child for this disease.) Lua was literally created for this purpose.

                I don’t see that Lua is better than TypeScript here. A TypeScript EDSL that gives typed definitions of the objects that your program is supposed to generate seems better than ‘please give me some Lua tables and we’ll check when we run your program whether you got the structures right’ as an approach.

          3. 1

            From my point of view whether the tradeoff is worth it depends on whether you want a programmable configuration language (with functions and imports) or not.

            If you have an application for which you don’t need programmability, then JSON5 is great and TySON is not for you. In fact, I would encourage you to use JSON5 for those cases. However, if you need programmability then we think TySON is a good tradeoff and the embedded JS interpreter is very small. As a point of comparison, consider languages like dhall, nickel, jsonnet and cue all of which include their own custom interpreters for their own custom languages.

            1. 4

              Why not just use JS as the config language then?

    2. 11

      So you saw the abomination that is YAML and said “hold my beer”?

    3. 10

      Thanks for posting! This is interesting.

      I would like to see mention of JSON5 which is 11 years its elder. For comments in JSON, JSON5 is a good starting point.

      I see the need for a more strict JSON. Types and comments are just the tip of the iceberg. Joe Tsai has compiled a great list of other JSON related issues (this is the context of the Go library, but it’s a great resource).

      One of the larger issues I’ve run into is duplicates. Douglas Crockford, JSON’s inventor, tried to fix the duplicate issue but it was decided it was too late. Although Douglas Crockford couldn’t change the spec forcing all implementations to error on duplicate, his Java JSON implementation errors on duplicates. Others use last-value-wins, support duplicate keys, or other non-standard behavior. The JSON RFC states that implementations should not allow duplicate keys, notes the varying behavior of existing implementations, and states that when names are not unique, “the behavior of software that receives such an object is unpredictable.” Duplicate fields are a security issue, a source of bugs, and a surprising behavior to users. See the article, “An Exploration of JSON Interoperability Vulnerabilities” Disallowing duplicates conforms to the small I-JSON RFC, which is a stricter JSON. The author of I-JSON, Tim Bray, is also the author of JSON RFC 8259. See also the JSON5 duplicate issue.

      1. 8

        Good call out. I’ll add a mention to JSON5.

        In fact, for one of my use cases I considered using JSON5. I really liked it, except for the fact that I really wanted multi-line strings and string interpolation. Template literals weren’t introduced until ES6, so the JSON5 spec rejected the idea of of including them in JSON5 (since JSON5 must be compatible w/ ES5) – it would be possible to add those in a future JSON6 standard (but alas, I couldn’t find any go implementations, which is what I needed)

        For a moment I considered implementing a JSON6 in go … but discovered that it would be easier / faster to use an existing TypeScript bundler + an embeddable JS engine, and I ended up with TySON.

    4. 7

      I feel like this is really misguided. What exactly is this other than evaluating typescript? JSON is not simply ’evaluate JavaScript ’ it is much less than that and much less ambiguous for good reasons.

      It has always been possible to just evaluate JavaScript, or any other language for that matter. The sucess and value of JSON lie on it being less rather than more. To solidify the feature set in something tiny to specify technically while being extremely versatile. It doesn’t support comments not complicated quoting and line breaking rules dor this reason.

      Benefits of using TySON

      Over what? For what purpose? and how?

      1. 1

        JSON is not simply ’evaluate JavaScript ’

        Originally it was! Validation without parsing is why it has so many obnoxious limitations :)

        1. 1

          Originally it was!

          AFAIK, it was created precisely because people were starting to just evaluate a JavaScript object retrieved remotely. Thereson JSON exists is because doing that was and is an horrible idea.

    5. 6

      Neat. We use Typescript for all the config we can at Notion - and we do things like spit out CircleCI config from a well-typed typescript file. It’s kinda cool you’re exposing this pattern without the consumer needing Node by embedding QuickJS/similar. That said some of the goodness of TS-as-config comes from seamless interpretation of the config types and the program’s larger type system. I would hazard a guess that Rust people would rather config-in-rust-alike-via-maco-magic than use this but what do I know. I’m a TS guy.

      You should be more clear about to what extent this is a “subset” of typescript or if it’s all of Typescript. From the README it sounds like it could be Python:Skylark::Typescript:TySON but — you allow all of typescript?

      1. 2

        It’s limited to TypeScript features that can be transpiled to ES5/ES6 (that’s what lets us use an embeddable JS engine). I’ll add more documentation on which features are support and which are not, but to give you a concrete example await/async is not supported.

    6. 5

      Why not simply use Lua tables? The Lua interpreter certainly has a smaller footprint than a JS interpreter.

      1. 2

        Lua is a good choice for similar use cases. We wanted TypeScript because it’s significantly more popular as a programming language. One way to think of TySON, is that it’s trying to enable Lua-like use cases but for TypeScript.

    7. 3

      I hate this and I love this, but I don’t know why for either of them. This is a fantastic creation. Keep making blursed things like this.

    8. 3

      If you wanted types + functions + variables, Dhall & Nickel could suit your needs without requiring a JS engine. They are both mentioned in the docs but dismissed with “you would have to learn a new syntax”. They are both ML-inspired languages so nothing felt ‘difficult’ or ‘tricky’ about learning either. I learned a bit of TypeScript & that was a bad experience. I mean look at how much more syntax barf is required to express Maybe or Option in language in the ML lineage vs. TypeScript–1 line of PureScript for 17 in TypeScript (13 if we remove dangly } brackets).

    9. 2

      I’ve been using typescript as a config language for a few years now and it’s worked great… with the caveat that I only do this from within typescript projects. :)

      It’s relatively straightforward: I define a Config interface, then at server startup, run the (embedded) typescript compiler on the desired config file (with caching – typescript’s compiler is slow), and import() it as if it were a module. You can see an example of the config file loader here and a config interface file here… At past startups, we’ve done the same with java-derived languages.

      That said, I’m not sure I’m into the idea of a data interchange format that supports arbitrary JS expression interpolation, and a JS-based config file would feel extremely heavyweight in my rust projects.

    10. 2

      The most interesting feature of this is the types. Typed configuration languages are few and far between (dhall, nickel, maybe Nix counts, what else?), and Typescript’s type system is very powerful. But it’s also just… too powerful, I have no idea how you’d implement this without an entire JS and TS engine. The type system alone is Turing-complete!

      Defining a further subset of this (maybe with more limited generics) that could be reliably implemented without a full JS engine could be useful.

    11. 2

      Congrats on shipping something, that right there is better than most.

      I didn’t see any discussion or comparison between the various other json extension languages, such as Dhall, just referencing them at the end. What makes a Turing complete language moree useful here than a total language like Dhall?

      1. 4

        My view is that it’s better to pick a “good” language that is widely adopted, has well known syntax and a thriving ecosystem, than it is to pick a “perfect” language for which there’s a bigger adoption curve because the syntax needs to be learned and the ecosystem needs to be developed.

        If you (or the users of your software) are willing to learn Dhall and invest in it’s ecosystem, than I think Dhall is a perfectly valid choice.

        FWIW, I don’t think the “total” vs “turing complete” distinction matters in practice. While Dhall programs are guaranteed to terminate, you could write a program that takes a very very long time to compute. In practice people don’t, by using good software engineering practices; but those are the same practices that let you develop using a turing complete language day-to-day.

        1. 1

          That makes sense. Thanks for the response!

    12. 1

      Pretty cool. One thing I think is also will be cool: replace json schema with typescript.

      1. 2

        Yeah, agreed. I’m thinking that users should be able to write types using Typescript, and we can convert those to a JSON schema if people want to export the types.

        1. 1

          This is impossible unfortunately. TypeScript types and json-schema do not map into each other.

          1. 3

            There’s an overlapping subset. I wrote a compiler that does this and it seems to work fine.

            The newest version is closed source but here’s my work so far - compiles Typescript types to TypeBox/JSONSchema: https://github.com/justjake/ts-simple-type

      2. 2

        I think JSON Schema is still a better solution for cross-language and tooling support, as there are many implementations ie allowing generation of Go and Java code, which would be harder if we only had TySon

        1. 2

          True. But typescript is such powerful language.

    13. 1

      This is pretty much just json with a schema attached. They only difference in the data format itself is including some newer JS syntaxes.

      The reason JSON has the excruciating limits it has basically boil down to crockford wanting to use eval to parse it, and that meant it need to be validated, which he wanted to be done with just a reflex. More or less every annoyance in json boils down to that. It’s was such a major mechanism of data transfer that in JSC I added a JSON preflight step when parsing JS. It had to support a few idiosyncrasies (variations on jsonp, parens for eval, etc), and was a massive PLT win on the majority of sites of the era. I’m not sure how much of a gain it is nowadays (I think XHR just directly supports json now, and CORS handles the jsonp cases).

      Edit: huh, I missed that it actually evaluates the interpolation strings. That’s a clear no go for a data interchange format. You can’t have “parsing data” include “execute code”, that’s unsound - and in the days of json-via-eval was a recurrent source of data compromises.

      1. 3

        I’ll add that this is not meant as a data interchange format. It’s meant as a programmable configuration language for trusted use cases.

        For data interchange, say, on an API, JSON or ProtoBuf would still be the right choice.

      2. 2

        This is more intended for trusted configuration files more than anything.