1. 29
  1. 6

    Very impressive! I particularly admire your summary of what kinds of schema changes are safe, and how asymmetric fields (required for the writer, but optional for the reader) allow safe schema evolution.

    I would love to hear more about the inception of this project. What combination of research / colleagues / experiences / thoughts inspired you to build this? Did it require a lot of refining to put into practice?

    1. 4

      At the companies I’ve worked at, required fields were always treated with suspicion or banned completely. This meant you never knew which fields you had to set in order to use an API (at least not from the types), and the owner of the API could not rely on deserialized messages actually having their fields populated. There are analogous but lesser-appreciated problems with enums and, more generally, with sum types: when pattern matching, you’re always supposed to have a fallback case for when the input isn’t recognized, even when it’s often not clear what to do in that situation. Everyone just gets used to this hand-waviness, and software quality suffers as a result.

      I started investigating why organizations are averse to having stronger type safety guarantees, and (after reading many internal debates at Google) I concluded it’s because the technology didn’t support them enough. So about a year ago I set out to rethink how we design and evolve APIs from first principles.

      The idea of asymmetric fields can be thought of as an application of Postel’s law to APIs. The concept isn’t new, but I think encoding it in the type system this way is. Perhaps one of the reasons asymmetric fields weren’t invented sooner is that we don’t teach category theory to computer science students, which is a shame, as I relied heavily on intuitions from category theory (especially duality) when designing Typical.
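      The fallback-case problem described above can be sketched in TypeScript (a hypothetical `Status` type for illustration, not Typical's generated code):

      ```typescript
      // A status enum as it might be deserialized from a protobuf-style message.
      // Because old readers can receive values added to the schema later, the
      // type must admit unrecognized values, and every match needs a fallback arm.
      type Status = "active" | "suspended" | { unrecognized: number };

      function describe(status: Status): string {
        if (status === "active") return "account is active";
        if (status === "suspended") return "account is suspended";
        // The hand-wavy fallback: the schema may have gained a variant this
        // reader doesn't know about, and it's often unclear what to do here.
        return `unrecognized status code ${status.unrecognized}`;
      }
      ```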

    2. 3

        I like the idea of asymmetric fields.

      1. 2

        Nice library, much friendlier and more approachable than the last IDL I came across that focused on union data types.

        Unlike with most programming languages, comments in Typical schemas are associated with specific items.

        This is good, and one of my favorite things about working with Golang’s parser.

        In most cases, Typical’s encoding scheme is more compact than that of Protocol Buffers and Apache Thrift thanks to smaller field headers, a more efficient variable-width integer encoding, and a trick that allows some information to be inferred from the size of a field rather than being encoded explicitly.
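        For readers unfamiliar with variable-width integers: the idea is that small numbers take fewer bytes on the wire. Here is a generic LEB128-style sketch in TypeScript; note this is the protobuf-style scheme, whereas Typical's own encoding is described as more efficient than this:

        ```typescript
        // Encode a non-negative integer, 7 payload bits per byte; the high bit
        // of each byte signals that more bytes follow.
        function encodeVarint(n: number): number[] {
          const out: number[] = [];
          do {
            let byte = n % 0x80;          // low 7 bits
            n = Math.floor(n / 0x80);
            if (n > 0) byte |= 0x80;      // continuation bit
            out.push(byte);
          } while (n > 0);
          return out;
        }

        // Decode the bytes back, accumulating 7 bits at a time.
        function decodeVarint(bytes: number[]): number {
          let result = 0;
          let scale = 1;
          for (const byte of bytes) {
            result += (byte & 0x7f) * scale;
            scale *= 0x80;
            if ((byte & 0x80) === 0) break;   // no continuation bit: done
          }
          return result;
        }
        ```

        With this scheme, 300 encodes to two bytes instead of the four a fixed 32-bit field would take.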

        I see that you’ve optimized primarily for compact encoding like Thrift rather than going somewhere like Flatbuffers or CapnProto which are zero-copy/no decoding. What was the motivation for this approach? Is message size the most important issue in your systems? It seems like it’s possible to do a partial decode of a message; does the generated library provide any facility for doing so?

        My application [https://notion.so] does a lot of cache-filling from the network (on both the back end and front end), but it may not need to actually access every field of every message. I figure many applications are similar, so I find lazy decoding or zero-copy decoding appealing.

        An asymmetric field in a struct is considered required for the writer, but optional for the reader. Unlike optional fields, an asymmetric field can safely be promoted to required and vice versa.

        Is it possible to write unit tests for asymmetric fields that can produce the reader version of the struct with the field missing?

        1. 2

          I see that you’ve optimized primarily for compact encoding like Thrift rather than going somewhere like Flatbuffers or CapnProto which are zero-copy/no decoding. What was the motivation for this approach? Is message size the most important issue in your systems? It seems like it’s possible to do a partial decode of a message; does the generated library provide any facility for doing so?

          Although RPCs are the obvious use case for Typical, I wanted to support certain storage use cases too. I’m inspired by the many ways Google uses Protocol Buffers beyond just serializing API requests/responses. While I know of many problems with Protocol Buffers, I can’t deny the convenience of their ubiquity at Google. I wanted Typical to be no worse than Protocol Buffers (or Thrift) in terms of space so it could be competitive in that respect. However, this is a compromise since there is an explicit encoding/decoding step, as you mentioned.

          In principle, Typical could also support zero-copy deserialization (for strings/blobs), and that’s something I hope to add someday. However, it’s worth noting that Typical will always do UTF-8 validation of strings on incoming messages, which means there will still be an O(n) pass over string data even if nothing is copied. Other zero-copy frameworks tend to delay such decoding errors until you try to read the field, which results in runtime errors in your program in unexpected places (or even worse: invalid string data silently flowing through your program!). Typical is stricter about where errors are allowed to surface.
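          The eager-validation tradeoff described above can be sketched with the standard `TextDecoder` API (a sketch of the behavior, not Typical's generated code):

          ```typescript
          // In fatal mode, TextDecoder rejects malformed UTF-8 immediately, so
          // the error surfaces at the deserialization boundary rather than at
          // some later field access (or never, with invalid string data flowing
          // silently through the program).
          function decodeStringField(bytes: Uint8Array): string {
            // Throws a TypeError on invalid UTF-8.
            return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
          }
          ```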

          Edit: I forgot to answer your last question. No, the generated code does not currently provide a way to partially decode a message.

          Is it possible to write unit tests for asymmetric fields that can produce the reader version of the struct with the field missing?

          That’s a really good and important question! It’s possible to instantiate the “In” versions of structs directly (with optional/asymmetric fields missing), which can be used to test code paths that consume such messages. What’s more difficult, however, is constructing a serialized version of such a message (which would require using the “Out” version of the type), so that you can include the deserialization logic in the code being tested. However, doing so is (arguably) of little value, since the deserialization logic is tested extensively by Typical’s own tests. So my recommendation would be to structure your code so that the business logic can be tested independently of message decoding, though that advice only applies to unit tests.
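          The shape of the “In”/“Out” split can be sketched in TypeScript (hypothetical type and field names for illustration; Typical’s generated code differs in detail):

          ```typescript
          // Writer-side ("Out") view: the asymmetric field must be set.
          interface EventOut {
            id: string;
            note: string;   // asymmetric: required for the writer...
          }

          // Reader-side ("In") view: the same field may be absent.
          interface EventIn {
            id: string;
            note?: string;  // ...but optional for the reader.
          }

          // A unit test can construct the "In" view directly, with the field
          // missing, to exercise consuming code without serializing anything.
          function summarize(event: EventIn): string {
            return event.note ?? "(no note)";
          }
          ```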

        2. 2

          I’ve done something similar to asymmetric fields for API schema migrations using OpenAPI (aka Swagger). In my case, the backend is written in Kotlin and is the source of truth for the API definition. I generate the schema from annotations on request handler methods and payload classes using Springdoc-OpenAPI. The schema is then used to generate client code.

          The combination of annotations and nullable types lets me represent what Typical refers to as asymmetric fields:

          import io.swagger.v3.oas.annotations.media.Schema

          data class MyRequestPayload(
            val required: String,       // non-nullable: required in the schema
            val optional: String?,      // nullable: optional in the schema
            @Schema(required = true)    // override: required in the schema...
            val asymmetric: String?     // ...but still nullable on the server
          )
          

          The schema generator will, by default, mark fields with non-nullable types as required in the API and nullable-typed fields as optional, but you can override that with an annotation, as I did for the third field above. You end up with a field that is marked as required in the schema (and is therefore a required argument in the client code generated from the schema), while the type system still requires you to deal with missing values on the server side.
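          The resulting OpenAPI schema for the payload above would look roughly like this (a sketch; the exact output depends on the Springdoc version and configuration):

          ```yaml
          MyRequestPayload:
            type: object
            required:
              - required
              - asymmetric        # forced required by the @Schema annotation
            properties:
              required:
                type: string
              optional:
                type: string
                nullable: true
              asymmetric:
                type: string
                nullable: true    # still nullable, so the server must handle absence
          ```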

          This approach has worked out pretty well for adding and removing required fields.

          1. 1

            Very clever!