Threads for DomBlack

  1. 4
    func FormatID(prefix string, id xid.ID) string {
        return fmt.Sprintf("%s_%s", prefix, id)
    }
    
    type AppID struct { xid.ID }
    
    func (a AppID) String() string {
        return FormatID("app", a.ID)
    }
    
    type TraceID struct { xid.ID }
    
    func (t TraceID) String() string {
        return FormatID("trace", t.ID)
    }
    
    1. 2

      True, there are other ways to do this, but we have 22 other methods on the type for various interfaces, which would result in a lot of boilerplate for us to implement for every resource type we have. We could have code generated it, but generics gave us the code generation for free.
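
      For readers who haven't seen the pattern, here is a minimal sketch of how a single generic type can replace the per-resource wrappers above. The `ResourceType` constraint with a `Prefix` method, the marker types, and the hex encoding are assumptions made for illustration, not the code from the blog post.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// ResourceType is a hypothetical constraint: each resource supplies its prefix.
type ResourceType interface {
	Prefix() string
}

// App and Trace are marker types; they carry no data of their own.
type App struct{}

func (App) Prefix() string { return "app" }

type Trace struct{}

func (Trace) Prefix() string { return "trace" }

// ID is a 12-byte identifier tagged at compile time with its resource type.
type ID[T ResourceType] [12]byte

// String is written once and shared by every ID[T]. Hex is used here only
// to keep the sketch dependency-free.
func (id ID[T]) String() string {
	var resource T
	return resource.Prefix() + "_" + hex.EncodeToString(id[:])
}

func main() {
	app := ID[App]{0x01}
	trace := ID[Trace]{0x02}
	fmt.Println(app)   // uses the shared String method with the "app" prefix
	fmt.Println(trace) // same method, "trace" prefix
}
```

      Every further method (JSON, SQL, text marshalling and so on) would be written once on `ID[T]` in the same way, which is where the boilerplate saving comes from.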

      1. 2

        Are you generating the resource types?

        1. 1

          Yes, we generate some things from the resource types, for instance the SQL migrations I mentioned in the blog.

          As I mentioned in the blog, I think code generation and generics are not mutually exclusive concepts and can be used together to create better outcomes for developers.

    1. 3

      I frequently see “newtypes” (in Haskell-speak) and sometimes phantom type parameters for IDs, but it’s really not clear to me how much benefit this approach produces over a type alias. That is, for the classes of bugs that type ID string catches, but type ID = string would not, how often do those actually occur? My experience is: Not often.

      1. 1

        A type alias of type ID = string would make all different types of identifiers equivalent from the type checker’s perspective. Since identifiers are so pervasive in backend development this was the main class of bugs we wanted to prevent, as it’s quite easy to call a function with the wrong type of identifier (especially after refactoring). But YMMV of course.
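
        To make the difference concrete, here is a small sketch. The names `EmailID`, `FileID`, `AnyID`, `attach` and `attachAliased` are made up for illustration.

```go
package main

import "fmt"

// Defined types: distinct from each other and from string.
type EmailID string
type FileID string

// Alias: just another spelling of string, so it adds no checking.
type AnyID = string

func attach(email EmailID, file FileID) string {
	return fmt.Sprintf("attach file %s to email %s", file, email)
}

func attachAliased(email AnyID, file AnyID) string {
	return fmt.Sprintf("attach file %s to email %s", file, email)
}

func main() {
	email, file := EmailID("em_1"), FileID("fi_9")
	fmt.Println(attach(email, file))

	// attach(file, email) // rejected by the compiler: FileID is not EmailID

	var e, f AnyID = "em_1", "fi_9"
	fmt.Println(attachAliased(f, e)) // compiles even though the arguments are swapped
}
```

        One caveat: untyped string constants still convert implicitly, so `attach("fi_9", "em_1")` compiles. The check only helps once the values are held in typed variables, which is the usual case for IDs loaded from a request or a database.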

        1. 3

          My experience is that this class of bugs is relatively uncommon and easy to catch/fix.

          Moreover, names seem to provide better bang-for-buck in both solving this problem and other readability improvements. In Go, this means using structs instead of positional arguments.

          For example, instead of:

          func AttachFile(db DB, emailID string, fileID string) (attachmentID string, err error) {
            // ...
          }
          

          Do:

          type AttachFile struct {
            EmailID string
            FileID string
          }
          
          func (args AttachFile) Do(db DB) (attachmentID string, err error) {
            // ...
          }
          

          Now it’s extremely unlikely you’ll accidentally swap the IDs because the call site will be much clearer. As a bonus, this makes it easier to address a laundry list of other concerns, such as logging, authorization, validation, etc.

          1. 3

            The fun thing is that if you squint, types are names that are checked by the computer. Granted, there are always trade-offs involved, but for me, it makes it that little bit easier to make the integration reflect the problem I’m solving. And the error proofing is a bonus, too.

            1. 2

              Fun fact; Go actually calls “type Foo string” a NamedType in the spe

          2. 2

            I avoid this in my codebase by making all the sequences (in test mode) increment by 200 per ID, and starting them at different offsets for each table (fewer than 200 tables). This ensures that no identifier can match a row in another table.
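
            A rough in-memory sketch of that scheme, assuming a step of 200 and one offset per table. The `sequence` type and `tableOf` helper are hypothetical; in a real setup this would be configured on the database sequences themselves.

```go
package main

import "fmt"

// step is the increment shared by every sequence; it must exceed the table count.
const step = 200

// sequence hands out offset, offset+step, offset+2*step, ...
type sequence struct {
	next int
}

func newSequence(offset int) *sequence {
	if offset < 0 || offset >= step {
		panic("offset must be in [0, step)")
	}
	return &sequence{next: offset}
}

func (s *sequence) nextID() int {
	id := s.next
	s.next += step
	return id
}

// tableOf recovers the offset (and so the table) an ID was issued for.
func tableOf(id int) int { return id % step }

func main() {
	emails := newSequence(1) // table 1
	files := newSequence(2)  // table 2
	for i := 0; i < 3; i++ {
		fmt.Println("email", emails.nextID(), "file", files.nextID())
	}
}
```

            Because every ID is congruent to its table's offset modulo 200, an ID issued for one table can never equal an ID issued for another, so a mixed-up identifier fails the lookup in tests instead of silently matching an unrelated row.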

        1. 1

          A stated reason to avoid KSUID was that it uses base-62, so comparison would fail depending on the sorting preference. As base-62 is just the representation, could KSUID still be used with a different encoding, for example base-32?

          1. 1

            Absolutely no reason why not. We could have added our own encoding methods to handle this quirk.

            One reason for basing our IDs on an existing scheme was to get a battle-tested system, which reduces the risk of bugs in that layer. I would have been concerned that our own encoding and decoding might include a weird edge case (although fuzzing would have picked this up).

            The main reason we did not use it in the end was that we wanted in-process ordering. A lot can happen in the same second, and if we ever had a race-condition bug, being able to extract the exact order in which objects were created would be incredibly useful.
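
            To illustrate what in-process ordering buys, here is a simplified sketch: a second-resolution timestamp followed by a per-process counter. The 8-byte layout and the `newID` and `before` helpers are assumptions for illustration and are not xid's actual layout.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"sync/atomic"
)

// counter is shared by the whole process and bumped for every new ID.
// A real scheme has to decide what happens when it wraps around.
var counter atomic.Uint32

// newID packs a Unix timestamp in seconds followed by the counter value.
// Both are big-endian so that byte order matches numeric order.
func newID(unixSeconds uint32) [8]byte {
	var id [8]byte
	binary.BigEndian.PutUint32(id[0:4], unixSeconds)
	binary.BigEndian.PutUint32(id[4:8], counter.Add(1))
	return id
}

// before reports whether a sorts ahead of b byte-wise.
func before(a, b [8]byte) bool {
	return bytes.Compare(a[:], b[:]) < 0
}

func main() {
	const sameSecond = 1_700_000_000
	first := newID(sameSecond)
	second := newID(sameSecond)
	// Same timestamp, yet the counter still records which was created first.
	fmt.Println(before(first, second))
}
```

            With a timestamp followed by random bytes instead of a counter, two IDs minted in the same second would sort in an arbitrary order relative to each other.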

          1. 1

            Shouldn’t type ID[T ResourceType] xid.ID be type ID[T ~ResourceType] xid.ID instead?

            Later in the article, ResourceType goes from concrete to an interface, which resolves the problem.

            1. 1

              I started skeptical of this, but it’s actually a pretty good use of generics. I think in terms of pure efficiency, if they had defined an ID as an array (not slice!) of 24 bytes with the first 4 as a type code, that would actually be smaller and faster than using two machine words for an interface, but this is still quite good.

              1. 2

                With monomorphization the approach with generics avoids the overhead from indirection through an interface, so the memory representation is just a [20]byte.

                1. 1

                  Actually the memory representation is [12]byte. As André mentioned, the benefit of generics is that we don’t have to carry the type information around at runtime within the data itself.

                  However, before generics, encoding the first couple of bytes as a type identifier would have been a neat way of doing it, and a natural extension of how XID encodes data within the IDs.

                  We would still have gone for human-readable resource types in wire formats (i.e. I’d like to be able to see whether an ID is a trace, app, user, log line, etc.).

                  Another advantage of encoding it as a type rather than in the byte array is that there’s no upper limit on how many types of resources we could create (although admittedly 4 bytes would give more than enough).
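
                  A quick way to check the size claim, as a sketch: `ID` here is a stand-in defined directly on `[12]byte`, and `TaggedID` is a hypothetical version of the "type code in the leading bytes" alternative.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Marker types for two resources; they never exist at runtime inside an ID.
type App struct{}
type Trace struct{}

// ID's type parameter is a phantom: it is part of the type but not of the data.
type ID[T any] [12]byte

// TaggedID carries the resource type at runtime in 4 extra bytes.
type TaggedID struct {
	Kind [4]byte
	Raw  [12]byte
}

// sizes returns the in-memory size of a phantom-typed ID and a tagged ID.
func sizes() (phantom, tagged uintptr) {
	return unsafe.Sizeof(ID[App]{}), unsafe.Sizeof(TaggedID{})
}

func main() {
	phantom, tagged := sizes()
	fmt.Println("ID[App]:", phantom, "bytes")
	fmt.Println("TaggedID:", tagged, "bytes")

	// var a ID[App]
	// var t ID[Trace] = a // rejected by the compiler: distinct types, same layout
}
```

                  The type tag lives only in the compiler's view of the program, so `ID[App]` and `ID[Trace]` share the same 12-byte layout while remaining unassignable to each other.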