1. 25

  2. 8

    Sadly this article doesn’t mention that

    • there are in fact multiple versions of UUID, which you can tell apart by the version number stored in the UUID itself
    • one of them (v4) is completely random (so generating random UUIDs is not a misuse, contrary to what the article states)
    • one of them (v1) is node ID (MAC address) + time
    • one of them (v3/v5) hashes a name within a namespace
    • Twitter uses snowflakes that contain time + node + sequence
    • Discord uses something similar that also has worker IDs

    All of this to say that there is more than the one kind of UUID the article mentions, and there are multiple ways of solving said problem. Also, storing the time may be a privacy problem.
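
    For anyone who wants to poke at the versions, here is a minimal sketch using Python's standard `uuid` module. The name `example.com` is just a placeholder I picked; nothing here comes from the article.

    ```python
    import uuid

    # v4: random bits, with the version nibble fixed to 4.
    random_id = uuid.uuid4()

    # v1: timestamp + clock sequence + node ID (usually the MAC address).
    time_id = uuid.uuid1()

    # v5: SHA-1 hash of a namespace UUID plus a name, so the same
    # inputs always produce the same UUID.
    named_id = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")

    # The version is stored in the UUID itself and can be read back.
    versions = [u.version for u in (random_id, time_id, named_id)]
    print(versions)
    ```

    The v1 value is the one with the privacy concern, since both the time and the node ID can be read back out of it.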

    1. 1

      In the Twitter Snowflake repo, it says:

      There are many otherwise reasonable solutions to this problem that require 128bit numbers. For various reasons, we need to keep our ids under 64bits.

      Do you know some of these 128-bit solutions? Do they just mean a UUID?

      1. 1

        UUID uses 128 bits. See the previously linked Wikipedia article, which also has statistics about collisions for randomly generated v4 UUIDs.

        A universally unique identifier (UUID) is a 128-bit label used for information in computer systems

        My guess is that for performance or backend reasons Twitter wants to store it in an i64, like a normal i64 primary key.
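
        To illustrate how time + node + sequence can fit in an i64, here is a rough sketch. It assumes the commonly described 41/10/12 bit split, and the function names are mine, not Twitter's code.

        ```python
        # Assumed layout: 41 bits of milliseconds, 10 bits of node ID,
        # 12 bits of per-millisecond sequence. That is 63 bits in total,
        # so the value stays positive in a signed 64-bit integer.
        TIMESTAMP_BITS, NODE_BITS, SEQUENCE_BITS = 41, 10, 12


        def pack_snowflake(ms_since_epoch: int, node: int, sequence: int) -> int:
            """Pack the three fields into one integer (hypothetical helper)."""
            if not 0 <= ms_since_epoch < (1 << TIMESTAMP_BITS):
                raise ValueError("timestamp out of range")
            if not 0 <= node < (1 << NODE_BITS):
                raise ValueError("node out of range")
            if not 0 <= sequence < (1 << SEQUENCE_BITS):
                raise ValueError("sequence out of range")
            return (
                (ms_since_epoch << (NODE_BITS + SEQUENCE_BITS))
                | (node << SEQUENCE_BITS)
                | sequence
            )


        def unpack_snowflake(value: int) -> tuple[int, int, int]:
            """Recover (ms_since_epoch, node, sequence) from a packed ID."""
            sequence = value & ((1 << SEQUENCE_BITS) - 1)
            node = (value >> SEQUENCE_BITS) & ((1 << NODE_BITS) - 1)
            ms_since_epoch = value >> (NODE_BITS + SEQUENCE_BITS)
            return ms_since_epoch, node, sequence


        # The largest possible ID is exactly the largest signed 64-bit value.
        largest = pack_snowflake((1 << 41) - 1, (1 << 10) - 1, (1 << 12) - 1)
        print(largest == 2**63 - 1)
        ```

        Because the timestamp sits in the high bits, sorting by ID also sorts roughly by creation time.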

    2. 3

      I’ve been using KSUIDs in my latest project, and prefer them over UUID because they are naturally sortable by time in both string and binary representation.
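
      A rough sketch of why that works, assuming the layout as I understand it: a 4-byte big-endian timestamp followed by 16 random bytes, encoded as fixed-width base62. The helper names and the epoch constant are my own assumptions; this is not the Segment library.

      ```python
      import os
      import string

      # Alphabet in ASCII order, so string comparison matches numeric order.
      ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
      EPOCH = 1_400_000_000  # assumed custom epoch, in Unix seconds
      ENCODED_LENGTH = 27    # enough base62 digits for 160 bits


      def make_id(unix_seconds: int) -> bytes:
          """Build a 20-byte ID: 4-byte timestamp + 16 random bytes."""
          return (unix_seconds - EPOCH).to_bytes(4, "big") + os.urandom(16)


      def to_base62(raw: bytes) -> str:
          """Encode bytes as zero-padded, fixed-width base62."""
          number = int.from_bytes(raw, "big")
          digits = []
          while number:
              number, remainder = divmod(number, 62)
              digits.append(ALPHABET[remainder])
          return "".join(reversed(digits)).rjust(ENCODED_LENGTH, "0")


      # Three IDs created at increasing times, listed out of order.
      first = make_id(1_500_000_000)
      second = make_id(1_500_000_001)
      third = make_id(1_600_000_000)
      shuffled = [third, first, second]

      sorted_binary = sorted(shuffled)
      sorted_text = sorted(to_base62(raw) for raw in shuffled)
      ```

      Both `sorted_binary` and `sorted_text` come out in creation order, because the timestamp occupies the most significant bytes and the text encoding is fixed width.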

      1. 1

        That’s like ULID.

        1. 1

          Well, there is no mention of whether ULID has been used in production by the OP, so I would say ULID is like KSUID, not the other way around.

          It seems that the author has independently arrived at the same idea as KSUID, and is either unaware or failed to credit the prior art.

          1. 1

            I think the author of this article is not the author of the ULID spec, but merely has pet projects pertaining to it?

            1. 1

              I haven’t been very successful at determining where ULID came from originally. I found this repo for a Ruby implementation that predates Segment’s blog post, but I don’t know how long they were using KSUID before publishing it.

              Two very similar ideas published around the same time, I guess.

        2. 1

          That KSUID was a fun read. I liked the history on Apollo computer.

        3. 2

          Using base 32 gives you 5 bits for each character. 128 isn’t divisible by 5, so 26 characters carry 130 bits and there are actually a couple of bits spare that could be used for a checksum. If you need people to dictate the IDs, e.g. over a phone, or copy them down, it could be worth allocating a little bit more to a checksum.
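
          The arithmetic, as a quick sketch (the variable names are mine):

          ```python
          import math

          BITS_PER_CHAR = 5  # base 32 encodes 5 bits per character
          ID_BITS = 128

          # Characters needed to hold the whole ID, rounding up.
          chars_needed = math.ceil(ID_BITS / BITS_PER_CHAR)

          # Bits carried by those characters beyond the 128 we need.
          spare_bits = chars_needed * BITS_PER_CHAR - ID_BITS

          print(chars_needed, spare_bits)  # 26 2
          ```

          Two bits only distinguish four values, so a checksum that small would miss a lot of transcription errors, hence the suggestion to spend an extra character or so.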

          1. 1

            Why base62 instead of base64? It’s more expensive to convert, and there are few situations where you can’t have any non-alphanumeric characters.

            (Regular base64 has some problematic characters, but I like the variant that uses hyphen and underscore instead.)
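
            For reference, a small sketch of that variant using Python's standard `base64` module. The two input bytes are chosen only because they hit the characters that differ.

            ```python
            import base64

            raw = b"\xfb\xff"

            # The standard alphabet uses "+" and "/", which are awkward in
            # URLs and file names.
            standard = base64.b64encode(raw)

            # The URL-safe variant swaps them for "-" and "_".
            url_safe = base64.urlsafe_b64encode(raw)

            print(standard, url_safe)  # b'+/8=' b'-_8='

            # A 16-byte ID needs 22 characters once the "=" padding is dropped.
            id_text = base64.urlsafe_b64encode(bytes(16)).rstrip(b"=")
            print(len(id_text))  # 22
            ```

            One caveat: this alphabet is not in ASCII order, so unlike a fixed-width base62 string the encoded form does not sort the same way as the underlying bytes.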