  2. 3

    In my experience, random (type 4) UUIDs are often used instead of sequential integers to prevent someone from iterating over all records of an API resource. For example, if an endpoint like /users/{id} serves unauthenticated user information for frontends, and user records have sequential IDs, then someone can simply iterate over all users and collect that information. This is not possible with random identifiers, but the type 1 UUIDs recommended in the article are not random and are therefore prone to this kind of information disclosure attack.
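    To make the difference concrete, here is a small Python sketch (my illustration, not from the article) using the standard library’s `uuid` module:

    ```python
    import uuid

    # Sequential IDs: the whole key space can be walked with a counter,
    # e.g. GET /users/1, /users/2, ... enumerates every record.
    sequential_ids = range(1, 4)

    # Type 4 UUIDs: 122 random bits, so a valid ID cannot realistically
    # be guessed by iteration.
    random_id = uuid.uuid4()
    assert random_id.version == 4

    # Type 1 UUIDs embed a 60-bit timestamp and the node's MAC address,
    # so values for consecutive records are related and partly predictable.
    time_based_id = uuid.uuid1()
    assert time_based_id.version == 1
    ```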

    There’s a great article from Percona that explains why random UUIDs are bad for performance. Instead of using UUIDs at all one can also try Universally Unique Lexicographically Sortable Identifiers (ULID) which may be better suited for the use as row identifiers. Also, I did make the mistake of using type 4 UUIDs as primary key throughout all tables of a database design and it has shown that the performance is just fine, even for a couple million records. So, performance is really a problem of scale.
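    For reference, a ULID is a 48-bit millisecond timestamp followed by 80 random bits, encoded as 26 characters of Crockford base32. A hand-rolled sketch (for illustration only; use a proper library in real code):

    ```python
    import os
    import time

    # Crockford's base32 alphabet used by ULID (no I, L, O, U).
    # It is in ascending ASCII order, so string comparison matches
    # numeric comparison of the underlying 128-bit value.
    ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

    def ulid() -> str:
        """48-bit millisecond timestamp + 80 random bits, base32-encoded.

        Timestamp-first layout makes the string lexicographically
        sortable by creation time."""
        value = (int(time.time() * 1000) << 80) | int.from_bytes(os.urandom(10), "big")
        chars = []
        for _ in range(26):           # 26 * 5 bits covers the 128-bit value
            chars.append(ALPHABET[value & 0x1F])
            value >>= 5
        return "".join(reversed(chars))

    print(ulid())   # a 26-character, time-sortable identifier
    ```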

    1. 4

      I try to avoid auto-incrementing integers as much as possible, for the simple reason that I want to be able to take the same data, load it into a different database, and have no conflicts.

      Generally speaking, I want “natural keys” as much as possible. If there isn’t a natural key for something, then it’s by definition universally unique and should have a UUID key. :)

      1. 1

        > There’s a great article from Percona that explains why random UUIDs are bad for performance. Instead of using UUIDs at all one can also try Universally Unique Lexicographically Sortable Identifiers (ULID) which may be better suited for the use as row identifiers.

        In their example they have 1 billion rows and benchmark things at thousands of insertions per second. That’s a rather specific use case most people don’t have; if you use a UUID for your users or whatnot then that’s probably just fine, and the performance difference is negligible. For example, all of Lobsters could probably run on UUIDs with no real difference.

        1. 1

          > Also, I did make the mistake of using type 4 UUIDs as primary key throughout all tables of a database design and it has shown that the performance is just fine, even for a couple million records. So, performance is really a problem of scale.

          I already stated in the sentences following your quote that performance problems are unlikely for “small to medium sized” tables.

          1. 2

            > So, performance is really a problem of scale.

            I think I misread that a bit as “UUIDs aren’t webscale!!!” rather than “performance is only a problem at large scale, which probably won’t be an issue for you” 😅

      2. 1

        One thing that tripped me up with UUIDs in a database (Postgres) was two tables that used different UUID versions… one of them had v1 UUIDs, the other v4 UUIDs.

        Then there was a part of the code that sorted on the id column to get a deterministic order for server-side pagination… of course it seemed fine with the v1 UUID table, because those embed a timestamp which provides some sort of natural ordering, but it completely broke with the other v4 UUID table, where the values are random.

        Took a while to track down and realize they were different versions…

        :nightmares:
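        The failure mode is easy to reproduce with Python’s `uuid` module (a sketch of the same byte-ordering behavior, not the actual Postgres setup):

        ```python
        import uuid

        # v1 UUIDs created back to back sort into (roughly) creation order,
        # because they embed a timestamp. It is only approximate: the 60-bit
        # timestamp is stored low/mid/high, so the order breaks whenever the
        # low 32 bits wrap, roughly every seven minutes.
        v1 = [uuid.uuid1() for _ in range(10)]
        print(sorted(v1) == v1)   # True for a quick burst like this

        # v4 UUIDs are random: sorting them says nothing about creation
        # order, so pagination keyed on the id column returns an arbitrary
        # (but stable) order instead of a chronological one.
        v4 = [uuid.uuid4() for _ in range(10)]
        print(sorted(v4) == v4)
        ```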