For anyone who’s wondering what makes them not UUIDv4s, I guess it’s because the first part, 01859DB9-, does not look very random. A UUIDv4, however, is completely random, like:
$ for x in $(seq 5); do uuidgen; done
F61DCF92-863B-4250-98EA-16E417311539
F6411104-AA7F-47D9-B4F2-D6A2B96834B5
A5D25324-E8B6-478F-B626-28ACC7683988
27A8CB5A-2307-467E-A076-8EF5D7C05ED8
D1F034B1-C2B2-43C7-B4C1-DE7550E86FEF
One of the downsides—potentially—is that each ULID leaks the information of when it was created. I mean, this is obviously one of the big selling points of ULIDs, but there might be situations where it’s undesirable.
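To make the leak concrete, here is a minimal sketch of reading the creation time back out of a ULID. It assumes the usual ULID layout (26 Crockford base32 characters, the first 10 of which hold a 48-bit Unix timestamp in milliseconds). The function names are made up for illustration and do not come from any ULID library.

```python
from datetime import datetime, timezone

# Crockford base32 alphabet (no I, L, O, U), as used by ULID.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def ulid_timestamp_ms(ulid: str) -> int:
    """Return the millisecond Unix timestamp stored in the first 10 characters."""
    if len(ulid) != 26:
        raise ValueError("a ULID is expected to be 26 characters long")
    ms = 0
    for ch in ulid[:10].upper():
        # str.index raises ValueError for characters outside the alphabet.
        ms = ms * 32 + CROCKFORD.index(ch)
    return ms


def ulid_created_at(ulid: str) -> datetime:
    """Convert the embedded timestamp into an aware UTC datetime."""
    return datetime.fromtimestamp(ulid_timestamp_ms(ulid) / 1000, tz=timezone.utc)


def encode_timestamp_ms(ms: int) -> str:
    """Hypothetical helper: encode a timestamp as the 10-character ULID prefix."""
    chars = []
    for _ in range(10):
        chars.append(CROCKFORD[ms % 32])
        ms //= 32
    return "".join(reversed(chars))
```

Anyone who sees the identifier can run the equivalent of ulid_created_at on it, with no access to the database needed.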
Good point. I added it to my list of downsides. Thanks!
I came here to say this… The main reason to use random UUIDs is that they are meaningless identifiers and they do not reveal anything else (increment counts, creation dates, node where entity was created, whether one entity was created before/after the other one, entity types, categories, tags or other metadata).
There might be a reason to use ULID sometimes, but when designing a system and identifier scheme, I would rather be a bit paranoid by default and not encode any other data into the identifier unless I have a really good reason to do so.
Nit: only UUIDv4 is random. Versions 1 and 2 definitely leak information, since they encode a timestamp and a MAC address of the node generating the id.
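As a rough illustration of the version 1 leak, the sketch below uses Python's standard uuid module to pull the timestamp and node back out of a v1 UUID. The node value 0x001122334455 is a made-up placeholder rather than a real MAC address, and the helper names are hypothetical.

```python
import uuid
from datetime import datetime, timedelta, timezone

# UUIDv1 timestamps count 100-nanosecond intervals since the Gregorian reform date.
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def v1_created_at(u: uuid.UUID) -> datetime:
    """Recover the creation time embedded in a version 1 UUID."""
    if u.version != 1:
        raise ValueError("not a version 1 UUID")
    return GREGORIAN_EPOCH + timedelta(microseconds=u.time // 10)


def v1_node_hex(u: uuid.UUID) -> str:
    """Return the 48-bit node field (often a MAC address) as 12 hex digits."""
    return f"{u.node:012x}"


# Placeholder node value so the example does not expose a real MAC address.
example = uuid.uuid1(node=0x001122334455)
```

Calling v1_created_at(example) and v1_node_hex(example) returns the generation time and the node, which is exactly the information a random UUIDv4 would not carry.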
Would you choose UUID v6/v7 if they were more stable and prevalent, do you think? Or does the base32 encoding of ULIDs play a significant role in preferring them?
Yeah, maybe, although I do think you still need a better encoding for URLs and such.
Well, canonical hex encoding isn’t enforced in any way. You can use whatever encoding format you want.
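A small sketch of that idea: the same 128 bits can be rendered in a shorter, URL-friendly form and turned back into a UUID when needed. Unpadded base64url is used here purely as one possible choice, not as anything the UUID spec prescribes, and the function names are made up for illustration.

```python
import base64
import uuid


def to_base64url(u: uuid.UUID) -> str:
    """Encode the 16 raw bytes of a UUID as 22 URL-safe characters (padding stripped)."""
    return base64.urlsafe_b64encode(u.bytes).rstrip(b"=").decode("ascii")


def from_base64url(text: str) -> uuid.UUID:
    """Decode the 22-character form back into a UUID by restoring the padding."""
    return uuid.UUID(bytes=base64.urlsafe_b64decode(text + "=="))
```

The canonical hex form is 36 characters, so this saves 14 characters per identifier while the stored value stays an ordinary UUID.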
Creator of https://www.ulidtools.com here - nice site! ULIDS ftw.
Are you sure that these strings are UUIDv4?
“For anyone who’s wondering what makes them not UUIDv4s, I guess it’s because the first part, 01859DB9-, does not look very random. A UUIDv4 however is completely random, like:”
The issue is that the version/variant aren’t correct for UUIDv4. You can see this when you decode them: https://www.uuidtools.com/decode
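The same check can be sketched locally with Python's standard uuid module, which exposes the version and variant fields. The UUID below is one of the uuidgen outputs quoted in this thread. The zero-padded string is a hypothetical stand-in built from the 01859DB9- prefix, since the full original strings are not reproduced here.

```python
import uuid


def looks_like_v4(text: str) -> bool:
    """True if the string parses as a UUID whose version/variant fields say v4."""
    u = uuid.UUID(text)
    return u.variant == uuid.RFC_4122 and u.version == 4


# A uuidgen output from the thread: third group starts with 4, fourth with 8, 9, A or B.
from_uuidgen = "F61DCF92-863B-4250-98EA-16E417311539"

# Hypothetical example: only the 01859DB9 prefix comes from the discussion,
# the remaining digits are zero padding added for illustration.
padded_prefix = "01859DB9-0000-0000-0000-000000000000"
```

looks_like_v4(from_uuidgen) is true, while looks_like_v4(padded_prefix) is false because the version and variant bits are not set the way UUIDv4 requires.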
Thanks! Didn’t know that the version is encoded in the UUID.
Wow, that’s great! Is it open source?
One downside of ULID and UUID v7 is that you can also get bottlenecked in the database, because multiple entries may end up right beside each other (many entries created at the same time). Fully random IDs may land on very different memory pages (or the disk-backed equivalent), so concurrency can be better.
Still, I personally prefer v7, because for my workload the random memory layout of UUID v4 would slow me down.
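To illustrate why time-prefixed IDs cluster, here is a minimal sketch of a UUID v7-style layout (48-bit Unix millisecond timestamp, then version, variant and random bits). It is hand-rolled for illustration, takes the timestamp as a parameter so the behaviour is reproducible, and is not a substitute for a proper v7 generator.

```python
import os
import uuid


def uuid7_sketch(unix_ms: int) -> uuid.UUID:
    """Build a v7-style UUID: timestamp in the top 48 bits, random bits below."""
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    rand_a = rand >> 68                           # top 12 bits
    rand_b = rand & ((1 << 62) - 1)               # low 62 bits
    value = (unix_ms & ((1 << 48) - 1)) << 80     # bits 127..80: timestamp
    value |= 0x7 << 76                            # bits 79..76: version 7
    value |= rand_a << 64                         # bits 75..64: random
    value |= 0b10 << 62                           # bits 63..62: RFC variant
    value |= rand_b                               # bits 61..0: random
    return uuid.UUID(int=value)


# IDs created in consecutive milliseconds (arbitrary base timestamp).
time_ordered = [uuid7_sketch(1_700_000_000_000 + i) for i in range(5)]

# Fully random IDs for comparison.
fully_random = [uuid.uuid4() for _ in range(5)]
```

Here sorted(time_ordered) == time_ordered holds, and the IDs share their leading hex digits, so they would sit next to each other in an ordered index. The entries of fully_random have no such relationship, which is the trade-off described above.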
For anyone curious, this Percona article discusses the performance implications of using UUIDs as row keys.
Their performance benchmark is definitely something
Two other approaches I’ve considered in the past are XID and CUID - I eventually settled on XID for a project.