1. 6
  1.  

  2. 6

    Doesn’t say anything about storage and performance costs. Indexing on enums or foreign keys produces smaller (hence faster) indexes, than indexing long-ish text fields, even if there are words like “comedy” or “waiting”, “active”, “old”, etc.

    1. 3

      Under “Cons” for the foreign key approach:

      Additional tables to maintain — which is a SIGNIFICANT cost

      Let’s imagine adding a new table just to store the list of genders

      I don’t see why creating and populating a reference table with a fixed set of values is significantly more costly to maintain than adding an enum type. Yes, it’s a CREATE TABLE and an INSERT rather than a single CREATE TYPE, but then… you’re done. One additional SQL statement in a migration script doesn’t seem like a significant maintenance burden to me.

      My hunch is that the author was thinking about the “Foreign keys mean users can maintain the list of values” aspect and the significant cost would mostly be adding user-facing functionality to edit the table. But the cost there is in the additional functional requirement (user-maintainable lists of values), not the fact that you’re using a table instead of a type. If you had the same functional requirement with an enum-based implementation, it would be even more costly.

      1. 1

        Tables are also less efficient — accessing a table, even by an efficient primary key type, costs IO in a way an enum or function call does not. And, so does looking up and enforcing foreign keys. As your data grows, foreign keys are the first thing to go — it’s usually not worth the read amplification. But still, I think using a table for this is worth it if you want user-configurable enums and you’re not expecting huge amounts of data.

        1. 3

          In the case of small tables, it’s going to be in the page cache essentially every time. It would be a rare configuration that actually required a syscall, let alone a disk read.

      2. 2

        This is reinforcing my belief that SQL’s type system is really anemic. RDBMSes deserve better.

        1. 1

          Can you give an example what should be possible with better type systems?

          1. 2

            I think it should be possible to define types/“domains” across a DB, and have columns be able to use those. You could even update the types immediately instead of having to deal with ad-hoc constraints or enums. I believe Codd proposed this in RMv2.

            I have a draft sitting around for how one can integrate some PLT ideas into an RDBMS query language - stay tuned…

            1. 1

              Not sure I understand how updating a type is more immediate than adding another value to enum.

              I have a draft sitting around for how one can integrate some PLT ideas into an RDBMS query language - stay tuned…

              That would be really interesting to read! SQL has its warts, and don’t get me started on Elasticsearch or Mongo query “languages”.

              1. 1

                You could update it across all usages. Imagine if say, you had an ML like “you had to handle it” across your UDFs as well, and a migration would only complete as an atomic transaction if all UDFs operating on the function were changed to handle the new type definition.

                1. 1

                  I see, like in Rust, where we have to handle every possible scenario when pattern matching.