1. 23

    “null is not the issue”, followed by replacing it with something in the domain model that represents the fact that the value is not defined, is pretty much what I would call “null is exactly the issue”.

    I don’t think anybody was arguing that null/Nothing/None represent inputs for which no meaningful operation is possible; these states are often unavoidable.

    1. 12

      The article describes a problem I have seen multiple times at real-world software companies. I think people cargo-cult the idea that null is bad and decide to eliminate null without understanding why it’s bad, and end up reintroducing most of the problems they previously had with null.

      1. 6

        (I’m the author) Hi, that was exactly my point :-) People just read that null is bad and move on to even worse solutions (like the Null Object Pattern, NOP). I chose to write the examples in Java because Java developers are the target audience. Optional has a lot of issues, but at least it’s a marker and it’s in the stdlib, so it’s a good first step for people new to this kind of thing.

        1. 5

          Your article does not actually include samples or discussion of the Null Object Pattern. If you’re thinking of this code sample, this is the sentinel value antipattern:

          // Turbo-bad
          String notThere = ""; // or "N/A", etc
          String there = "my string";
          
          1. 1

            You and I and probably everyone in this thread know that, but there are a substantial number of people who would call that “the null object pattern”, and there’s not (to the best of my knowledge) a well-known definition or blog post one can link to that clearly explains the distinction.

            1. 1

              I didn’t include a complete NOP example, to keep things short. I’ve put a link to the Wikipedia page for those interested. As for the article’s goal of having a proper domain model, sentinel values and the NOP share the same problem: they don’t represent missing values at a structural level.

              1. 1

                It depends on the domain, though. Sometimes a null object is a legitimate way to represent absence (e.g. there are cases where an empty list correctly represents that a user hasn’t added anything to that part of their profile).
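
                A rough Java sketch may make the three approaches discussed in this subthread easier to tell apart: a sentinel value, the Null Object Pattern, and a structural marker. Every name here (`SentinelProfile`, `Notifier`, `NoNotifier`, `Profile`, `AbsenceDemo`) is hypothetical and does not come from the article.

                ```java
                import java.util.List;
                import java.util.Optional;

                // Sentinel value: "absent" is an ordinary String the caller must remember to check.
                final class SentinelProfile {
                    static final String NO_NICKNAME = "N/A";

                    static String nickname(boolean hasOne) {
                        return hasOne ? "ferris" : NO_NICKNAME;
                    }
                }

                // Null Object Pattern: absence is an implementation with neutral behavior.
                interface Notifier {
                    int send(String message); // returns the number of messages delivered
                }

                final class EmailNotifier implements Notifier {
                    public int send(String message) { return 1; } // stand-in for a real delivery
                }

                final class NoNotifier implements Notifier {
                    public int send(String message) { return 0; } // deliberately does nothing
                }

                // Structural marker: the return type itself says the value may be missing.
                final class Profile {
                    private final String nickname;      // may be null internally
                    private final List<String> hobbies; // never null; empty means "nothing added"

                    Profile(String nickname, List<String> hobbies) {
                        this.nickname = nickname;
                        this.hobbies = List.copyOf(hobbies);
                    }

                    Optional<String> nickname() { return Optional.ofNullable(nickname); }

                    List<String> hobbies() { return hobbies; }
                }

                final class AbsenceDemo {
                    public static void main(String[] args) {
                        Profile p = new Profile(null, List.of());
                        System.out.println(SentinelProfile.nickname(false)); // looks like real data
                        System.out.println(new NoNotifier().send("hello"));  // silently delivers nothing
                        System.out.println(p.nickname().orElse("(no nickname)"));
                        System.out.println(p.hobbies().isEmpty());
                    }
                }
                ```

                In this sketch only `Profile.nickname()` forces the caller to deal with absence at compile time. The empty `hobbies` list is the kind of case described above, where a neutral value is a faithful model rather than a workaround.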

      2. 3

        Add Optional<T> to that list, since people often mistake it for something other than null in disguise.
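
        One possible reading of “null in disguise”, sketched in Java: an `Optional<T>` reference can itself be null, and calling `get()` on an empty one throws at runtime, much like dereferencing null. `OptionalPitfalls` and `findNickname` are hypothetical names used only for illustration.

        ```java
        import java.util.NoSuchElementException;
        import java.util.Optional;

        final class OptionalPitfalls {
            // Hypothetical lookup; the "buggy" path returns null instead of Optional.empty(),
            // which the compiler accepts without complaint.
            static Optional<String> findNickname(boolean buggy) {
                return buggy ? null : Optional.empty();
            }

            // Returns true if get() on an empty Optional throws NoSuchElementException.
            static boolean emptyGetThrows() {
                try {
                    Optional.<String>empty().get();
                    return false;
                } catch (NoSuchElementException e) {
                    return true;
                }
            }

            public static void main(String[] args) {
                System.out.println(findNickname(true) == null);
                System.out.println(emptyGetThrows());
            }
        }
        ```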

      1. 14

        I think the underselling of the performance penalties of using UUIDs for keys makes this article borderline misleading. You are talking about a massive increase in index storage and, at times, extraordinary cost on complex queries across multiple tables (all those jumbo UUIDs add up); you lose clustering, and you lose the consistent semantics that external tools and systems often expect (read: some search systems; /u/ephess pointed out even GiST/GIN indexes!).

        My main issue with it – this is a classic “works great in dev, explodes in prod” solution. As systems grow, you tend to grow in both data and query complexity, both of which will often bite you with UUIDs in non-linear ways.
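
        For a rough sense of the storage part alone, here is a back-of-the-envelope sketch in Java. It assumes 16-byte binary UUIDs and 8-byte integer keys, counts only raw key bytes, and ignores page overhead, fill factor, and text-encoded UUIDs. The row count and the number of places each key is stored are made-up inputs, not figures from the article, and `KeySizeEstimate` is a hypothetical name.

        ```java
        final class KeySizeEstimate {
            static final int UUID_BYTES = 16;  // a UUID is 128 bits
            static final int BIGINT_BYTES = 8; // a 64-bit integer key

            // Raw bytes spent on key values alone: one copy per row for every
            // column or index that stores the key (primary key, foreign keys, ...).
            static long rawKeyBytes(long rows, int keyBytes, int copiesPerRow) {
                return Math.multiplyExact(Math.multiplyExact(rows, (long) keyBytes), (long) copiesPerRow);
            }

            public static void main(String[] args) {
                long rows = 100_000_000L; // assumed table size
                int copies = 4;           // assumed: primary key index plus three referencing indexes
                System.out.println("uuid bytes:   " + rawKeyBytes(rows, UUID_BYTES, copies));
                System.out.println("bigint bytes: " + rawKeyBytes(rows, BIGINT_BYTES, copies));
            }
        }
        ```

        This only captures the factor-of-two difference in raw key size; the clustering and join costs raised above are not modeled here.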

        1. 5

          I’ve been using UUIDs in production for some time now, with no problems. I would advise against them in tables with lots of rows (let’s say more than 100M) but that’s not that common.

          1. 7

            Well – our production experience differs. I have helped organizations “walk back” UUID primary keys (at non-trivial cost) 4 times in the last 7 years. In each case, they were selected by a developer, not a DBA, early in the dev process for convenience and “safety”, without a real understanding of the costs when you put UUIDs in an X-way join… or when they get into complex stored procs. The scaling is poor because of the way the DB engine leans on those primary keys.

            I am not opposed to UUIDs, I use them all over the place, but NOT in the primary key the DB engine needs to use to do work. You can get most of the benefits of UUIDs with none of the overhead by just having them in a separate field.
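
            A minimal Java sketch of the “separate field” idea, assuming the database assigns a compact numeric primary key and the UUID lives in its own uniquely indexed column. `Account`, `id`, and `publicId` are hypothetical names, not anything from the article.

            ```java
            import java.util.UUID;

            // Hypothetical entity: the numeric id is what the DB engine joins and clusters on;
            // the UUID is only an external handle (URLs, APIs, cross-system references).
            final class Account {
                private final long id;       // primary key, assumed to come from a sequence
                private final UUID publicId; // assumed to sit in a separate unique column

                Account(long id, UUID publicId) {
                    this.id = id;
                    this.publicId = publicId;
                }

                // Pairs an engine-assigned id with a freshly generated random (version 4) UUID.
                static Account create(long id) {
                    return new Account(id, UUID.randomUUID());
                }

                long id() { return id; }

                UUID publicId() { return publicId; }

                public static void main(String[] args) {
                    Account a = Account.create(1L);
                    System.out.println(a.id() + " -> " + a.publicId());
                }
            }
            ```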

            1. 4

              I have a similar experience to robertmeta (although most of my databases have had a fair amount of data in them). There are two types of indices: machine indices and human indices. It’s best to leave the machine indices alone (the ‘id’ column in your table), and to make up whatever other indices you need to feel comfortable. In the cases where you mix these two worlds you often end up with hard-to-reason-about performance issues and a lack of some rare but still useful tools (range queries on id).