1. 17
  1.  

  2. 9

    I’ve dealt with a few database bugs, where after it was fixed, we asked ourselves, could this have been prevented with a constraint. Many times yes. Then we ask, if we had decided to use constraints, would this particular constraint have been one we added? Almost always no.

    Ranges can also be used, for example salary >= 15000 AND salary <= 100000

    This seems like a particularly bad example of a constraint. It might even be one of my strawman examples to prove constraints are a bad idea. All too often constraints are added by somebody who lacks foresight, and then the application says “no soup for you” and nobody can be found to fix the problem. See also falsehoods programmers believe about names, addresses, pretty much anything.

    1. 3

      Agreed.

      In fact I go further and I generally recommend a data-native approach, were UI and importers “accept” data in as natural a format as possible, with as few restrictions/constraints as possible.

      I’m aware how this flies in the face of “best practice” database design which recommends a golden 1NF and trigger-based denormalisation for extra performance, but here’s my reasoning:

      If I capture what the user actually said/emailed/typed/etc, then if I change my opinion about what the application should do with the data, I can simply reprocess it. It’s easy to run-experiments, because I’ve got all that historical data and updates.

      Constraints? Just another input. When I process that native, I’ll have constrains that check for problems, and where there are problems, I can flag them by putting the constraint-violation in the database literally by having a column next to the input record as the status of the import (successful, nope: “error message”, etc).

      This makes it very easy to adjust and review the constraints without blocking input, and without ever asking anyone to “re-upload” or “re-type” that data.

      Should postal codes be limited to numeric? Maybe. Maybe not. Let the user type in whatever they want, and then decide in the application what actually needs to know.

      Oh, here’s a shipping module that calculates shipping prices in the US. It requires numeric postal codes, so we can flag a constraint violation if the user tries to ship using that module to that user – but we still haven’t lost anything. We can decide if we want to manually process the record, or implement support for UK postcodes.

      Here’s another one about phone numbers. Do we need 10 digits and nothing but digits? Maybe, and again: Maybe not. Are we going to call them? Are we going to make a software that calls them? It can simply report that error if it only supports US-numbers, or we can implement ITU numbers or other international dialling prefixes. The business can truly decide what we want to do!

      I’ve had to deal with “messy” data for the last several decades and I’ve found nothing better…

      1. 1

        If I capture what the user actually said/emailed/typed/etc, then if I change my opinion about what the application should do with the data, I can simply reprocess it.

        Are you sure you can? The end user may have made other decisions based on what your software told them in response to their input, at the time they entered it and using the previous rules. If they did something technically wrong on purpose as a work around for a bug in your code, then you change your code and reprocess the original request without that bug, then their workaround may now be wrong.

      2. 3

        Check constraints should be used with caution, but I think unique/not-null and key constraints are pretty useful. Especially with bulk imports from 3rd parties.