1. 13

I started my programming career working on a proprietary ERP (Enterprise Resource Planning) system that still runs on a modified Clipper compiler. Most of my teammates had been working on that system for longer than I had been alive, and over the years they always stressed the importance of writing a “database-centric” backend. What they meant by “database-centric” (where “database” refers to either Microsoft SQL Server or PostgreSQL) was the following:

  1. Views were the contracts; you would write them with your own team and other consumers in mind.
  2. HTTP GET endpoints should hold no logic; the logic belonged in the views.
  3. Inserting and modifying data should go through the HTTP endpoints, to avoid handing out write access to the database (read access to the database was often granted to third parties too).
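
As a rough sketch of rule 1 (table, view, and role names here are all made up for illustration, not from the original system), the published view is the contract and consumers never touch the base tables:

```sql
-- Internal table: free to change as long as the view below keeps its shape.
CREATE TABLE orders (
    id          bigint PRIMARY KEY,
    customer_id bigint NOT NULL,
    total_cents bigint NOT NULL,
    created_at  timestamptz NOT NULL DEFAULT now()
);

-- The contract: a stable set of column names and types.
CREATE VIEW api_orders AS
SELECT id,
       customer_id,
       total_cents / 100.0 AS total,
       created_at
FROM orders;

-- Reads only; inserts and updates go through the HTTP endpoints.
GRANT SELECT ON api_orders TO third_party_readonly;
```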

We followed these rules, and they worked quite well. I recall that we would catch contract breakages quite often because we were also consuming the views we wrote, and performance was easier to tackle as we only had to tune the database (and the tooling for SQL databases is awesome).

For context, I have now spent about as long working as a regular backend software engineer. I write RESTful or gRPC APIs in Go, .NET, and Python, against the same databases as in my previous experience. However, I have noticed that people usually avoid using the database in the same way. Instead of views, we use OpenAPI or Protocol Buffers to define our contracts. Querying data is done through HTTP endpoints instead of views. Direct access to the database is usually discouraged altogether, which I find strange because, in my experience, the database is the piece of the entire stack with the least downtime.

I must confess that I still haven’t had the chance to work with the “database-centric” setup in my more recent roles. For this reason, I want to ask you: have you ever worked this way?

  • What were the main pain-points?
  • Which tools did you use?
  • Would you have done anything differently?

Feel free to describe your experiences with similar setups!

  1.  

    1. 7

      The main pain points of views as interfaces are:

      1. If your consumers can’t write joins themselves, you can end up with views containing a lot of joins, which hurts performance (you can also end up there after changing the data model).

      2. Your consumers need to actually be able to work with a SQL database, or you end up maintaining their queries for them.

      3. Your consumers can write stupid queries that take down the database.

      4. Versioning is hard because you can only have one version of a view at a time unless you resort to search_path hacks, and even that doesn’t help when the data model backing the view changes.
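
      For reference, the search_path approach mentioned in point 4 looks roughly like this in Postgres (schema, table, and role names are illustrative):

      ```sql
      -- Each contract version lives in its own schema; the view name stays the same.
      CREATE SCHEMA api_v1;
      CREATE SCHEMA api_v2;

      CREATE VIEW api_v1.orders AS
          SELECT id, total_cents FROM public.orders;

      -- v2 renames a column without breaking v1 consumers.
      CREATE VIEW api_v2.orders AS
          SELECT id, total_cents AS amount_cents FROM public.orders;

      -- Each consumer picks a version through its search_path:
      ALTER ROLE legacy_consumer SET search_path = api_v1, public;
      ALTER ROLE new_consumer    SET search_path = api_v2, public;
      ```

      As noted above, this only versions the view layer; it does nothing for incompatible changes to the underlying tables.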

    2. 4

      The database-centric setup is fantastic, however:

      • Verbose as heck; expressive code is shorter, and shorter code tends to contain fewer bugs IME
      • Lock contention is hard to solve
      • Vertical scaling has a ceiling (admittedly, a very very very high one which most applications will never see)

      Postgres-specific:

      • Performance is mostly great, but when it isn’t, you have very limited tools to deal with it.
      • Table statistics updates can turn a fast system into a slow one without you deploying any code change.
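
      To illustrate the statistics point: query plans are chosen from table statistics that autovacuum refreshes on its own schedule, so a plan can flip with no deploy involved. You can at least observe and nudge when that happens (table name illustrative):

      ```sql
      -- Refresh statistics explicitly instead of waiting for autoanalyze:
      ANALYZE orders;

      -- Make autoanalyze run more eagerly for a hot table:
      ALTER TABLE orders SET (autovacuum_analyze_scale_factor = 0.01);

      -- Check when statistics were last refreshed:
      SELECT relname, last_analyze, last_autoanalyze
      FROM pg_stat_user_tables
      WHERE relname = 'orders';
      ```
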
      1. 5

        Verbose as heck; expressive code is shorter, and shorter code tends to contain fewer bugs IME

        One counter-point: I work on a Haskell project where some of the developers are much stronger Haskell devs than I am, but I have stronger pg and SQL skills. Several times I’ve written SQL queries and they’ve written a Haskell port, and we’ve done differential testing between them to ensure equivalent behaviour. In all cases the SQL was significantly more compact and succinct than the Haskell – and the Haskell was pretty well written, and fairly dense itself, which is not at all uncommon for Haskell code.

        1. 3

          I’m capable (though not deeply knowledgeable) with window functions, aggregations, recursive CTEs, etc., and there are definitely cases where the SQL version is the shortest and simplest.

          However, there are also often cases where it gets really ugly, and I often find some (much less efficient) ruby is 1/10th the size.

          1. 3

            I totally get that. I come from a ruby background (18 years) before switching my primary application language over to Haskell, so I get how compact ruby can be.

            And yeah, it can get ugly. Although I do think there are approaches to writing a true db-centric application in SQL (w/ Postgres) that also make for surprisingly compact code. I’ve worked on 3 of them now. My most recent one is beautiful to work on:

            • SQL functions for queries mixed with db effects that can be pushed into the DB. Many of these are pure SQL, usually organized as CTEs to break the work down into a sequence of named steps, although plpgsql is used for some problems better stated imperatively.
            • Views that purely query the data we need in an exact shape we need it.
            • All data described with domains, enums and composite types. Very little primitive obsession in tables, views, function arguments, and function return values.
            • Strong domain level CHECK constraints on domains.
            • Strong tuple CHECK constraints on groups of columns with functional dependencies.
            • Unique constraints on the minimal key(s) within each table.
            • Partial unique keys using conditional indexes to assert uniqueness based on some other state being reached.
            • Enums for all category type names that have relatively low cardinality (like a list of some colors).
            • Every table has a surrogate primary key (a generated UUIDv7), and meta columns like created_at.
            • Foreign keys with ON UPDATE RESTRICT and ON DELETE RESTRICT by default.
            • Every table must have at least one natural key (uniquely indexed obviously).
            • Advisory locking for coordinating state changes between distributed jobs (e.g. during multithreaded syncing of data from a remote service).
            • Haskell as the “Glue Language”.
            • Every command or query executed via Haskell has inputs and outputs using strong (and well constrained) Haskell types.
            • The system is architected like a series of state machines for each flow of data.
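
            A few of the items above, sketched as Postgres DDL (all names here are made up for illustration, not from the project described):

            ```sql
            -- A domain with a strong CHECK constraint instead of bare text:
            CREATE DOMAIN email_address AS text
                CHECK (VALUE ~ '^[^@]+@[^@]+$');

            -- An enum for a low-cardinality category:
            CREATE TYPE color AS ENUM ('red', 'green', 'blue');

            CREATE TABLE users (
                id         uuid PRIMARY KEY,               -- surrogate key (UUIDv7)
                email      email_address NOT NULL UNIQUE,  -- natural key
                favorite   color,
                created_at timestamptz NOT NULL DEFAULT now()
            );

            CREATE TABLE addresses (
                id         uuid PRIMARY KEY,
                user_id    uuid NOT NULL REFERENCES users (id)
                               ON UPDATE RESTRICT ON DELETE RESTRICT,
                is_default boolean NOT NULL DEFAULT false,
                created_at timestamptz NOT NULL DEFAULT now()
            );

            -- Partial unique index: at most one default address per user.
            CREATE UNIQUE INDEX one_default_address_per_user
                ON addresses (user_id) WHERE is_default;

            -- Transaction-scoped advisory lock, e.g. to serialize one sync job:
            SELECT pg_advisory_xact_lock(hashtext('user-sync'), hashtext('job-42'));
            ```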

            Anyway, I could go on and on, especially about how we’re testing everything, including the often-overlooked DB side, but maybe in the future. It’s really great to work in this kind of environment. I hope to hear more people talking about this if they’re doing something similar.

            1. 3

              Most of that makes sense to me and is somewhat familiar.

              I’m interested in how you handle changes, particularly to function signatures.

    3. 2

      back in the day at $bigcorp when someone said ‘backend’ they meant the database. everything was in stored procedures and materialized views and I felt it was very difficult to work with.

      1. 2

        Do you mind sharing some of the biggest pains you dealt with? And, if possible, some of the things you liked too.