1. 13
    1. 3

      All but Serializable, which is the strictest, are weak and allow some kind of interference between transactions.

      At the risk of being nitpicky, serializable also allows interference between transactions. Serializable transactions behave as if they were executed in some serial order, but not necessarily the serial order you expect. CockroachDB is one very interesting real-world example of where this can happen: their serializable level can experience a phenomenon called “causal reverse” (see https://www.cockroachlabs.com/blog/consistency-model/) where the serial order doesn’t match what most people would consider the common-sense causal order. Strict serializable is a level that doesn’t allow this kind of anomaly (and is what you get from most single-node serializable DBs, because it’s kinda free on a single node), but even that can have surprising concurrency behaviors when composed into a larger distributed system.

      Semi-related: one great way to understand the anomalies that can happen is to look at the source for Hermitage (e.g. https://github.com/ept/hermitage/blob/master/postgres.md), which clearly demonstrates the anomalies in plain, simple SQL. The academic treatments (Adya et al., Berenson et al.’s “Critique of ANSI SQL…”, etc.) are super useful for theorists and people building databases, but can be very confusing for most developers.
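
      For instance, here’s a sketch of the classic lost-update anomaly in the spirit of Hermitage’s postgres.md, driven from Python with psycopg2 rather than two psql sessions (the “test” database and table mirror Hermitage’s setup; everything else here is a stand-in):

      ```python
      # Lost update under READ COMMITTED (Postgres's default level),
      # assuming psycopg2 and a throwaway local database named "test".
      import psycopg2

      setup = psycopg2.connect(dbname="test")
      setup.autocommit = True
      with setup.cursor() as cur:
          cur.execute("drop table if exists test")
          cur.execute("create table test (id int primary key, value int)")
          cur.execute("insert into test (id, value) values (1, 10)")

      t1 = psycopg2.connect(dbname="test")  # session 1
      t2 = psycopg2.connect(dbname="test")  # session 2
      with t1.cursor() as c1, t2.cursor() as c2:
          c1.execute("select value from test where id = 1")
          v1 = c1.fetchone()[0]             # T1 reads 10
          c2.execute("select value from test where id = 1")
          v2 = c2.fetchone()[0]             # T2 also reads 10
          c1.execute("update test set value = %s where id = 1", (v1 + 1,))
          t1.commit()                       # T1 writes 11
          c2.execute("update test set value = %s where id = 1", (v2 + 1,))
          t2.commit()                       # T2 also writes 11: T1's update is lost

      # At REPEATABLE READ or SERIALIZABLE, T2's update would instead fail
      # with "could not serialize access due to concurrent update".
      ```

      (Hermitage runs the same interleavings directly in two psql sessions, which is even easier to follow.)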

      1. 2

        That’s terrifying — as you said, “it seems to go against the very definition of what a transaction is.” The only databases I’ve used seriously are SQLite and LMDB/MDBx, which don’t have this problem. (Admittedly they don’t allow concurrent writes, but they do allow concurrent reads during a write transaction.)

        I assume performance is the reason weak isolation exists? How much slower is it to use strong isolation in Postgres?

        1. 2

          I assume performance is the reason weak isolation exists? How much slower is it to use strong isolation in Postgres?

          Scalability, too (to add to David’s answer). In a distributed database, serializability requires coordinating with all partitions that were read (to detect R-W conflicts), whereas snapshot isolation, for example, requires coordinating only with partitions that were written (because only W-W conflicts matter). In workloads that do a bunch of reads for each write (super common in OLTP), that difference can matter a lot!

          1. 2

            To add a point nobody has mentioned yet, Serializable transactions are handled very differently in most databases.

            A serializable transaction may not be able to complete, because another transaction completed first and invalidated it. Most databases, if they can’t fit the transaction into a serial order, will abort it with a specific error and return that error to you.

            You’re then supposed to retry serializable transactions manually.

            One example: suppose you had a table of pixels in an image, and one transaction turns all white pixels black while another turns all black pixels white. Under all but serializable isolation, you might get something you didn’t want, but it’ll proceed. Under serializability, only one of those transactions can succeed, as the other’s results are no longer valid against the underlying dataset. So you need to retry the other one, to ensure that what you wanted is still correct, in a new transaction.
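
            For the pixel example, a minimal retry loop might look something like this (a sketch assuming psycopg2; the table name, transaction body, and retry budget are all made up):

            ```python
            # Retry-on-serialization-failure pattern, assuming psycopg2.
            import psycopg2
            from psycopg2 import errors, extensions

            def run_serializable(conn, txn_body, max_attempts=5):
                conn.set_session(isolation_level=extensions.ISOLATION_LEVEL_SERIALIZABLE)
                for attempt in range(max_attempts):
                    try:
                        txn_body(conn)   # do the reads and writes
                        conn.commit()    # the conflict can surface here too
                        return
                    except errors.SerializationFailure:  # SQLSTATE 40001
                        conn.rollback()  # another transaction won; replay ours
                raise RuntimeError("still conflicting after %d attempts" % max_attempts)

            # e.g. one of the two competing pixel-flipping transactions:
            def whites_to_black(conn):
                with conn.cursor() as cur:
                    cur.execute("update pixels set color = 'black' where color = 'white'")

            run_serializable(psycopg2.connect(dbname="test"), whites_to_black)
            ```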

            1. 1

              I assume performance is the reason weak isolation exists? How much slower is it to use strong isolation in Postgres?

              Like everything else, it depends on the workload. If you have a transaction that reads one field and updates another, then it will abort if something else writes to the field it read at the same time, and will then need replaying (or it will block the second transaction until the first completes; I’m not sure how optimistic Postgres’ concurrency model is). In most cases, this is fine.

              Imagine you have fields A, B, and C. If you read A and then write C depending on it, then, logically, this transaction is unordered with respect to one that writes A: it could happen before or after, if they both arrive concurrently, and so there’s no need to have them isolated. If it reads A twice, that’s a problem. If it reads and writes A, then it will be implicitly ordered either before or after the transaction that writes A, because there will be a write conflict.

              Guaranteeing read isolation means either rolling back or serialising more transactions, depending on the underlying implementation.

              As I understand it, Postgres uses MVCC and acquires locks for things that are written (though, I believe, this can be done differently in different storage layers). This means that more transactions will end up hitting locks (though, hopefully, R/W locks for the reads, so they’ll still execute in parallel). This means that the transaction that writes to A will block execution of all of the transactions that read A and write C. If the transactions read A and add a new entry derived from it, rather than modifying the existing one, then all of these can execute in parallel without read isolation. For extra fun, if a transaction reads A twice and doesn’t modify anything, then it still can’t execute in parallel with the write, which may mean that writes really hurt tail latency in read-heavy deployments.

              1. 2

                Postgres implements MVCC using a write-ahead log, so it’s basically copy-on-write. It doesn’t need read locks; it just reads an old version of the row, so it gets a consistent view of the database regardless of concurrent writes — if you set a high enough transaction isolation level. This is why Postgres has a vacuum process: it needs to clean up and reclaim the space used by old versions of rows.
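
                A quick way to see the old-version reads in action (a sketch, again assuming psycopg2 and a throwaway test table holding the row (1, 10)):

                ```python
                # MVCC snapshot reads at REPEATABLE READ, assuming psycopg2
                # and a test table containing the row (id = 1, value = 10).
                import psycopg2
                from psycopg2 import extensions

                reader = psycopg2.connect(dbname="test")
                reader.set_session(isolation_level=extensions.ISOLATION_LEVEL_REPEATABLE_READ)
                writer = psycopg2.connect(dbname="test")

                with reader.cursor() as rc, writer.cursor() as wc:
                    rc.execute("select value from test where id = 1")
                    print(rc.fetchone())   # (10,) -- this pins the reader's snapshot
                    wc.execute("update test set value = 99 where id = 1")
                    writer.commit()        # new row version, visible to new snapshots
                    rc.execute("select value from test where id = 1")
                    print(rc.fetchone())   # still (10,): the old version, no read lock
                reader.commit()
                ```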

                Postgres uses predicate locks to detect serialization conflicts, but they cause aborts rather than blocking. There are other kinds of locks that can block, but they are mostly explicit in the SQL, e.g. SELECT FOR UPDATE.
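
                To see the blocking flavor (a sketch assuming psycopg2 and a hypothetical accounts table): the second session simply parks on the row lock taken by SELECT … FOR UPDATE and resumes once the first commits, rather than aborting.

                ```python
                # Explicit row-lock blocking via SELECT ... FOR UPDATE,
                # assuming psycopg2 and a hypothetical accounts table.
                import threading, time
                import psycopg2

                def lock_holder():
                    conn = psycopg2.connect(dbname="test")
                    with conn.cursor() as cur:
                        cur.execute("select balance from accounts where id = 1 for update")
                        time.sleep(2)      # hold the row lock for a while
                    conn.commit()          # committing releases the lock

                t = threading.Thread(target=lock_holder)
                t.start()
                time.sleep(0.5)            # let the other session grab the lock first

                conn = psycopg2.connect(dbname="test")
                with conn.cursor() as cur:
                    # Blocks for roughly 1.5s on the row lock, then proceeds;
                    # no serialization error is raised.
                    cur.execute("select balance from accounts where id = 1 for update")
                    print(cur.fetchone())
                conn.commit()
                t.join()
                ```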

                1. 1

                  The WAL is not the same thing as the multiple row versions kept in the heap pages, and it does not relate to vacuuming.

                  1. 1

                    Oops, thanks for the correction!