1. 54

In response to https://lobste.rs/s/tpvf1r/you_do_it_too_forfeiting_partition_tolerance_in_distributed_systems

  1.  

  2. 20

    People seem not to realize (or not to want to realize) that CAP is a mathematical theorem which is rigidly true in every circumstance and cannot be talked around or ignored. It’s simply a fact. In the presence of network partitions, your system will sacrifice consistency or availability (or, in a no-doubt worryingly large number of cases, both); the most you can do is pick. (It is a safe bet, by the way, that availability is the wrong choice [edit: to keep; that was entirely unclear, sorry].)

    (As an amusing-to-me aside, CA is what every system should be in the absence of partitions. If your system cannot deliver both consistency and availability even when everything is going right, it is worse than useless.)

    1. 14

      As an amusing-to-me aside, CA is what every system should be in the absence of partitions.

      Sometimes, for latency reasons, you might want lower consistency even when the network is fully connected. Figuring out when you want that, though… maybe an open research problem, haha.

      1. 2

        As an amusing-to-me aside, CA is what every system should be in the absence of partitions.

        As aphyr said, you may want to do this for latency reasons. For example, theres PA/EL in Abadi’s PACELC taxonomy. Many traditional architectures with replications trees of relational databases offer these tradeoffs, as do many “quorum”-based database systems.

        Along with latency, there’s also scale. Again, with relational databases it’s fairly common to have async replicated read replicas take read-only traffic, or to have multi-master configurations with async replication. These systems choose A over C entirely for performance reasons, and may not actually intend to be available under partitions. In fact, many choose a “bounded staleness” model, where they stop returning data after some known staleness, which is not possible to achieve with full availability under partitions. These kinds of systems - very sensible systems - are neither C (linearizable) or A (fully available) under partitions.

        1. 2

          This is true. Actually, the extreme strength of the notion of consistency Brewer used (that is, linearizability) is a point that can be used to argue against the conclusions of the CAP theorem, because depending on the data model, many systems can be meaningfully consistent without full linearizability.

          I’m not aware of any work to prove (or disprove) the CAP theorem for different notions of consistency, though I would conjecture that the lower bound on consistency possible while maintaining availability is uselessly low.

          1. 4

            I’m not aware of any work to prove (or disprove) the CAP theorem for different notions of consistency

            I suggest http://www.vldb.org/pvldb/vol7/p181-bailis.pdf, which includes a handy summary of impossibility results for various consistency models.

          2. 1

            I can not remember the name good enough to find it in google, but MSR had an interesting paper trying to figure this out to some degree. Pileaus or something.

            1. 1

              This? http://research.microsoft.com/en-us/projects/capcloud/default.aspx

              (looks like your memory is totally CA.)

              1. 1

                Yep, that’s it! That only does these adaptive things on reads.

        2. 4

          Thank you for taking the time to understand what the upstream author was saying. It moved so fast that it was hard to pay attention to all of it, and I completely missed some of the bizarre claims, like “Many CA systems are not CP”. (As you say - indeed; in fact, none of them.)