1. 67
  2. 26

    Preventing stale reads requires coupling the read process to the oplog replication state machine in some way, which will probably be subtle–but Consul and etcd managed to do it in a few months, so the problem’s not insurmountable!

    Consul and etcd implement Raft, which is a proven consensus algorithm. Part of the issue with MongoDB, from what I have seen following it at a distance, is that they seem to be building their own distributed systems algorithms, and they don’t appear to have the talent to accomplish it.

    And remember: always use the Majority write concern, unless your data is structured as a CRDT! If you don’t, you’re looking at lost updates, and that’s way worse than any of the anomalies we’ve discussed here!

    Does this actually help you in MongoDB? I am under the impression MongoDB does not support CRDTs so it will simply drop writes, as shown in the analysis.

    On an emotional note, it’s so distressing reading about MongoDB. It would be one thing if they were just reimplementing the last 30 years of database technology because of NIH. But they are reimplementing it wrong. Yet it is massively popular. These are the things that make me depressed about the software industry and want to move to a farm.

    1. 9

      +1 for wanting to move to a farm. And not just because some people are doing stupid things, but because well-seasoned developers are being ignored. It’s no longer that people just want to do the right thing; it’s that people want to be doing something so long as there is motion! More lines of code, more bug trackers, more issues fixed, more complexity, more features… fewer problems solved.

      1. 4

        But MongoDB is web scale.

      2. 7

        Have I ever told you of the merits of raising goats?

        1. 2

          Consul and etcd implement Raft which is a proven consensus algorithm.

          There really aren’t any proven consensus algorithms running in the wild – and absent a system which mechanically and correctly translates proofs into code, there probably won’t be. Etcd and Zookeeper both have had consistency bugs, despite their theoretical backgrounds, owing to implementation errors. The often forgotten part of every distributed system is the ability of each and every human implementor of code in the critical path to have fully understood all of the possible failure cases at the time they wrote the code.

          CRDTs can be application-resolved so every database ‘supports’ CRDTs, with smart enough applications. The implementation is often profoundly unpretty, though.

          1. 7

            There really aren’t any proven consensus algorithms running in the wild – and absent a system which mechanically and correctly translates proofs into code, there probably won’t be.

            This is exactly why I said the algorithm is proven, not the implementations. MongoDB is not even running a theoretically proven algorithm; it appears to be a patchwork of attempts to get something working.

            CRDTs can be application-resolved so every database ‘supports’ CRDTs, with smart enough applications

            How is this possible if the database drops your writes?

            1. 2

              Sorry, I thought you were using the argument to authority, and wanted to highlight the difference.

              CRDTs don’t have anything to do with consistency in the face of failed writes; they’re merely a technique for resolving differences between two apparently correct values with data structures.

              1. 2

                CRDTs don’t have anything to do with consistency in the face of failed writes; they’re merely a technique for resolving differences between two apparently correct values with data structures.

                If I’m dropping writes then I’ve lost the other value, which is the problem.

                1. 0

                  That’s an orthogonal problem. Every database can drop writes given sufficient partition; that doesn’t stop some from having CRDT implementations.

                  1. 0

                    Your statement was:

                    CRDTs can be application-resolved so every database ‘supports’ CRDTs, with smart enough applications.

                    The application cannot resolve anything if it does not have all of the writes because they have been dropped by the database.

                    So no, it is not an orthogonal problem.

                    Every database can drop writes given sufficient partition

                    Dropping writes doesn’t have to have anything to do with partitions; it’s about accepting a write and then throwing it away.

                    1. 2

                      I feel like you’re wilfully ignoring the causal arrow in my statements, so I’m ending the conversation. Good luck!

                      1. 1

                        I am sorry you feel that way. You could simply explain how a CRDT helps when the database is discarding writes.

                2. 2

                  CRDTs don’t have anything to do with consistency in the face of failed writes; they’re merely a technique for resolving differences between two apparently correct values with data structures.

                  They are data structures that consistently resolve causally parallel modifications into a single successor. If your DB doesn’t natively support them or expose the conflicts in any way, you cannot apply that technique.

                  For example, if you have a value A in the DB, and it then accepts two writes with that single ancestor (call them A' and A''), and then internally resolves the conflict to either of them, where do you apply the CRDT merging logic?
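
                  To make that concrete, here is a minimal sketch using a grow-only counter (G-Counter), one of the simplest CRDTs. The class name, the replica ids `r1` and `r2`, and the "store picks one sibling" behaviour are all assumptions for illustration; none of this is MongoDB's API or its actual conflict handling.

                  ```python
                  class GCounter:
                      """Grow-only counter: one slot per replica, merged by element-wise max."""

                      def __init__(self, counts=None):
                          self.counts = dict(counts or {})

                      def increment(self, replica, n=1):
                          # Return a new counter; the ancestor is left untouched.
                          updated = dict(self.counts)
                          updated[replica] = updated.get(replica, 0) + n
                          return GCounter(updated)

                      def merge(self, other):
                          # Element-wise max is commutative, associative, and idempotent.
                          keys = self.counts.keys() | other.counts.keys()
                          return GCounter(
                              {k: max(self.counts.get(k, 0), other.counts.get(k, 0)) for k in keys}
                          )

                      def value(self):
                          return sum(self.counts.values())


                  # Common ancestor A, then two concurrent writes A' and A''.
                  a = GCounter().increment("r1", 3)
                  a1 = a.increment("r1")       # A'  (hypothetical replica r1 adds 1)
                  a2 = a.increment("r2", 2)    # A'' (hypothetical replica r2 adds 2)

                  # Case 1: the store hands both siblings back, so the merge can run.
                  merged = a1.merge(a2)

                  # Case 2 (assumed behaviour): the store keeps one sibling and discards the other.
                  kept = a2

                  print(merged.value(), kept.value())
                  ```

                  In the second case there is nothing left to merge: the increment carried by A' is gone before any application code sees it, so the merge function never gets a chance to run.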

              2. 3

                CRDTs can be application-resolved so every database ‘supports’ CRDTs, with smart enough applications. The implementation is often profoundly unpretty, though.

                You can only use them if the DB offers some sort of control over conflict merging, right? If it drops the conflicts on the floor or just resolves to an arbitrary write, CRDTs would not help you in any way.

                1. 1

                  That’s what CRDTs are: a conflict resolution mechanism. You can implement CRDTs with dumb pencil and paper, if you like.

                  1. 0

                    But you need the conflicts in order to resolve them, which you don’t have if the database throws them out.

                    1. 1

                      How on earth is this statement incorrect?

            2. 13

              This post is excellent and I wish everything I read had this level of rigor.

              1. 4

                If you’re into this kind of anomaly analysis, you may also want to check out Martin Kleppmann’s Hermitage tool: [github] [blog post]