1. 55
  1.  

  2. 19

    Never use a database because ‘everybody else does’; do your own research into the advantages and drawbacks of a particular database.

    The gist of the blog post.

    Also, something worth mentioning is that the writer of this post was associated with LulzSec.

    1. 4

      One day we’re international cybercrime villains, the next, we’ve grown up to write blog posts about databases

      1. 2

        Be careful with the information in this post. MongoDB was trolled a while back on Hacker News and the CTO of 10gen (MongoDB’s former name) disputed a lot of the claims that were made.

        1. 1

          Yeah, as a particular example in “… has locking issues (sources: 4)” the source linked is talking about table locking issues in MySQL when changing schemas. Nothing to do with MongoDB.

          Personally, I quite like Postgres as a JSON store …

          1. 3

            Yeah, as a particular example in “… has locking issues (sources: 4)” the source linked is talking about table locking issues in MySQL when changing schemas. Nothing to do with MongoDB.

            I don’t think that description of source #4 is correct. It’s definitely about MongoDB and moving away from MySQL+Mongo to Postgres, and only mentions MySQL schema locks in a single paragraph, not the entire article, nor as the cause of the move. Note that there is still not a crystal clear link between “has locking issues” and the citation though. I can only assume that the OP is implying that the following (or most of them) are caused by locking issues:

            For example, at some point in time we had to remove about a million documents from MongoDB and then re-insert them later on. The result of this process was that the database went in a near total lockdown for several hours, resulting in degraded performance. It wasn’t until we performed a database repair (using MongoDB’s repairDatabase command). This repair itself also took hours to complete due to the size of the database.

            In another instance we noticed degraded performance of our applications and managed to trace it to our MongoDB cluster. However, upon further inspection we were unable to find the actual cause of the problem. No matter what metrics we installed, tools we used or commands we ran we couldn’t find the cause. It wasn’t until we replaced the primaries of the cluster that performance returned back to normal.

            These are just two examples, we’ve had numerous cases like this over time. The core problem here wasn’t just that our database was acting up, but also that whenever we’d look into it there was absolutely no indication as to what was causing the problem.

            Either way, the author of the cited article definitely had some problems with MongoDB and trouble diagnosing them, which is worth noting in itself, even if we consider the link to locking unclear.

      2. 28

        If data have structure store Postgres. If data no have structure store Hadoop. If data no have value store Mongo.

        © 2013 Big Data Borat

        https://twitter.com/bigdataborat/status/338092796374294528

        1. 11

          This resonated a lot. I also worry about the people who choose Mongo because they want to store a lot of deeply nested JSON objects, who seem to be the majority of the people I’ve seen defending Mongo. In my experience this will also lead to a lot of pain. JSON does not have type validation, objects might not have all keys, keys might represent different things at different times, and good luck enforcing uniqueness constraints… If there were a sane way to move away from it in HTTP APIs (Protobufs? RPCs? Go structs on both ends?) I’d love to see it.
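
          To make the pain points above concrete, here is a rough Python sketch of the checks an application ends up writing by hand when the store does not enforce them. The schema (an integer `id`, a unique string `email`) and the `validate_user` helper are hypothetical, made up purely for illustration.

          ```python
          import json


          def validate_user(doc, seen_emails):
              """Return a list of problems found in one decoded JSON document."""
              problems = []
              required = {"id": int, "email": str}  # assumed schema, illustration only
              for key, expected in required.items():
                  if key not in doc:
                      problems.append(f"missing key: {key}")
                  elif type(doc[key]) is not expected:
                      actual = type(doc[key]).__name__
                      problems.append(f"{key}: expected {expected.__name__}, got {actual}")
              # Uniqueness has to be tracked by the application itself.
              email = doc.get("email")
              if isinstance(email, str):
                  if email in seen_emails:
                      problems.append(f"duplicate email: {email}")
                  seen_emails.add(email)
              return problems


          raw = (
              '[{"id": 1, "email": "a@example.com"},'
              ' {"id": "2", "email": "a@example.com"},'
              ' {"email": 3}]'
          )
          seen = set()
          report = [validate_user(doc, seen) for doc in json.loads(raw)]
          ```

          All three documents parse as valid JSON, yet only the first one passes. A relational schema would typically express the same rules declaratively with column types, NOT NULL and UNIQUE constraints.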

          1. 4

            Maybe Apache Thrift?

          2. 4

            Irresponsible software loses data. Film at 11.

            1. 3

              I don’t understand why Elasticsearch is hardly ever mentioned as a document store. If you’re going off RDBMS it’s normally because you need to denormalize some data into documents for more complicated querying. Elasticsearch can store denormalized JSON and index/query in powerful ways. Canonical data = RDBMS. Denormalized data for complicated querying: Elasticsearch.

              1. 2
                1. 3

                  This is surprisingly hilarious, I haven’t laughed that hard in ages!

                  It’s also at http://www.mongodb-is-web-scale.com/, but the video is so worth it!

                  P.S. BTW, the farm/ranch retiring is for real – http://dtrace.org/blogs/wesolows/2014/12/29/fin/.

                  1. 5

                    If it’s funny the first time, it’s funny every time! :D

                  2. 1

                    The value is when you need automated horizontal scaling. Last time I looked (admittedly a few years ago) there was nothing else that did consistent hashing for you for free, so you could just completely automatedly spin up new nodes in response to load (on AWS or the like) and add/remove them without needing a human to be awake.

                    If you’re not using that feature then sure, don’t use it, there are a lot of things wrong with it. But that’s the use case, and “just use postgres” doesn’t cut it.
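
                    For readers unfamiliar with the consistent hashing mentioned above, here is a minimal, generic Python sketch of the technique. It is not MongoDB's (or any other database's) actual sharding code; the `HashRing` class, the node names and the virtual-node count are all assumptions chosen for illustration.

                    ```python
                    import bisect
                    import hashlib


                    def _hash(key):
                        # Stable 64-bit ring position derived from MD5 (not used for security).
                        digest = hashlib.md5(key.encode("utf-8")).digest()
                        return int.from_bytes(digest[:8], "big")


                    class HashRing:
                        """Toy consistent-hash ring with virtual nodes (hypothetical helper)."""

                        def __init__(self, nodes=(), vnodes=64):
                            self.vnodes = vnodes
                            self._points = []  # sorted ring positions
                            self._owner = {}   # ring position -> node name
                            for node in nodes:
                                self.add(node)

                        def add(self, node):
                            # Each node claims several positions so load spreads more evenly.
                            for i in range(self.vnodes):
                                point = _hash(f"{node}#{i}")
                                self._owner[point] = node
                                bisect.insort(self._points, point)

                        def remove(self, node):
                            self._points = [p for p in self._points if self._owner[p] != node]
                            self._owner = {p: n for p, n in self._owner.items() if n != node}

                        def lookup(self, key):
                            # First ring position clockwise from the key's hash, wrapping around.
                            idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
                            return self._owner[self._points[idx]]


                    ring = HashRing(["node-a", "node-b", "node-c"])
                    keys = [f"user:{i}" for i in range(1000)]
                    before = {k: ring.lookup(k) for k in keys}

                    ring.add("node-d")  # scale out by one node
                    after = {k: ring.lookup(k) for k in keys}
                    moved = [k for k in keys if before[k] != after[k]]
                    ```

                    The point of the ring is that adding `node-d` only reassigns the keys that now belong to `node-d`; everything else stays where it was, which is what makes adding and removing nodes without human intervention practical.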

                    1. 9

                      Last time I looked (admittedly a few years ago) there was nothing else that did consistent hashing for you for free, so you could just completely automatedly spin up new nodes in response to load (on AWS or the like) and add/remove them without needing a human to be awake.

                      In what way do Riak and Cassandra not provide this?

                      1. 1

                        I haven’t had the chance to use Riak (it would have been immature at the time). Maybe it replicates the mongodb featureset now.

                        In my limited, non-scientific experience (different admin teams at different companies) I’ve seen Cassandra have more issues than mongodb. A claim I’ve heard is that while a correctly-configured Cassandra will work correctly, very few real-world Cassandra clusters are correctly configured.

                    2. 1

                      Overheard the other day “MongoDB is the Snapchat of DBs”.

                      All joking aside, Mongo can be used somewhat safely and effectively as a cache. For anything else, though, it just brings back painful memories of BerkeleyDB going corrupt for no reason after months of sane behaviour. Everything old is new again.
