If these trade-offs seem familiar, they’re straight from the worse is better essay. It turned out that correctness, simplicity of the interface, and consistency are the wrong metrics of goodness for most users.
I would say it is more likely that people who care about these things prefer PostgreSQL.
I evaluated RethinkDB twice at two different jobs and wound up not choosing it because it wasn’t fast enough.
I’m a Postgres guy who wouldn’t touch MongoDB with a ten foot pole. I was only evaluating RethinkDB because Aphyr’s evaluation showed it was correct and wouldn’t lose my data. But in the end I need both speed and correctness. I put RethinkDB in the “look at this later” category, and I think a lot of other people did too.
Seconded. I heard an interview a while back with one of the RethinkDB folks, talking about the realtime push capabilities. I thought the same thing I think every time I hear about a new database: “That sounds cool, but… is it going to lose my data?”
It’s cool to be able to push out updated data as soon as it’s available. But not if the data isn’t reliable. And if I have to use something else as my primary data store to guarantee reliability, then unless it supports push notifications, I’m back to polling the primary store.
I’m not even close to being a database admin (even though I wrote “Protect Your Data with PostgreSQL Constraints”). I don’t know much about things like sharding and clustering. I can’t explain all the details of ACID. But I know I want my database to never lose data or store incorrect data, and my default (perhaps not perfectly informed) choice is always going to be the thing I trust to do that. Nothing will pry my fingers off PostgreSQL unless it convinces me it’s at least as reliable - that needs to be Point 1 before I start caring.
(If for some reason I don’t need reliable storage, e.g. for a cache, I might pick something else.)
Speaking as a Postgres bigot, I would very much be interested in a different data store that addressed different pain points than Postgres, if it were as well put together as Postgres.
This is a great example of the maxim that who you compete against is really decided by your users, not you.
Yeah, and he’s saying palpable speed is a pain point, which is a BIG problem if you’re not even a relational DB.
I actually think RethinkDB’s strategy would have worked if they had had, say, 10 years of runway or so. But that’s a lot of years.
Unfortunately for startups in the area, it’s difficult to pull off a real, from-scratch, successful database implementation in much less time than that, I think. MySQL is a counterexample that did manage it, but imo they had a lot of in-the-right-place-at-the-right-time luck. The original MySQL incarnation wasn’t even (at all) well regarded as a DB, but it was open source and downloadable just at the time when open source, x86 commodity servers, and the web were all taking off, and they had an almost comically opposite foil in Oracle, as the primary incumbent competition in that space. That’s a pretty difficult set of conditions to repeat!
Postgres is taken as something of a gold standard nowadays, and its runway was closer to 15 years, admittedly most of that in academia, which sometimes works at a different pace. The first prototype came out in the late ’80s, the first open-source release in 1994, the first port to use SQL as a query front-end in 1996, and it began to gain serious traction with actual users, slowly gaining on MySQL’s space, by my estimation somewhere starting in the early 2000s (I don’t remember it having serious mindshare in the first dot-com boom).
I remember it coming up in a comparison of databases some time in 1999 or maybe early 2000 but don’t remember why it didn’t make the cut: we ended up having to use Informix (IIRC) because it actually had transactions whereas MySQL didn’t.
RethinkDB was also AGPL, as I recall. That is a bit of a hard sell to businesses when PostgreSQL is a viable alternative, regardless of the fact that the drivers were all Apache-licensed and 99.9% of people aren’t going to be modifying the database code themselves anyway.
Honestly, I have always been a bit surprised MongoDB has done as well as it has, given that it also uses the AGPL license. I chalk it up to Mongo lucking out and being associated early on with the JS-everywhere/Node hype train stack.
Both are dual-licensed under AGPL and a commercial license. I don’t think the expectation was that many large companies would use the AGPL version, but that they’d license the commercial version for enterprise use. Basically a variant of the traditional GPL-as-poison-pill / commercial-license-as-alternative model, but with a stronger brand of poison. Certainly that’s what most large users of MongoDB are doing.
I understand that, but I’m not sure many businesses would be comfortable with it regardless, even presuming they had people that likewise understood that.
Apparently I am not alone in thinking it can occasionally be a problem either.
EDIT: Further, it seems like it was super hard (if not impossible?) to actually buy a non-AGPL license, which I am sure didn’t help.
Note that this is from a folder called “_drafts”.
It’s since been moved to https://github.com/coffeemug/defstartup/blob/master/_posts/2017-01-18-why-rethinkdb-failed.md.
<mondaynightquarterback>I didn’t see it stated outright in here, though it was danced around a bit, but it seems like they didn’t build a product that solved a problem their perceived customer base had. The author mentions that they chose the wrong metrics and that they felt resentment every time MongoDB released and got praise, but the fact that the lack of sales didn’t spur (at least according to this account) the company into finding a product their customers did want seems like why they failed. MongoDB, after all, started as a company building a cloud solution, then AppEngine (I think) blew them out of the water, so they changed direction. Sounds like RethinkDB stuck to its guns even though the lack of sales was telling it not to.</mondaynightquarterback>
On a small other note: I always really disliked the name “RethinkDB”. For me, that implies risk and experimentation, which are the last things I want in a production database system.
To me “mongo” never sounded very reassuring either, but maybe it’s a language thing. See: Swedish politically incorrect slang.
Funny, I’ve thought the same thing - it sounds a little too close to some rather insulting terms in English (I’m a native English speaker).
It has the same connotation in German.
In Spanish, at least where I lived in Peru, “mongo” is slang for “stupid”.
Specifically, the term it sounds close to in English is a racial slur (although an old and hopefully-disused one) used to suggest low intelligence. I’d imagine that’s why it has cognates in so many languages.
It’s not a racial slur as such - it’s an old name for Down Syndrome that gets used as a slur. (The reason it was used as a name for the syndrome in the first place was the racism of the time, but the current usage isn’t really racial AIUI)
Yes, that’s what I was trying to say. I’m not particularly happier about it having been an ableist slur after it was a racist one, but mostly I wanted to point out the history, because for words where this stuff happened a long time ago, a lot of people simply have no idea.
It’s Javanese for “hello”, which always struck me as a nice name until I learned that it wasn’t implemented in Java.
Mongo only pawn in game of life
Reminds me of mongoloid in English slang.
The post is now published and available at http://www.defstartup.org/2017/01/18/why-rethinkdb-failed.html
“RethinkDB is not a good choice if you need full ACID support or strong schema enforcement—in this case you are better off using a relational database such as MySQL or PostgreSQL.” When I hear “DB”, a relational database comes to mind. Perhaps this was part of the problem with their adoption. I did not learn until today that Rethink wasn’t a relational database.
I think you might just be behind the times. MongoDB has DB in the name. The Apache Cassandra Wikipedia entry calls it a database. Most people I interact with don’t think ‘relational’ when they hear ‘DB’.
Parts of the Amazon shopping cart make use of Dynamo. I’m guessing things have moved on a lot in ten years, but the same principles still apply. Stock checks at their scale are not an easy problem to solve (particularly if you’re tweaking C, A and P as you go!).
Amazon actually doesn’t use Dynamo at all. Confusingly DynamoDB is not a Dynamo implementation.
I wasn’t aware that they didn’t use Dynamo? I know that DynamoDB implements some of the concepts of Dynamo but is quite different underneath, but I thought they still used their own internal implementation of the concepts set out in the Dynamo paper.
I know that DynamoDB implements some of the concepts of Dynamo
DynamoDB implements NONE of the concepts in the Dynamo paper. It’s a completely different architecture.
I thought they still used their own internal implementation of the concepts set out in the Dynamo paper
There was a Dynamo implementation, and it sucked. Anyone who used it moved off it as fast as possible due to stability and scalability issues. It was never anywhere close to the quality of external Dynamo implementations like Riak or Cassandra.
Fair enough, but reading Werner Vogels’ DynamoDB announcement would suggest otherwise:
We concluded that an ideal solution would combine the best parts of the original Dynamo design (incremental scalability, predictable high performance) with the best parts of SimpleDB (ease of administration of a cloud service, consistency, and a table-based data model that is richer than a pure key-value store). These architectural discussions culminated in Amazon DynamoDB, a new NoSQL service that we are excited to release today.
the best parts of the original Dynamo design
It should read “the best features of the original Dynamo design”, as neither the incremental scalability nor the predictable high performance of DynamoDB has anything to do with Dynamo. In fact, those two features in DynamoDB come from an explicit rejection of the Dynamo ring concept.
In my opinion predictable high performance is NOT a feature of the Dynamo paper at all, as the Dynamo ring is by nature unpredictable. Even quality Dynamo implementations like Cassandra are unpredictable when you have to do things like upgrades because you don’t know what will happen to the ring when a node leaves. The theory is everything will work itself out, but in practice it’s pretty scary unless you’ve heavily over-provisioned your cluster relative to your normal state workload.
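To make the “what happens to the ring when a node leaves” worry concrete, here is a minimal sketch of a consistent-hash ring with virtual nodes. It is a toy under stated assumptions, not how Dynamo, Cassandra, or Riak actually implement their rings: the `HashRing` class, the node names, the MD5-based tokens, and the choice of 8 virtual nodes are all made up for illustration.

```python
import bisect
import hashlib
from collections import Counter


def token_for(value: str) -> int:
    """Map a string to a position on the ring (MD5 is an arbitrary choice here)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring; each node owns several tokens (virtual nodes)."""

    def __init__(self, nodes, vnodes=8):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (token, node) pairs
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.vnodes):
            bisect.insort(self._ring, (token_for(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [(t, n) for t, n in self._ring if n != node]

    def owner(self, key):
        if not self._ring:
            raise ValueError("ring is empty")
        # First token at or after the key's token, wrapping around the ring.
        idx = bisect.bisect_right(self._ring, (token_for(key), ""))
        return self._ring[idx % len(self._ring)][1]


def handoff_on_removal(nodes, leaving, keys, vnodes=8):
    """Return ownership before and after `leaving` departs, plus who inherited its keys."""
    ring = HashRing(nodes, vnodes)
    before = {k: ring.owner(k) for k in keys}
    ring.remove(leaving)
    after = {k: ring.owner(k) for k in keys}
    inherited = Counter(after[k] for k in keys if before[k] != after[k])
    return before, after, inherited


if __name__ == "__main__":
    keys = [f"key-{i}" for i in range(1000)]
    _, _, inherited = handoff_on_removal(["a", "b", "c", "d"], "c", keys)
    # How the departed node's keys were split among the survivors.
    print(dict(inherited))
```

Only the keys owned by the departing node move, which is the part of the theory that works. How that load is split among the survivors depends entirely on where the tokens happen to fall, so one neighbour can inherit far more than the others, which is one reason over-provisioning helps.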
There was a Dynamo implementation, and it sucked. Anyone who used it moved off it as fast as possible due to stability and scalability issues
Is there a source for this? I’ve always wondered about what the state of Dynamo is like inside Amazon.
Primary source: I used to work on DynamoDB.
Interesting that the difficulty of making money from open source is never mentioned. Compare to, for example, MarkLogic.
To add to that, I about got burned in this same space recommending FoundationDB. It was the only thing close to Google’s database at $50 a machine or something. That had a good chance of sticking around even in legacy mode. Apple bought it and took it off the market. (Shakes head)
Yeah, avoid proprietary stuff for anything critical where possible as there’s a bunch of extra risk. Next best is using open protocols or formats where you can at least switch vendors. Gotta run tests on each ahead of time, though, for incompatibilities that can bite you big later.
There is no risk free path as either a vendor or consumer.