Since the Spanner public API launched, there has been a deluge of content that ignores the single most important quality of a large-scale storage system: the price tag. Sure, you can chuck everything behind a Paxos group and stack a few fractal iterations of such availability-boosting replication, which Google does, but most orgs don’t care enough about C+A to accept 9-25x write amplification. Google is an advertising company, so they do.
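For what it's worth, here is one way a 9-25x range could arise. The comment does not spell out the arithmetic, so this is an assumed reading: two nested replication layers, each with a factor of 3 to 5, multiply together. The function name and the layer factors are illustrative, not taken from any Spanner documentation.

```python
from math import prod


def write_amplification(replication_factors):
    """Physical writes per logical write when replication layers are nested.

    Each layer multiplies the writes of the layer above it, so the total
    is the product of the per-layer replication factors.
    """
    return prod(replication_factors)


# Assumed layering: a consensus group replicated N ways, sitting on top of
# a storage layer that itself keeps N copies of every write.
print(write_amplification([3, 3]))  # 9
print(write_amplification([5, 5]))  # 25
```

The cost multiplies rather than adds, which is why each extra layer of replication gets expensive so quickly.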
Don’t settle for it in what use case? At what scale? At what cost?
There is a world of other trade-offs in database systems, especially distributed ones, that the author is ignoring.
The Gilbert/Lynch CAP “theorem” is based on the same conceptual error as the much older Fischer, Lynch, Paterson “theorem”. If you assume your system has no upper bound on how slow a response can be, then you can trivially prove that it’s impossible to correctly detect a partition or a device failure. However, when reasoning about engineered systems, that abstraction makes the model useless.
The main idea is that in the asynchronous model an algorithm has no way of determining whether a message has been lost, or has been arbitrarily delayed in the transmission channel.
This is only true if, e.g., an RPC that has not completed in 1 second is equivalent to one that has not completed after a week. In both cases, in the model of an “asynchronous network”, all we know is that a response has not arrived yet; by the definition of asynchronous, the sender cannot determine whether there is an error condition or whether the destination is just taking its time to send the response. But that’s just a flaw in the model, not an interesting observation about real networks.
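To make the point concrete, here is a minimal sketch of a timeout-based failure detector. It assumes the engineering decision argued for above: any node silent for longer than some bound is treated as failed. The class name, method names, and the 1-second bound are hypothetical, and timestamps are passed in explicitly so the behavior is deterministic.

```python
from dataclasses import dataclass, field


@dataclass
class TimeoutFailureDetector:
    """Suspects a node once it has been silent for longer than timeout_s."""

    timeout_s: float
    last_seen: dict = field(default_factory=dict)

    def heartbeat(self, node, now):
        # Record the time at which we last heard from this node.
        self.last_seen[node] = now

    def suspected(self, node, now):
        # A node we have never heard from, or one that has been silent
        # for longer than the bound, is treated as failed.
        seen = self.last_seen.get(node)
        return seen is None or (now - seen) > self.timeout_s


fd = TimeoutFailureDetector(timeout_s=1.0)
fd.heartbeat("replica-a", now=0.0)
print(fd.suspected("replica-a", now=0.5))         # False
print(fd.suspected("replica-a", now=7 * 86400.0)) # True
```

In the purely asynchronous model the second answer could be wrong, since the node might just be slow. The argument above is that a system which declares "silent for a week" to be a failure is making a sensible engineering choice, not violating a theorem.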
See also http://www.yodaiken.com/2016/01/14/more-on-fischer-lynch-patterson-and-the-parrot-theorem/
Not really clear to me what the author means. Should everyone just use Spanner? There isn’t anything else out there like Spanner (although CockroachDB is trying).
I don’t think there’s a production-ready equivalent that you can run yourself (closed or open source).
FoundationDB had a bunch of the guarantees, minus the SQL interface. Then Apple bought it and shut it down right away (side note: how terrifying is the idea of your database software no longer being available?).
CockroachDB doesn’t seem quite there yet. I really want it to be.
Coincidentally, there is a paper that has a very similar name, but a vastly different topic:
This Spanner release looks like a huge blow to CockroachDB, though the price is very high.
Reminder that nothing is tradeoff-free.
Reminder that you’ll have to structure your data a certain way not to run into a throughput wall (true of many databases).
Reminder to read the Spanner paper to find these things out.
That said, Spanner seems dope.