For those who are interested, Chas will be speaking about CRDTs at Papers We Love NYC on Thursday, May 15th.
@cemerick and I have been discussing CRDTs and coordination (particularly, the tuplespace style of coordination, since the article mentioned tuplespaces) and it seems like that conversation can merge (heh) with the discussion here.
Can CRDTs be used for coordination?
Two examples to think about:
A distributed service for brokering 2 player chess games. Each player should be assigned to exactly one other player, subject to the constraint that the assignments are mutual. In other words, the service creates two-element sets that are all disjoint from each other. Can CRDT sets handle this kind of coordination? (A tuplespace approach to that example: , but this is not an AP system.)
A distributed service for unique assignment of tasks to workers. Two workers should not work on the same task. If a worker dies, the task should be assigned to another worker. Simply put, each task is performed exactly once. (In tuplespace terms: .)
Just to clarify the @QuiltProject tweet in question, what I was suggesting is not a coordination or consensus mechanism baked into the semantics of CRDTs (that is definitionally excluded). Of course, one can use an out-of-band consensus mechanism to inform reconciliations that are not possible within their constraints; this is the explicit advice in the literature (and has been used in practice), but that is unattractive to me for a variety of reasons.
There’s no reason why particular computational services might not require more constrained semantics, while still using the shared CRDT substrate as a reliable communications medium. For example, it would not require much novelty to operate a consensus protocol on top of a CRDT, with the leader acknowledging certain proposed writes/operations (“blessing” them, you might say) by signing them (or otherwise indicating assent, if e.g. the CRDT is being operated in an implicitly trusted environment).
I’m really excited to see that Chas has gone public with the Quilt Project and given it a name, and I’m reading this article eagerly, because it seems like it goes into a lot more depth than what he’s explained to me previously.
Are CRDT counters idempotent? (Or, do CRDTs always have to be idempotent?)
Because, I don’t understand how they can be. If they are and anyone can point me towards some writing or any implementation then please do.
It depends. CRDT counters as implemented in Riak are not idempotent, but for good reason.
If every client was an actor in a PN-Counter, and you wanted every game player of your mobile ‘phone game to increment the count, you would have a very wide PN-Counter (an entry per actor.) If these clients all read a CRDT from a database with a RYOW consistency level (or store their own writes locally, durably), then you can have idempotent counters, by dint of the fact that the merge function (the LUB) of a CRDT is idempotent.
Carlos Baquero et al did some work around this limitation in this paper: http://arxiv.org/abs/1307.3207
In Riak, we bound the number of actors to be the replica vnodes for a data item. This means that many, many, clients send operations to Riak (inc/dec.) without changing the size of the counter. The downside is “counter drift.”
If Riak receives an operation, and executes it on a vnode, but fails to replicate to W-1 vnodes, or if Riak succeeds, but the client never receives an “OK!” Then the operation is a partial success. This is indistinguishable from an actual failure from the client’s perspective. If the client resends the operation and there was a failure, OK! But if there was a partial success then we double counted. This is positive counter drift. If the user choses not to resend the op and there was a real failure, this is negative drift.
How to make this system model idempotent is the subject of current work. Trivially, you can make an idempotent counter CRDT like this:
If the client never re-uses the id, it can replay the op safely. Downside? That’s going to be a big set.
I’ve seen implementations where an ID lasts for some period of time, and is then dropped. And so you keep a set of ids + some roll up integer per actor, and you’re idempotent for failures that last for some period of time, after which retrying will lead to drift.
In general the update/mutate function on state based CRDTs are not idempotent, but the merge function is. If you send CRDTs between nodes in your system (including the clients) and you can always RYOW (either client local storage, or from a database) then you have idempotent operations.
Usually they are not. At least the ones implemented by Basho for Riak are not.
Yes, the operations performed over CRDTs are always idempotent. Riak’s counters (implemented here AFAIK) are state-based PN-counters, described in Section 3.1.3 of the Shapiro et al. CRDT techreport.
Each actor (or client, in the case of Riak) maintains its own count of increment and decrement operations; upon read, these are merged to yield the current value of the counter. The idempotency isn’t that an actor can issue an increment multiple times, and only the first will be acknowledged; it’s that one actor’s impact on the counter is not applied multiple times despite repeated replications / merging of its state.
If you do want actors to only be able to increment once, then you can build a CRDT counter comprised of two sets, one to track increments, the other to track decrements; getting the counter’s value requires taking the difference of the sets' cardinalities. An increment in “userland” would add an element to the increment set with a tag uniquely identifying the actor, likewise for decrements.
(I don’t work for Basho, so perhaps someone who does can correct me if I’m wrong somehow.)
I like this article, but I think that it assumes that all APIs are RPC. Many are, but ‘real REST’ is different enough that it tackles a lot of the things the author has beef with.
It would be nice for the author to not belabor the API, but the statefulness of historical RPC mechanisms. We all should be talking about how systems are composed using protocols, which ‘real REST’ is an instance of. APIs are fine when failure has be abstracted away, either by being rare or by being handled for you.
(Author here, hi.)
I picked on HTTP APIs a.k.a. “REST” the most because that’s what people use. I don’t believe that ‘Real REST’ addresses any of the substantive issues I described that are rooted in the RPC heritage and the programming models within which we might implement REST today. It is quite different than RPC, but still shares many of the same failings.
‘Real REST’ does provide a set of semantics that are more useful than e.g. HTTP APIs, but they either (a) continue to deny the fundamental nature of the network, or (b) leave the details up to every individual implementing a REST service or client. It also opens up a whole new can of worms with “hypertext” (viz. HATEOS) which is both under- and over-specified as a data representation and a mechanism for coordinating activity between two actors. Speaking of “two actors”, REST is predicated on two-party client/server interactions, and all that that entails. I could go on, but then it’d be another blog post. :-)