CASPaxos is a replicated state machine (RSM) protocol, an extension of Synod. Unlike Raft and Multi-Paxos, it doesn’t use leader election and log replication, thus avoiding associated complexity.
I’ve not fully digested the paper–I’ve only read it once, but this could be a fairly big deal, if peer review pans out. (I’ve not tried to understand the proofs, myself yet, but they exist, as does 2 independent TLA+ specs, apparently)
The paper doesn’t explicitly mention it, but I do have to wonder if the set of side effect free functions can be anything, or somehow must be constrained (say, to CRDTs). Like, would it be possible to implement the rich data types of Redis (hashes, lists, sets) etc without requiring additional constraints?
The Acceptors obviously don’t run that function until confirmation, so I believe it doesn’t actually matter. Especially when you consider the other aspect that the provided update function (for the CAS register), doesn’t introspect the value, only the ballot. And, I guess delete out of a list is just an update function, handled no differently than anything else (e.g. keeping the ballot number around that made the change). Delete of the whole key, of course, requires the GC run mentioned–which I also thought was super clever!
The other aspect of this not mentioned is if all nodes in the system fail. Paxos deals with this by having the entire log to replay. Raft, Zab, and all the others, too. CASPaxos leaves this unspecified, so you’d better run your update function in a transaction or something if you need that ability….