Interesting idea, but focusing the discussion entirely on data loss and ignoring the other potential operational impacts of failures (especially non-catastrophic ones) seems a bit short-sighted.
For example, consider a single machine failure in a simple quorum-based system using copysets. That machine's replica-set “partners” will see a larger load increase than they would in a system that distributed data via random replication, because there is by definition more overlap between the individual machines' workloads. That is, the operational impact of a non-catastrophic machine failure scales inversely with the “scatter width” of the machines involved.
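To make the load argument concrete, here's a toy simulation (the node counts, chunk counts, and placement scheme are my own illustration, not from the paper): when one node fails, each of its chunks is picked up by a surviving replica, and we compare the worst-case extra load under disjoint copyset-style placement (scatter width 2) versus fully random replication.

```python
import random
from collections import Counter

def max_recovery_load(placements, failed, rng):
    """Max number of extra chunks any single surviving node absorbs when
    `failed` goes down and each of its chunks is re-served by one
    randomly chosen surviving replica."""
    extra = Counter()
    for replicas in placements:
        if failed in replicas:
            partner = rng.choice([r for r in replicas if r != failed])
            extra[partner] += 1
    return max(extra.values()) if extra else 0

rng = random.Random(0)
N, R, CHUNKS = 12, 3, 600

# Copyset-style placement: disjoint groups of R nodes (scatter width R-1).
groups = [tuple(range(i, i + R)) for i in range(0, N, R)]
copyset_placement = [rng.choice(groups) for _ in range(CHUNKS)]

# Random replication: any R of the N nodes (scatter width up to N-1).
random_placement = [tuple(rng.sample(range(N), R)) for _ in range(CHUNKS)]

copyset_load = max_recovery_load(copyset_placement, failed=0, rng=rng)
random_load = max_recovery_load(random_placement, failed=0, rng=rng)
print(copyset_load, random_load)  # each copyset partner absorbs far more
```

With only R-1 partners to share the failed node's data, each copyset partner absorbs a multiple of what any node absorbs under random placement, which is exactly the scatter-width effect described above.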
At large scale (on the order of 10^4 nodes) this isn’t going to matter so much, but most people deploying these systems (including the authors' customers) are probably operating at much smaller scales, where non-catastrophic failures are a much bigger part of day-to-day systems management.
N.B.: I haven’t read the paper.
We (I’m one of the authors on the linked article) introduced the notion of predecessor width precisely to enable this kind of capacity planning. If catastrophic events are unlikely, or not a problem, you can set a high predecessor width, increasing the number of nodes from which each node recovers in the event of failure. Decreasing the predecessor width lowers the chance of catastrophic loss, but also reduces the number of nodes each node can draw on for recovery.
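As a back-of-the-envelope illustration of that tradeoff (a toy model of my own, not the article's actual analysis): if each node's data lives on itself plus R-1 of its W ring predecessors, then raising W raises the recovery fan-in but also the number of distinct replica groups, and with it a rough upper bound on the chance that R simultaneous failures wipe out some group entirely.

```python
from math import comb

def predecessor_width_tradeoff(N, R, W):
    """Toy model: each of N nodes replicates to R-1 of its W ring
    predecessors. Returns (recovery fan-in, rough upper bound on the
    chance that R simultaneous random failures destroy some group)."""
    distinct_groups = N * comb(W, R - 1)            # upper bound, ignores overlap
    loss_chance = min(1.0, distinct_groups / comb(N, R))
    return W, loss_chance

for W in (2, 4, 8):
    print(predecessor_width_tradeoff(N=100, R=3, W=W))
```

Running this shows the loss bound growing with W while recovery fan-in grows in lockstep, which is the knob the comment above describes.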
HyperDex will automatically heal itself when a node goes offline (and comes back), so it (hopefully) won’t wake you up at 3am.