Looks interesting. Funny though; I ended up looking at the “Contact Us” page and noticed that all of the authors actually seem to be associated with the same university.
I wonder whether other universities, and the people associated with them, tend to “brand” projects this aggressively, or whether it’s more of a culture or volume thing.
I think HyperDex is unique in this regard. Most research projects have a small web page, and seem dormant soon after the last publication. The end deliverable is a paper, and the code is an artifact of that.
Early on, we realized there was an opportunity to have real users for our project, and put in the effort to make it more than just a hacked-together prototype. This has paid off in spades, as it generates new and interesting research directions that we would not have otherwise thought about had we kept it a small research project with no outside users.
We just released the latest version (1.4.0), which brings performance improvements and a few bug fixes. More information can be found in the release announcement.
I have seen a little talk on the mailing list about multi-datacenter replication; have you sketched out a plan for it at all? Riak and Cassandra have a very solid multi-datacenter story: their data model lends itself to write-availability and merging of data. But HyperDex takes the strong-consistency approach, which suggests you’ll have to pay some cost in either latency or availability.
KC2DQN here. Somewhat inactive recently, except for volunteering and community service work.
I will cop to still being somewhat unclear on the precise definition of linearizability, even after reading aphyr’s original Jepsen articles, the wiki article, watching this, etc. Does anyone have a good resource they would recommend? Is the original paper very readable?
Here’s the best one-line definition I tell people: “Each operation appears to take effect atomically at some point between its invocation and its conclusion.”
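That one-liner can be turned into a brute-force check on a tiny history. This is only an illustrative sketch (the history, times, and function names are all made up, and enumerating permutations is exponential, so it only works for toy examples):

```python
from itertools import permutations

# Each operation: (label, invoke_time, respond_time, kind, value).
# A tiny single-register history with hypothetical times.
history = [
    ("w1", 0, 3, "write", 1),  # write 1, in flight during the read
    ("r1", 1, 2, "read", 1),   # read returns 1 while w1 is pending
]

def linearizable(ops, initial=0):
    """Brute force: try every total order; accept one that respects
    real-time order (a precedes b whenever a responds before b is
    invoked) and register semantics (each read sees the latest write)."""
    for order in permutations(ops):
        pos = {op[0]: i for i, op in enumerate(order)}
        # Real-time constraint: non-overlapping ops keep their order.
        if any(pos[a[0]] > pos[b[0]]
               for a in ops for b in ops if a[2] < b[1]):
            continue
        # Sequential register semantics along the chosen order.
        val, ok = initial, True
        for _, _, _, kind, v in order:
            if kind == "write":
                val = v
            elif v != val:
                ok = False
                break
        if ok:
            return True
    return False

print(linearizable(history))  # True: r1 can take effect after w1

# A stale read that starts after w1 has already responded is not OK.
bad = history + [("r2", 4, 5, "read", 0)]
print(linearizable(bad))      # False
```

The key point the definition makes is that each operation gets to take effect at *some* instant inside its invocation/response window, which is exactly the freedom the permutation search exploits.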
The original paper’s not bad if you’re used to reading papers. I recommend reading only the first 3-4 pages if you are looking for the definition of linearizability.
What’s the paper you’re referring to?
Fortunately computer science preprints are relatively easy to find on the web, rescrv is probably referring to either the first link or the followup:
I think that this article shows a very interesting application of a technique known as “derandomization”. Usually, derandomization is when you take a randomized algorithm and remove all sources of randomness. In this application, we see it only partially derandomized, to great effect: copysets improve reliability by reducing overall randomness, but still leverage randomness to approximate the solution to an NP-complete problem.
I wonder if other algorithms would benefit from partial derandomization?
I think the best way to describe the insights of copysets and chainsets is that they add constraints to achieve a better result. As I understand it, derandomization helps in the case where sources of randomness are sparse and the algorithm requires more randomness than is available.
Both techniques are limiting the amount of randomness used, but they’re doing it for different reasons. One is optimizing the outcome; the other is reducing resource usage.
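The permutation-based construction from the copysets paper can be sketched in a few lines. This is a simplified illustration, not HyperDex's implementation: the function names and parameters are invented, and real systems also handle membership changes, load, and rack awareness. The point is where the randomness lives: one shuffle up front, then fully deterministic placement.

```python
import random

def make_copysets(nodes, r=3, num_permutations=2, seed=0):
    """Copyset-style replica groups: shuffle the node list a few
    times and chop each shuffle into chunks of size r. The shuffle
    is the only source of randomness in the whole scheme."""
    rng = random.Random(seed)
    copysets = []
    for _ in range(num_permutations):
        p = list(nodes)
        rng.shuffle(p)
        copysets += [tuple(sorted(p[i:i + r]))
                     for i in range(0, len(p) - r + 1, r)]
    return copysets

def place(key, copysets):
    """Placement is deterministic: a key always maps to one of the
    precomputed groups, never to an arbitrary r-subset of nodes."""
    return copysets[hash(key) % len(copysets)]

nodes = range(9)
sets_ = make_copysets(nodes, r=3, num_permutations=2)
print(len(sets_))  # 6 replica groups, versus C(9,3) = 84 under random replication
```

With fully random replication almost every r-subset of nodes eventually holds some chunk, so losing any r nodes simultaneously loses data; restricting placement to a handful of precomputed groups is the “partial derandomization” that buys the reliability.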
Another model I thought of to conceptualize the insight in this paper is that they move the randomization in the algorithm out of the runtime and into the initialization. It’s a deterministic algorithm at runtime, so they’ve reduced the scope of when randomization affects the behavior.
I think that’s a wrong characterization. Copysets and chainsets are about picking replica sets in a way that provides protection against correlated failures. There’s no “runtime” or “initialization” time inherent to either algorithm.
In fact, the way we use chainsets in HyperDex hides the details from every component except the system coordinator. Clients and servers simply see replica sets and have no knowledge of how those sets were chosen. It’s entirely possible to swap random replication into the place of chainsets, and see zero difference at runtime.
Interesting idea, but focusing the discussion entirely on data loss and ignoring the other potential operational impacts of failures (especially non-catastrophic ones) seems a bit short-sighted.
For example, consider a single machine failure in a simple quorum-based system using copysets. The replica set “partners” of that machine will experience a greater load increase than they would in a system that distributed data via random replication, since there is by definition more overlap between the individual machines' workloads. That is, the operational impact of non-catastrophic machine failures scales inversely with the “scatter width” of the machines in question.
At large scale (thousands of machines) this isn’t going to matter so much, but most people deploying these systems (including the authors' customers) are probably operating at much smaller scales, where non-catastrophic failures are a much bigger part of day-to-day systems management.
N.B.: I haven’t read the paper.