1. 10

  2. 7

    One can imagine a scenario where one of the nodes has a latency higher than its election timeout, causing it to continuously start elections.

    I have used Raft in production and can confirm this is a real thing that happens. Here’s an issue on the etcd repo discussing the problem: https://github.com/coreos/etcd/issues/7970. Basically: “a 5-node cluster can function correctly if two nodes are down. But it won’t work if one node has slow disk.” Which is a weird failure mode!

    Also, I found this post valuable. I’d never read the Raft website or paper, but this appeared in my RSS reader and it got me to actually learn more about how Raft works!

    1. 3

      Using Raft’s PreVote extension should alleviate this issue, though etcd’s current PreVote implementation could be more stable. See https://github.com/coreos/etcd/issues/8501, https://github.com/coreos/etcd/pull/8517, https://github.com/coreos/etcd/pull/8288 and https://github.com/coreos/etcd/pull/8334.
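
A minimal sketch of the PreVote idea, to make the mechanism concrete. Before bumping its term, a would-be candidate asks its peers whether they would vote for it; a peer refuses if it has heard from a leader within the election timeout or if the candidate's log is behind. This is a simplified illustration, not etcd's implementation: `PeerState`, `grants_pre_vote`, and `may_start_election` are hypothetical names, and the term check is deliberately coarse.

```python
from dataclasses import dataclass


@dataclass
class PeerState:
    """What a voter knows when a pre-vote request arrives (hypothetical fields)."""
    term: int
    last_log_term: int
    last_log_index: int
    ms_since_leader_contact: float


def grants_pre_vote(peer, proposed_term, cand_last_log_term,
                    cand_last_log_index, election_timeout_ms):
    """Return True if this peer would be willing to vote for the candidate."""
    # A peer that still hears from a leader sees no reason for an election.
    if peer.ms_since_leader_contact < election_timeout_ms:
        return False
    # Reject candidates proposing a term older than the peer's own.
    if proposed_term < peer.term:
        return False
    # The candidate's log must be at least as up to date as the peer's:
    # compare the last term first, then the last index.
    return (cand_last_log_term, cand_last_log_index) >= (
        peer.last_log_term, peer.last_log_index)


def may_start_election(peers, proposed_term, cand_last_log_term,
                       cand_last_log_index, election_timeout_ms):
    """Only a majority of pre-votes lets the candidate increment its term."""
    cluster_size = len(peers) + 1  # the peers plus the candidate itself
    votes = 1 + sum(
        grants_pre_vote(p, proposed_term, cand_last_log_term,
                        cand_last_log_index, election_timeout_ms)
        for p in peers)
    return votes > cluster_size // 2
```

With this check in place, a slow node whose peers are all still in contact with the leader fails the pre-vote round and never increments its term, so it cannot force the healthy leader to step down.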

    2. 3

      I guess I’m grouchy today, but is there a name for blog posts that just lazily rephrase the existing, excellent, and even interactive (!) literature (the-raft-website) to position oneself as an expert on the subject? This seems like that sort of post.

      1. 4

        This is actually something that I thought about when I was writing this. I think that this post does actually add to this in the form of the list of problems at the end - I haven’t seen anyone talking about those before (let me know if I missed something though!). Maybe I should have just done a blog post on that, instead of including the more basic stuff.

        Also - I think that this post covers stuff that isn’t on the Raft website. They have a brief explanation of what consensus is, a visualization, and the paper. I don’t see anything there that is a more plain-English explanation of how it works and why it works that way (although the paper is very close). I could be missing something though.

        1. 3

          I don’t really intend to hammer on this (or at least didn’t, I suppose I am at this point), but /u/Irene suggested I give more concrete feedback, so here goes:

          Problem: Latency) Latency is not “briefly mentioned in the paper”; it is dealt with precisely in the same section where you pulled the inequality you cite: “The broadcast time should be an order of magnitude less than the election timeout so that leaders can reliably send the heartbeat messages required to keep followers from starting elections; … The broadcast time and MTBF are properties of the underlying system, while the election timeout is something we must choose. Raft’s RPCs typically require the recipient to persist information to stable storage, so the broadcast time may range from 0.5ms to 20ms, depending on storage technology. As a result, the election timeout is likely to be somewhere between 10ms and 500ms.” In any case, configuring the election timeout is an issue with running an implementation of Raft, not with writing one, so you’d refer to something like the etcd time parameters page (which, incidentally, by default uses exactly one order of magnitude difference between heartbeat time and election timeout, exactly as the paper advises).
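
The rule of thumb quoted above can be written as a one-line check. This is only a sketch: `timeout_has_headroom` is a hypothetical helper, the factor of 10 stands in for "an order of magnitude", and the sample values are illustrative numbers taken from the ranges in the quote rather than measurements or the defaults of any particular system.

```python
def timeout_has_headroom(broadcast_ms, election_timeout_ms, factor=10.0):
    """Check that the election timeout is at least `factor` times the broadcast time.

    `factor=10.0` encodes the "order of magnitude" guidance; it is an
    assumption of this sketch, not a hard limit from the paper.
    """
    return election_timeout_ms >= factor * broadcast_ms


# Illustrative values only.
fast_disk = timeout_has_headroom(broadcast_ms=0.5, election_timeout_ms=10)
slow_disk = timeout_has_headroom(broadcast_ms=20, election_timeout_ms=150)
tenfold = timeout_has_headroom(broadcast_ms=100, election_timeout_ms=1000)
```

Here `fast_disk` and `tenfold` satisfy the check, while `slow_disk` does not: a 20 ms broadcast time calls for an election timeout of at least 200 ms under this rule.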

          Problem: Sync) fsync has been reliably able to flush disk caches without modifying hdparm/sdparm since at least 2012, and while it is true that “in the real world, disk failures happen”, Raft is 100% OK with that. It’s not a problem.
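
For reference, this is roughly what persisting a record before replying to an RPC looks like at the system-call level. It is a sketch under assumptions: `durable_append` and the log file layout are hypothetical, and whether the data truly reaches stable media still depends on the OS, filesystem, and device honoring the flush.

```python
import os


def durable_append(path, record):
    """Append `record` (bytes) to `path` and ask the OS to flush it to storage."""
    with open(path, "ab") as f:
        f.write(record)
        f.flush()              # move data from Python's buffer to the OS
        os.fsync(f.fileno())   # ask the OS to push it to the storage device
```

A Raft node would call something like this before acknowledging an AppendEntries or RequestVote RPC, which is why the broadcast time in the paper includes the cost of a write to stable storage.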

          It feels like you were trying to brainstorm problems where there aren’t any, especially because, as you say, you’ve never implemented Raft.

          Otherwise, saying that a plain English explanation is missing seems disingenuous when we’re talking about an English language paper/algorithm that exists precisely to be (and is celebrated for being) easy to understand. It’s not a real problem anyone has.

        2. 3

          It’s probably not within most people’s understanding of spam, but it’s definitely self-promotion. I think there’s enough disagreement around where the lines are there that it’s helpful to explain why you didn’t find the post valuable, especially since the author is a lobste.rs user.