You never really do. There are a lot of algorithms that are proven correct using one interpretation of the paper or text description, but are broken using another reasonable interpretation of the text. One great value of the statement of Raft in Coq or TLA+ is that it provides an very precise definition of what the protocol really is, which can be very useful to implementers who care about correctness and don’t want to misinterpret the paper.

Proving things like this has a few challenges (1) does the thing I have implemented match what other humans call Raft? (2) do the invariants I have specified (or constraints, or proof goal, etc) match what I am trying to prove (linearizability)? (3) is my proof correct?

Of these, (2) is the easiest, (3) is tractable with peer review, and (1) is pretty hard. The advantage of writing down Raft in TLA+ is that it’s much easier - Coq maps onto TLA+ much more deterministically than it does onto English. The TLA+ specification then is Raft, and the description in the paper is approximately, hand wavingly Raft.

Possibly dumb question, but how do they know the proof is correct/states the problem correctly?

You never really do. There are a lot of algorithms that are proven correct using one interpretation of the paper or text description, but are broken using another reasonable interpretation of the text. One great value of the statement of Raft in Coq or TLA+ is that it provides an very precise definition of what the protocol really is, which can be very useful to implementers who care about correctness and don’t want to misinterpret the paper.

Proving things like this has a few challenges (1) does the thing I have implemented match what other humans call Raft? (2) do the invariants I have specified (or constraints, or proof goal, etc) match what I am trying to prove (linearizability)? (3) is my proof correct?

Of these, (2) is the easiest, (3) is tractable with peer review, and (1) is pretty hard. The advantage of writing down Raft in TLA+ is that it’s much easier - Coq maps onto TLA+ much more deterministically than it does onto English. The TLA+ specification then

isRaft, and the description in the paper isapproximately, hand wavinglyRaft.Isn’t one point of Coq to replace peer review with computation to ensure (3)?

Not wholly replace, but generalize to peer review of Coq itself. The authors answered on HN explaining this better than I could.