Apparently we’re not supposed to TL;DR in story text, so here’s a comment:
Signal is using Intel SGX to allow secure backups in case you lose your device. Password-based encryption works, but offline dictionary attacks are a problem. So instead: password authentication into an enclave, with a limited number of attempts. However, you want to replicate the enclave without allowing parallel password guesses. So they build a consensus protocol out of SGX attestation operations.
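A minimal sketch of the guess-limiting idea, to make the threat model concrete. Everything here (the class, `MAX_ATTEMPTS`, the choice of scrypt) is hypothetical and illustrative, not Signal's actual design; the point is just that the secret is only released through a rate-limited check that lives inside the enclave:

```python
import hashlib
import hmac
import secrets

MAX_ATTEMPTS = 5  # assumed per-user guess budget


class RecoveryEnclave:
    """Models enclave-held state: a stretched password tag, the protected
    secret, and an attempt counter that outsiders cannot reset."""

    def __init__(self, password: str, secret: bytes):
        self._salt = secrets.token_bytes(16)
        self._tag = self._stretch(password)
        self._secret = secret
        self._attempts_left = MAX_ATTEMPTS

    def _stretch(self, password: str) -> bytes:
        # A memory-hard KDF; scrypt as a stand-in for whatever is used.
        return hashlib.scrypt(password.encode(), salt=self._salt,
                              n=2**14, r=8, p=1)

    def recover(self, password: str):
        """Return the secret on a correct guess, None on a wrong one,
        and refuse entirely once the budget is exhausted."""
        if self._attempts_left == 0:
            raise PermissionError("attempt budget exhausted")
        self._attempts_left -= 1
        if hmac.compare_digest(self._stretch(password), self._tag):
            self._attempts_left = MAX_ATTEMPTS  # reset on success
            return self._secret
        return None
```

The whole scheme hinges on the attempt counter being tamper-proof, which is exactly the property the enclave (and its replication protocol) is supposed to provide.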
This is really cool, well-developed applied crypto research. Only major concern is how much it relies on SGX, which has been broken seven ways to Sunday.
I am surprised by how much the folks at Signal trust SGX. Frankly I don’t understand it. I don’t believe that it’s been maliciously compromised — I just don’t believe that SGX is bug-free in both design and implementation.
Likewise, this secure value recovery proposal sounds neat and great, but also really complex. That complexity means lots of opportunity to fail. One of those clear failure points is SGX: if that fails then I believe the entire system fails completely.
A comment I read a few years ago stated that Moxie (at least) had absolutely no trust in the government but had no issue with large companies, which makes him a fairly typical American in this regard. I don’t like classifying people that way and I didn’t expect such a thing, but it matches reality pretty well: trying to protect against government spying while trusting Intel, Intel SGX (*), and cloud providers (Amazon, Google, Microsoft, IIRC), …
(*) Trusting Intel SGX and publishing an article only a few days after https://pludervolt.com was announced is so unrealistic that it’s actually almost funny.
Non-typo’d version of that link: https://plundervolt.com/
It’s funnier when you consider that Intel operates in a police state that can secretly compel backdoors and targeted surveillance, and they’re among the companies most likely to be cooperating with that. That said, those secret capabilities are currently used against fewer targets than the broader, more public enforcement goes after.
That’s what I told Moxie. He also didn’t seem to know why there’s a preference for tamper-resistant HSMs in these use cases.
IIRC he also strongly trusted Google, choosing to support only GCM and nothing else. It took a huge amount of input from users, and no fewer than two forks, for him to finally support message delivery mechanisms that don’t rely on Google.
(I posted this to the HN thread but was late and missed the window of getting insight, so sorry for the x-post from HN)
I’m very puzzled by the consensus-group load-balancing section. The article emphasizes that correctness of the Raft implementation was super important (to the point that they skipped obvious optimizations!), but then immediately follows up with (as far as I can tell) a load-balancer wrapper approach for rebalancing and scaling. My “this feels like consensus bug city” detector immediately went off.
Consensus algorithms (including Raft and Paxos) are notoriously picky and hard to get right around cluster membership changes. If you try to do an end run around this by sharding to different clusters, with a simple traffic director choosing the cluster, how does the traffic director reach consensus with the clusters that the traffic is going to the right one? You haven’t solved the consensus problem, you’ve just moved it into your load balancers.
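To illustrate why this matters for this particular system (all names here are invented): if two traffic directors hold different views of shard ownership during a rebalance, the same user’s guess budget gets enforced independently by two clusters, doubling the number of password attempts an attacker can make:

```python
class Cluster:
    """Stand-in for one consensus cluster enforcing a per-user guess limit."""

    def __init__(self, name):
        self.name = name
        self.attempts = {}  # user -> guesses consumed at THIS cluster

    def try_guess(self, user):
        self.attempts[user] = self.attempts.get(user, 0) + 1
        return self.attempts[user]


a, b = Cluster("A"), Cluster("B")

# Two router instances with inconsistent shard maps: one hasn't
# observed the rebalance of "alice" from cluster A to cluster B yet.
stale_router = {"alice": a}
fresh_router = {"alice": b}

# An attacker sends 5 guesses through each router.
for _ in range(5):
    stale_router["alice"].try_guess("alice")
    fresh_router["alice"].try_guess("alice")

# Each cluster believes "alice" has only used 5 attempts, but the
# attacker actually got 10 guesses total.
total = a.attempts["alice"] + b.attempts["alice"]
print(total)  # 10
```

That’s why the routing layer’s view of ownership has to be agreed upon with the same rigor as the clusters’ own state, not bolted on beside it.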
A solution to this problem (agreeing on which cluster owns the data) is two-phase commit on top of the consensus clusters. It didn’t appear from the diagrams that that’s what they did here, so either I missed something, or this wouldn’t pass a Jepsen test.
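A minimal sketch of what that 2PC-over-consensus shape could look like, under heavy assumptions: each `RaftGroup` stands in for a whole consensus cluster that durably replicates its vote through its own log before answering, and the coordinator and method names are all invented for illustration:

```python
class RaftGroup:
    """Stand-in for a consensus cluster participating in a shard transfer.
    A real group would replicate prepare/commit decisions via its log."""

    def __init__(self, name):
        self.name = name
        self.owned = set()   # shards this cluster currently owns
        self.prepared = None  # in-flight transfer, if any

    def prepare_transfer(self, shard, src, dst):
        if self.prepared is not None:
            return False  # refuse concurrent transfers
        self.prepared = (shard, src, dst)
        return True

    def commit_transfer(self):
        shard, src, dst = self.prepared
        if self.name == src:
            self.owned.discard(shard)
        if self.name == dst:
            self.owned.add(shard)
        self.prepared = None

    def abort_transfer(self):
        self.prepared = None


def transfer(shard, src, dst):
    """2PC coordinator: both groups must prepare before either commits,
    so ownership changes atomically or not at all."""
    groups = (src, dst)
    if all(g.prepare_transfer(shard, src.name, dst.name) for g in groups):
        for g in groups:
            g.commit_transfer()
        return True
    for g in groups:
        g.abort_transfer()
    return False
```

Because each participant is itself a consensus group, the classic 2PC weakness (a coordinator dying with participants stuck in “prepared”) is softened: the prepared state is replicated and a recovery coordinator can resolve it. Which is roughly the Spanner observation below.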
Did I miss something?
(If you did build 2PC on top of these consensus clusters, you’d have built a significant portion of Spanner’s architecture inside of a secure enclave. That’s hilarious.)