This is really an all-star team: Kaashoek, Chlipala, and Kohler, with Robert Morris and Butler Lampson in the thank-yous. They are all known to be serious wizards, but I’m skeptical that you can “formally describe the relationships between the behaviors of these different components [disks, bits, etc.] under crash conditions”, since that depends on what all the hardware components in the disk do as the electricity is going away, which is a thing that’s awfully hard to analyze, even if the design of the hardware components weren’t a Seagate trade secret.
On the other hand, if you cryptographically sign consistent Merkle-DAG snapshots of the filesystem using keys that aren’t stored on the disk, you can probably be sure that even malicious disk firmware can’t corrupt your data; it can only lose it. Probably if you want to be sure that no single disk failure can lose your data either, you need to write it to multiple disks. (Once you’re up to three disks, you could use N-of-M secret-sharing to store the keys on the disks without giving any single disk enough information to sign a fake snapshot.)
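To make the tamper-evidence idea concrete, here's a minimal sketch (all names are mine, and HMAC stands in for what would really be an asymmetric signature, so that the disk could verify a snapshot but never forge one):

```python
import hashlib
import hmac

def merkle_root(blocks):
    """Hash each block, then combine hashes pairwise up to a single root."""
    level = [hashlib.sha256(b).digest() for b in blocks]
    while len(level) > 1:
        if len(level) % 2:                      # odd count: duplicate the last hash
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]

def sign_snapshot(key, blocks):
    # The key lives off the disk, so firmware can't produce a valid tag
    # for altered blocks; it can only refuse to return them (i.e. lose data).
    return hmac.new(key, merkle_root(blocks), hashlib.sha256).digest()

def verify_snapshot(key, blocks, tag):
    expected = hmac.new(key, merkle_root(blocks), hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)
```

Any single flipped bit anywhere in the snapshot changes the root, so verification fails, which is exactly the "corrupt → detected, lose → possible" distinction above.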
I’m about 50% sure that this comment is going to sound really dumb when I get to actually read the paper, because any of the five people I mentioned at the top of this comment is surely capable of thinking all of it up. This suggests that I’m missing their point.
Sure would be nice to have a #formalmethods tag to add to articles like this. Maybe #Coq too.
I would adore a formal methods tag, although I wonder whether it would have enough posts to be worthwhile. But perhaps it would attract some over time. :)
Anyway, yeah, I look forward to actually reading it, for that same reason. :)
I think we already have more than one formal-methods post per week, which is pretty stunning, really — we’re living in the future!
I don’t think the paper is online yet, but this seems to be the abstract.
Edit: A previous paper is online that outlines the general approach. That one is more of a position/discussion paper arguing: verifying crash safety for storage systems is important and possible, and here is an analysis of how to do it.
To me the biggest scariest thing is device failure, not software failure.
However: verified kernels, drivers, filesystems… these are awesome things! Abstractions are only truly freeing when you don’t have anything to worry about.
Right, but this is (based on mjn’s link to the abstract) not just verifying that the drivers don’t have software crashes - it’s verifying that the storage protocol they implement can’t lead to data inconsistency, even in the presence of hardware failure. That’s exciting because, while systems claiming that level of reliability exist, they’ve never been formally verified. It’s also fascinating for fans of formal verification because they must have some new proof-abstractions to encode what interesting hardware failures mean to the software, and to the state of the system.
How can software guarantee that a hardware failure doesn’t erroneously scribble all over a filesystem?
From their previous paper, it sounds like the specific kind of crash-proofing that they’re trying to verify is the kind that journaled filesystems claim, i.e. that the filesystem will retain integrity in the face of arbitrary crashes/restarts (including crashes during recovery from previous crashes). So they aim at ensuring that the transaction/recovery logic is sound and doesn’t have data-loss edge cases. Dealing with arbitrary hardware failure doesn’t seem to be in scope, but yeah, would hose any of these guarantees.
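For anyone who hasn’t seen the journaling trick, here’s a toy redo-log sketch (everything here is made up for illustration; real filesystems do this with on-disk commit records, checksums, and careful write barriers). The key property is that replay is idempotent, which is why a crash *during recovery* is also safe:

```python
class JournaledDisk:
    """Toy redo-log journal: writes go to the journal first; only after the
    commit record is durable do they reach their home blocks."""

    def __init__(self, nblocks):
        self.blocks = [b"\x00"] * nblocks   # "home" locations
        self.journal = []                   # pending (addr, data) records
        self.committed = False              # stands in for an on-disk commit block

    def log_write(self, addr, data):
        self.journal.append((addr, data))

    def _apply(self):
        for addr, data in self.journal:
            self.blocks[addr] = data

    def commit(self):
        self.committed = True               # would be one atomic sector write
        self._apply()
        self.journal, self.committed = [], False   # checkpoint: clear the log

    def recover(self):
        # After a crash: replay only if the commit record made it to disk.
        # Replaying writes the same bytes to the same addresses, so running
        # recovery twice (crash during recovery) changes nothing.
        if self.committed:
            self._apply()
        self.journal, self.committed = [], False
```

A crash before the commit record lands simply discards the half-done transaction; a crash after it lands replays it in full. The hard part the paper tackles is *proving* there are no edge cases between those two outcomes.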
I’m hopeful that this can be extended to distributed filesystems by future work. :) It’s a really interesting step forward, regardless.
It can’t, but it can guarantee that there’s another copy somewhere, subject to assumptions about number of simultaneous failures.
Also, if the failure is a simple power loss, modern hardware actually behaves halfway decently, if instructed properly. It makes sure the drive head is never out of the locked position unless the discs are spinning, and if a failure happens, the mechanical energy from the spinning discs themselves is used to get the head to safety. The failure mode where the drive head floats free and scratches everything badly, possibly while still magnetized, doesn’t happen anymore. Using that properly mostly means making sure writes happen in an order such that the filesystem can be reconstructed if one write fails partway through and none of the later queued writes ever happen.
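That ordering discipline can be sketched as a hypothetical two-barrier commit protocol (not any particular filesystem’s; the record format here is invented): make the data durable first, then make a checksummed commit record durable, so a torn write is detectable on recovery:

```python
import os
import struct
import zlib

def ordered_commit(path, payload):
    """Write data, flush, then write a checksummed commit record, flush again.
    If power dies between the two fsyncs, recovery finds no valid record
    and treats the write as if it never happened."""
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
    try:
        os.write(fd, payload)
        os.fsync(fd)                          # barrier 1: payload is durable
        record = struct.pack("<II", len(payload), zlib.crc32(payload))
        os.write(fd, record)
        os.fsync(fd)                          # barrier 2: commit record is durable
    finally:
        os.close(fd)

def recover(path):
    """Return the payload only if a matching commit record is present."""
    with open(path, "rb") as f:
        data = f.read()
    if len(data) < 8:
        return None
    length, crc = struct.unpack("<II", data[-8:])
    payload = data[:-8]
    if len(payload) == length and zlib.crc32(payload) == crc:
        return payload
    return None                               # torn write: pretend it never happened
```

Without the first fsync, the drive is free to reorder the two writes, and a crash could leave a valid-looking commit record pointing at garbage data, which is exactly the reordering hazard described above.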
In the event of a surge that overwhelms protections, those bets are off, of course, since arbitrary pieces of circuitry may suddenly start doing something other than what they were designed to.
The point is that it’s nice to be able to reason about what happens in each of these cases, and focus prevention efforts accordingly. And that’s a lot easier when you know that the behavior is really what you think it’s going to be.
Here’s a high-level overview and the PDF.