1. 14

  2. 6

    Filesystem error handling seems to have improved. Reporting an error on a pwrite if the block device reports an error is perhaps the most basic error propagation a robust filesystem should do; few filesystems reported that error correctly in 2005. Today, most filesystems will correctly report an error in the simplest possible error condition that doesn’t involve the entire drive being dead.

    Remember, this is considered an Improvement with capital I.
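    For a sense of what that baseline looks like from the application side, here is a minimal sketch, assuming a POSIX system where Python's `os.pwrite` is available. The helper name `checked_pwrite` is made up for illustration. It only surfaces errors the kernel actually reports, which is exactly the part that depends on the filesystem propagating them.

    ```python
    import os
    import tempfile


    def checked_pwrite(fd, data, offset):
        """Write all of `data` at `offset`, raising OSError on any reported failure.

        Hypothetical helper: it can only report what the kernel tells it.
        """
        view = memoryview(data)
        written = 0
        while written < len(view):
            # os.pwrite raises OSError (for example errno EIO) when the
            # kernel reports a failure; short writes are retried here.
            n = os.pwrite(fd, view[written:], offset + written)
            if n == 0:
                raise OSError("pwrite made no progress")
            written += n
        # Errors for buffered writes may only show up at fsync time,
        # so that call has to be checked too (it also raises OSError).
        os.fsync(fd)
        return written


    if __name__ == "__main__":
        with tempfile.TemporaryFile() as f:
            checked_pwrite(f.fileno(), b"hello", 0)
    ```

    If the filesystem swallows the block device error, neither call raises and the application has no way to notice.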

    Tbf, the engineers behind all these filesystems probably spent hours of work to make sure that the situation improves as much as possible without breaking everything in the process…

    Relatedly, it appears that APFS doesn’t checksum data because “[apfs] engineers contend that Apple devices basically don’t return bogus data”.

    I retract my previous statement.
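    For anyone wondering what data checksumming buys you, a rough sketch of the idea, assuming a CRC32 stored next to each block. The names `seal` and `unseal` are hypothetical, and real filesystems use their own on-disk formats and often stronger checksums.

    ```python
    import zlib


    def seal(block: bytes) -> bytes:
        """Prefix a block with its CRC32 (4 bytes, big-endian)."""
        return zlib.crc32(block).to_bytes(4, "big") + block


    def unseal(stored: bytes) -> bytes:
        """Return the block, or raise ValueError if it no longer matches its CRC32."""
        expected = int.from_bytes(stored[:4], "big")
        block = stored[4:]
        if zlib.crc32(block) != expected:
            # The device returned data that differs from what was written.
            raise ValueError("checksum mismatch")
        return block
    ```

    Without something like this, bogus data from the device is handed to the application as if it were fine.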

    The initial table at the top is very encouraging. I do hope Btrfs improves (or Bcachefs somehow gets mainlined and does its job perfectly) and that ext4 gets more robust with data checksumming. Probably things I’ll put on the list “Things I want in 2019”.

    1. 2

      “Appendix: why wasn’t this done earlier?”

      This section, and similar writeups, deserve a lot of thought about how to fix the problems they describe. These are huge problems of politics and focus at both funding authorities and universities. They’re also a huge source of missed opportunities for innovation and of wasted productivity, so they’re worth plenty of effort at creative solutions. The article notes one successful method: self-sacrifice on the academics’ part to produce something they can turn into a business. Another is tech transfer of actual I.P. at universities that allow it. I’m not sure whether they take royalties, or what happens when it’s not a well-established business.

      Anyway, we need more solutions for those problems so that useful work such as Jepsen becomes common instead of the rare counterexample to how CompSci academia works.