1. 45

  2. 3

    The user-perceptible performance improvement when using F2FS is interesting. I have to wonder if other filesystems could do the same thing: it seems to me that many filesystems already contain some built-in transaction mechanism in order to avoid corruption of filesystem metadata.

    I wonder how often you can implement barriers (as opposed to flushes) wholly in userspace, by writing down SHA256 hashes of previously-written data?
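To make the question concrete, here is a minimal sketch of one way it could work, assuming an append-only log in which every record carries the SHA-256 of its own payload. Instead of forcing the payload to reach storage before a commit marker, recovery simply stops at the first record whose hash does not match. The record layout and the names `append_record` and `recover` are made up for illustration, and a `bytearray` stands in for the file; whether this is the scheme I was vaguely imagining is itself an assumption.

```python
import hashlib
import struct

# Hypothetical record layout: 4-byte payload length, 32-byte SHA-256, payload.
HEADER = struct.Struct("<I32s")


def append_record(log: bytearray, payload: bytes) -> None:
    """Append a self-validating record; no ordering barrier is needed
    between the header and the payload, because recovery checks the hash."""
    digest = hashlib.sha256(payload).digest()
    log += HEADER.pack(len(payload), digest) + payload


def recover(log: bytes) -> list:
    """Return the longest prefix of records whose hashes check out.

    A torn or reordered write shows up as a short payload or a hash
    mismatch, so everything from that point on is treated as uncommitted.
    """
    records = []
    offset = 0
    while offset + HEADER.size <= len(log):
        length, digest = HEADER.unpack_from(log, offset)
        start = offset + HEADER.size
        payload = log[start:start + length]
        if len(payload) != length:
            break  # payload was cut short by a crash
        if hashlib.sha256(payload).digest() != digest:
            break  # payload bytes never made it to storage intact
        records.append(bytes(payload))
        offset = start + length
    return records
```

This only replaces barriers whose job is "do not trust B unless A is fully written". It says nothing about durability: you still need a real flush before telling anyone the data is safe.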

    1. 1

      The last point is very interesting. This is the filesystem equivalent of MADV_FREE, which is very useful for malloc implementations. It marks a memory range as containing data that the application no longer cares about, so the kernel may either preserve the contents or replace them with zeroes. The kernel just clears the dirty bit on the relevant pages when it’s set and can then remove those pages from the process and replace them with CoW copies of a zero page when it encounters memory pressure. I wonder if the two could be combined so that MADV_FREE on an mmap’d file would lazily punch a hole in the file and allow the OS to provide zero pages for reads and to reuse the space if it needs to (or at least avoid read-modify-writes for writes of less than a page). This would be a useful facility in general: currently MADV_FREE doesn’t work on shared memory segments / memory-mapped files, which is unfortunate when you want to have a shared heap.
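For anyone who hasn't played with MADV_FREE, a small sketch of the "preserved or zeroed" contract, assuming Python 3.8+ on a platform where `mmap.MADV_FREE` is defined (it is looked up with `getattr` and the call is skipped otherwise). The function name `free_lazily` is made up for illustration. Note the mapping has to be private and anonymous, which is exactly the restriction I'm complaining about.

```python
import mmap


def free_lazily(length: int = mmap.PAGESIZE) -> int:
    """Fill a private anonymous mapping, mark it MADV_FREE, and return
    the first byte as it reads afterwards (either 0xAA or 0)."""
    # MAP_PRIVATE matters: Python's default for mmap is MAP_SHARED, and
    # Linux rejects MADV_FREE on shared mappings.
    buf = mmap.mmap(-1, length, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    try:
        buf[:length] = b"\xAA" * length

        advice = getattr(mmap, "MADV_FREE", None)  # absent on some platforms
        if advice is not None and hasattr(buf, "madvise"):
            try:
                buf.madvise(advice)
            except OSError:
                pass  # e.g. a kernel without MADV_FREE support

        # The kernel may hand back the old contents or a zero page,
        # depending on whether it reclaimed the page in the meantime.
        return buf[0]
    finally:
        buf.close()
```

Without memory pressure you will almost always see the old byte come back; the point is that the application must be prepared for either answer until it writes to the page again.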

      I think my main take-home from the rest of this article is that POSIX filesystem semantics are not actually very useful for modern workloads.