    RE: page vs. log: this distinction is getting blurred by newer SSD-focused techniques such as the ones used in the Bw-tree. The idea is that random reads are fast on SSDs, so we can scatter-gather reads across log-structured storage to reconstruct a logical page from its fragments. This also means we don’t have to rewrite an entire page for a tiny write.
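
    To make the scatter-gather idea concrete, here is a minimal sketch in Python. It is a toy in-memory model, not the Bw-tree's actual design: `LogStore`, `write`, and `read_page` are hypothetical names, and a Python list stands in for the on-disk log.

```python
from dataclasses import dataclass, field


@dataclass
class LogStore:
    """Toy log-structured store (hypothetical, in-memory only)."""
    # Append-only log of (page_id, key, value) fragments.
    log: list = field(default_factory=list)
    # page_id -> offsets in the log holding that page's fragments.
    index: dict = field(default_factory=dict)

    def write(self, page_id, key, value):
        # A tiny write appends one small fragment instead of rewriting the page.
        self.log.append((page_id, key, value))
        self.index.setdefault(page_id, []).append(len(self.log) - 1)

    def read_page(self, page_id):
        # Scatter-gather: fetch each fragment (a random read on a real device)
        # and replay them oldest to newest to rebuild the logical page.
        page = {}
        for offset in self.index.get(page_id, []):
            _, key, value = self.log[offset]
            page[key] = value
        return page


store = LogStore()
store.write(1, "a", 1)
store.write(2, "x", 9)   # fragment for a different page, interleaved in the log
store.write(1, "a", 2)   # later fragment overrides the earlier value
store.write(1, "b", 3)
page_one = store.read_page(1)
```

    A real system would also consolidate long fragment chains into a fresh base page from time to time, so that reads do not grow without bound.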

      Don’t most SSDs require rewriting data at sector granularity anyway (which is what a page often turns out to be)? A COW filesystem like ZFS ends up writing a full record (the configurable recordsize) plus metadata for a small change anyway. I don’t know what this means for the future of drives like Optane.

        Yes, but in systems like the Bw-tree, that entire sector can be filled with new updates for different pages.
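
        A rough sketch of that batching, in Python. Everything here is an assumption for illustration: the 4096-byte sector, the record header layout, and the names `FlushBuffer` and `pages_in_sector` are made up, and this is not the Bw-tree's real on-disk format.

```python
import struct

SECTOR_SIZE = 4096               # assumed sector size in bytes
HEADER = struct.Struct("<IH")    # hypothetical header: page_id (uint32), payload length (uint16)


class FlushBuffer:
    """Packs small updates for many pages into sector-sized writes."""

    def __init__(self):
        self.buf = bytearray()
        self.sectors = []        # stands in for the device

    def add(self, page_id, payload):
        if not payload:
            raise ValueError("empty payloads are not supported (length 0 marks padding)")
        record = HEADER.pack(page_id, len(payload)) + payload
        if len(record) > SECTOR_SIZE:
            raise ValueError("record larger than a sector")
        # Start a new sector if this record would not fit in the current one.
        if len(self.buf) + len(record) > SECTOR_SIZE:
            self.flush()
        self.buf += record

    def flush(self):
        # Write one full sector, zero-padded, no matter how many pages it touches.
        if self.buf:
            self.sectors.append(bytes(self.buf.ljust(SECTOR_SIZE, b"\0")))
            self.buf = bytearray()


def pages_in_sector(sector):
    """Return the page ids of the records stored in one sector."""
    ids, pos = [], 0
    while pos + HEADER.size <= len(sector):
        page_id, length = HEADER.unpack_from(sector, pos)
        if length == 0:          # reached the zero padding
            break
        ids.append(page_id)
        pos += HEADER.size + length
    return ids


fb = FlushBuffer()
fb.add(1, b"a=2")
fb.add(7, b"x=9")
fb.add(3, b"k=v")
fb.flush()
touched = pages_in_sector(fb.sectors[0])
```

        The point is that the device still sees one sector-sized write, but that write carries updates for three different logical pages instead of a full rewrite of one.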

      Fun fact: log-structured file systems were originally designed to improve write performance on hard drives. They later turned out to be a great fit for flash-based media too, because their garbage collection always frees and reuses large blocks at once.