A good read but hardly surprising; Oracle has been doing this (bypassing the FS and writing directly to disk) for many years.
Absolutely fascinating. It feels like the divide between wanting things to be random access, i.e. like RAM, or sequential access, i.e. like tape drives or spinny hard disks. In theory they’re interchangeable, but in practice optimizing for one or the other brings huge performance gains. And then you add error correction, metadata, databases, NUMA, latency vs throughput tradeoffs, and all those other things, and the Ceph people are trying to deal with all of them at once. A great peek into a very narrow but deep field.