The thing that I really like about dm-verity is the way the layers stack. If you want any confidentiality, you want dm-crypt, which encrypts blocks. An encryption layer usually spits out the cyphertext and the tags. Without the tags, any bit pattern will decrypt to something, just not something sensible. The tags let you detect whether the encrypted block really is something encrypted with the specified key (and IV). The dm-crypt layer stores the cyphertext and makes the tags available to other layers. Then dm-integrity comes along and stores the tags. Now you can detect if an attacker has tampered with a block on the disk (important if they may temporarily have physical access or if your ‘disk’ is a VM disk image). If you aren’t using dm-crypt then you can just use a cryptographic hash function for this layer, but hashing and encrypting run at such similar speeds on modern hardware that it’s unclear that this is a perf win. At this point, you’re still vulnerable to replay attacks. If, for example, an attacker takes a snapshot of your disk early and then later finds a kernel vulnerability, then they can replace the blocks used for the kernel with older ones, each of which still carries a valid tag, and exploit the clone of your image. Finally, dm-verity comes along and builds the Merkle tree over these hashes.
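To make the replay point concrete, here is a toy sketch in Python. It is not how dm-crypt, dm-integrity, or dm-verity are implemented (the real hash tree has its own on-disk format and a much wider fan-out). The names `block_tag` and `merkle_root` are made up for illustration, and an HMAC stands in for the per-block authentication tag. It only shows that a replayed block with its old tag passes the per-block check, while the root of a Merkle tree over the tags no longer matches the trusted value.

```python
import hashlib
import hmac


def block_tag(key, index, data):
    """Hypothetical per-block tag: an HMAC binding the data to its block index."""
    return hmac.new(key, index.to_bytes(8, "little") + data, hashlib.sha256).digest()


def merkle_root(leaves):
    """Hash adjacent pairs upward until a single root remains (binary tree)."""
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # pad odd levels by repeating the last node
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


key = b"k" * 32  # assumption: a fixed demo key, never do this for real

# The attacker snapshots the disk while it still holds the vulnerable kernel.
old_blocks = [b"boot", b"config", b"kernel-v1", b"data"]
old_tags = [block_tag(key, i, b) for i, b in enumerate(old_blocks)]

# Later the kernel block is patched and the trusted root is updated.
new_blocks = [b"boot", b"config", b"kernel-v2", b"data"]
new_tags = [block_tag(key, i, b) for i, b in enumerate(new_blocks)]
trusted_root = merkle_root(new_tags)

# Replay: put the old kernel block back together with its old, valid tag.
replayed_blocks = list(new_blocks)
replayed_tags = list(new_tags)
replayed_blocks[2] = old_blocks[2]
replayed_tags[2] = old_tags[2]

# The per-block check only asks "was this block written with this key?"
tag_check_passes = hmac.compare_digest(
    block_tag(key, 2, replayed_blocks[2]), replayed_tags[2]
)
# The tree check asks "is this the current set of blocks?"
root_check_passes = merkle_root(replayed_tags) == trusted_root

print("per-block tag check passes:", tag_check_passes)
print("Merkle root check passes:", root_check_passes)
```

The per-block tag has no notion of time, so the stale block verifies; only a value that covers every block at once, and is itself stored somewhere the attacker can’t roll back, catches the substitution.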
The thing I don’t like is that the block layer is totally the wrong place to build this kind of abstraction. Apple did it right with APFS sealed volumes. I think ZFS is almost the right shape to do it (but without metadata confidentiality, because adding that would be an incompatible change to the on-disk format). Block-level integrity, though, leaks a phenomenal amount over side channels from the access patterns. This doesn’t matter for things like sealed boot volumes, but it does matter for disk images for confidential VMs (where the VM is encrypted in memory and the hypervisor can’t see it, but the untrusted cloud provider can monitor all I/O).
Extending dm-verity to be read-write would be incredibly hard because a single block update needs to update every layer in the Merkle tree and that must be atomic. You could do it with something like FreeBSD’s gjournal, but that would make the write amplification much worse. You really want a CoW model so it’s hard to correlate writes (the pattern of blocks that are written in a CoW filesystem is hard to map back to the set of files that were modified). If you’re doing that, then you can write the Merkle tree updates in your write-ahead log and periodically update the secure boot state with the latest head value that it should trust. If this is a confidential VM with a key release policy driven by an attestation report, then you can just keep streaming the new root values asynchronously and make sure that you only overwrite entries in your write rings for which you have an ACK back.
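A rough sketch of why a single write touches every layer, again in Python and again a toy: `MerkleTree` and `update_leaf` are hypothetical names, the tree is binary, and the leaf count is assumed to be a power of two. `update_leaf` returns the list of nodes that one block write dirties, which is the set of changes that would have to land atomically (for example as one write-ahead log entry).

```python
import hashlib


def h(data):
    return hashlib.sha256(data).digest()


class MerkleTree:
    """Toy binary Merkle tree kept in memory as a list of levels."""

    def __init__(self, leaves):
        n = len(leaves)
        # Assumption for simplicity: leaf count is a power of two.
        assert n > 0 and n & (n - 1) == 0
        self.levels = [list(leaves)]
        while len(self.levels[-1]) > 1:
            prev = self.levels[-1]
            self.levels.append(
                [h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)]
            )

    @property
    def root(self):
        return self.levels[-1][0]

    def update_leaf(self, index, new_leaf):
        """Apply one leaf write; return the (level, index, hash) nodes it dirties."""
        self.levels[0][index] = new_leaf
        records = [(0, index, new_leaf)]
        for depth in range(1, len(self.levels)):
            index //= 2  # parent position in the next level up
            below = self.levels[depth - 1]
            node = h(below[2 * index] + below[2 * index + 1])
            self.levels[depth][index] = node
            records.append((depth, index, node))
        return records


leaves = [h(bytes([i])) for i in range(8)]
tree = MerkleTree(leaves)
old_root = tree.root

# One block write: leaf 5 changes, and so does every ancestor up to the root.
records = tree.update_leaf(5, h(b"new contents of block 5"))

# Rebuilding from scratch gives the same root as the incremental update.
updated_leaves = list(leaves)
updated_leaves[5] = h(b"new contents of block 5")
rebuilt = MerkleTree(updated_leaves)

print("nodes dirtied by one write:", len(records))
print("root changed:", tree.root != old_root)
```

With eight leaves, one write dirties four nodes (the leaf plus three ancestors). If those nodes are updated in place and the machine dies part-way through, the tree no longer verifies, which is why the updates want to go through a log or a CoW structure and why only the new root needs to reach whatever holds the trusted value.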