1. 15

  2. 5

    None of the utilities uses O_DIRECT; if they did, there would be a performance improvement of a few percent, he said. Larger I/O sizes would also improve things.
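
    To make the I/O size point concrete, here is a minimal sketch of a plain read/write copy loop where the block size is a parameter. The function name `copy_with_block_size` and the 1 MiB default are assumptions for illustration, not what any of the utilities actually do, and the sketch deliberately does not use O_DIRECT (which would also need aligned buffers).

    ```python
    def copy_with_block_size(src, dst, block_size=1024 * 1024):
        """Copy src to dst with unbuffered reads/writes of block_size bytes.

        Hypothetical helper: block_size is the knob the quoted remark is about.
        Returns the number of bytes copied.
        """
        copied = 0
        # buffering=0 gives raw file objects, so each read() maps to one request
        # of at most block_size bytes instead of going through Python's buffer.
        with open(src, "rb", buffering=0) as fin, open(dst, "wb", buffering=0) as fout:
            while True:
                chunk = fin.read(block_size)
                if not chunk:
                    break  # end of file
                view = memoryview(chunk)
                while view:
                    # A raw write may be short; keep going until the chunk is out.
                    written = fout.write(view)
                    view = view[written:]
                copied += len(chunk)
        return copied
    ```

    Timing this with, say, 4 KiB versus 1 MiB blocks on a large file is a quick way to see how much the I/O size matters on a given device.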

    I remember that happening in Windows. It was a mistake: https://web.archive.org/web/20160308075819/https://blogs.technet.microsoft.com/markrussinovich/2008/02/04/inside-vista-sp1-file-copy-improvements/

    There is no API for user space to call that knows how to copy everything about a file; Windows and macOS have that, though it is not the default.

    I’m not sure what “default” means here; Windows has CopyFile, although applications are free to use read and write if they want. There’s no such thing as a default when writing new code.
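
    As an illustration of "data only" versus "more about the file", here is a small sketch using Python's standard `shutil` module rather than CopyFile itself. The wrapper names are made up for the example. Note that the Python documentation warns that even `shutil.copy2()` cannot copy all file metadata, which is roughly the gap the article is pointing at.

    ```python
    import os
    import shutil


    def copy_data_only(src, dst):
        """Copy just the file contents; dst gets fresh timestamps and mode."""
        shutil.copyfile(src, dst)


    def copy_with_metadata(src, dst):
        """Copy contents and try to preserve metadata such as mtime and mode.

        This is best effort: owner, ACLs, and similar attributes may still be lost.
        """
        shutil.copy2(src, dst)


    def mtime_seconds(path):
        """Whole-second modification time, to keep comparisons simple."""
        return int(os.stat(path).st_mtime)
    ```

    Setting an old timestamp on the source with `os.utime` and comparing `mtime_seconds` on both copies shows the difference: only the `copy2` copy keeps the original modification time.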

    I think right now there are three issues in copy performance that are in tension:

    1. Some devices, like single hard drives, really want sequential IO, which implies doing things serially. On a single device, they want a pile of sequential reads alternating with sequential writes, with as few switches as possible; on multiple devices, a sequential read with a concurrent sequential write works best.
    2. Other things, particularly networks, benefit from pipelining and want multiple reads and writes outstanding at a time. This is particularly true if the source and destination are on different devices.
    3. File systems each have their own metadata, and depending on the specific file system may degrade when presented with random writes.

    That also means there’s a tension between (2) and (3), where the physical path to the file system benefits from one thing but the file system endpoint benefits from something else, and the “optimal” solution ends up being more conservative than naively looking at IO sizes might predict.
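
    A rough sketch of that tension, assuming a Unix-like system where `os.pread` and `os.pwrite` are available. The names `move_block` and `copy_pipelined` and the `depth` parameter are hypothetical. With `depth=1` the copy is strictly serial, as in (1); with a larger `depth`, several blocks are in flight at once, as in (2), but the destination then sees writes arriving out of order, which is what (3) is worried about.

    ```python
    import os
    from concurrent.futures import ThreadPoolExecutor


    def move_block(src_fd, dst_fd, offset, length):
        """Copy one block at a fixed offset, retrying short reads and writes."""
        done = 0
        while done < length:
            data = os.pread(src_fd, length - done, offset + done)
            if not data:
                break  # unexpected end of file
            written = 0
            while written < len(data):
                written += os.pwrite(dst_fd, data[written:], offset + done + written)
            done += len(data)
        return done


    def copy_pipelined(src, dst, block=1 << 20, depth=4):
        """Copy src to dst with up to `depth` blocks outstanding at a time.

        Returns the number of bytes copied. depth=1 degenerates to a serial copy.
        """
        src_fd = os.open(src, os.O_RDONLY)
        try:
            dst_fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
            try:
                size = os.fstat(src_fd).st_size

                def job(offset):
                    return move_block(src_fd, dst_fd, offset, min(block, size - offset))

                # The pool size caps how many blocks are in flight concurrently.
                with ThreadPoolExecutor(max_workers=depth) as pool:
                    return sum(pool.map(job, range(0, size, block)))
            finally:
                os.close(dst_fd)
        finally:
            os.close(src_fd)
    ```

    Which `depth` and `block` win depends on the devices and file systems involved, which is the point: there is no single setting that is best everywhere.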

    Things like RAID or SSDs are a little strange in that they benefit from parallelism but, depending on configuration, can behave very differently. A mirror might want large concurrent reads, but a stripe might not. Parity wants a full stripe write, which can be a very unnatural size since it’s a function of the number of disks, not a power of 2. Etc.
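
    The full stripe arithmetic is easy to check. This is a small sketch where the function names and the 64 KiB per-disk chunk size are assumptions chosen for the example, not values from any particular array.

    ```python
    def full_stripe_bytes(total_disks, parity_disks, chunk_bytes):
        """Size of one full stripe of data: data disks times per-disk chunk size."""
        data_disks = total_disks - parity_disks
        if data_disks < 1:
            raise ValueError("need at least one data disk")
        return data_disks * chunk_bytes


    def is_power_of_two(n):
        """True when n is a positive power of two."""
        return n > 0 and (n & (n - 1)) == 0


    KIB = 1024

    # 4 disks with single parity: 3 data disks * 64 KiB = 192 KiB
    four_disk = full_stripe_bytes(4, 1, 64 * KIB)   # 196608, not a power of two

    # 5 disks with single parity: 4 data disks * 64 KiB = 256 KiB
    five_disk = full_stripe_bytes(5, 1, 64 * KIB)   # 262144, a power of two
    ```

    So a copy tool that always issues power-of-two writes lines up with the five-disk layout but never with the four-disk one, which is why a fixed "large" IO size is not automatically the right one.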

    1. 1

      It’s mentioned in the comments, but if you have a “copy problem”, you should have a look at xcp. I will let the documentation speak for itself.