1. 31
  1. 4

    Without TLS, both sides are presumably using a single sendfile syscall for the body of the request, so this is mostly a test of how fast the Linux kernel can transfer to/from the Ethernet interface. Although on the receiving end I guess it’s also writing to disk.

    In trying to optimize networking code over the years I’ve often been frustrated by how much slower it runs than the hardware maximum, sometimes by orders of magnitude. That’s because it’s doing more work on one or both ends: database queries, encoding/parsing data formats, compression/decompression and so on. In some cases it’s worth “wasting” resources keeping a copy of the data in an easy-to-stream format.

    It’s also a cool use case for append-only logs — if you have a file structured that way, it can be served as-is over HTTP as a static file, and clients can use conditional range requests to sync with it extremely cheaply.
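
    A minimal sketch of what that sync could look like on the client side, assuming the server honours `Range` and `If-None-Match` on a static file and that the log is only ever appended to. The helper names `build_sync_headers` and `apply_sync_response` are hypothetical, made up for illustration; the actual HTTP call is left out so the logic stays easy to follow.

    ```python
    from typing import Dict, Optional


    def build_sync_headers(local_size: int, etag: Optional[str]) -> Dict[str, str]:
        """Build headers that ask only for the bytes we do not have yet.

        local_size: number of bytes of the log already stored locally.
        etag: validator from the previous response, if the server sent one.
        """
        # Open-ended byte range: everything from our current end of file onward.
        headers = {"Range": f"bytes={local_size}-"}
        if etag:
            # If the file is unchanged the server can answer 304 with no body.
            headers["If-None-Match"] = etag
        return headers


    def apply_sync_response(local: bytes, status: int, body: bytes) -> bytes:
        """Merge a response into the local copy (append-only log assumed)."""
        if status == 206:
            # Partial Content: the body is the newly appended tail.
            return local + body
        if status in (304, 416):
            # Not Modified, or Range Not Satisfiable: nothing new to fetch.
            return local
        if status == 200:
            # Server ignored the Range header and sent the whole file.
            return body
        raise ValueError(f"unexpected status {status}")
    ```

    A client would send `build_sync_headers(len(local), last_etag)` with a GET, then pass the status and body to `apply_sync_response`. If the log can ever be truncated or rewritten, this sketch is not enough, since a 206 would then be appended to stale data.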

    1. 6

      Caddy is actually not currently using sendfile (see https://github.com/caddyserver/caddy/issues/4731), and still manages to saturate the 25 Gbit/s without trouble :)

      1. 2

        Yep, and it would not explain why the Go client is slower. Maybe the net package in the Go standard library is also doing something suboptimal?

      2. 4

        > Although on the receiving end I guess it’s also writing to disk.

        The tests write to /dev/null, so not really.

      3. 2

        Nice! If I might be so bold, I’d like to suggest that you could try varying which TLS cipher suite gets used. curl has a --ciphers option for example. Configuring the client to only allow one cipher suite should do it. In theory I think the newer ones like AES_GCM might be the fastest?

        1. 2

          curl is using “SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384”, so that should be fastest. I also tried generating an ECDSA key instead of an RSA key, but it made no performance difference.

          1. 4

            I dare you to poke it into a slower mode to see. ;) I would also be super interested to hear about ChaCha+Poly modes (which are apparently new and shiny but not hardware accelerated), CBC (which sucks) and anything with CTR in the name (which I think should be slower than GCM because they have to use a separate MAC, but IDK for certain).

            It does make sense that you don’t see throughput change when you switch between RSA and ECDSA. The asymmetric crypto is only used very briefly at the start of the connection. The two sides negotiate a new randomly generated key for symmetric encryption. Once that’s done, the bulk of the data is encrypted with the symmetric cryptosystem.

            1. 1

              What HTTP version? Can you force it to 1.1?

              1. 1

                Everything in my tests is on HTTP 1.1, yes.

                HTTP 2 interfered with KTLS sendfile in nginx for some reason.