1. 34
  1. 8

    A big problem with the benchmark method is that there are two very different code generators involved. In general I’ve found Visual C++ to be markedly inferior at producing optimised output when compared to LLVM (or to GCC). A better comparison would use clang++, which shares the LLVM backend with rustc. Right now the comparison is less about the language and more about the backend/optimiser framework.

    Edit: Blergh, I should’ve read to the end… I suspect using the (not-really-standard-C++) __restrict qualifier appropriately might make the difference, i.e. perhaps Rust can assume that the references don’t alias?

    1. 7

      I suspect using the (not-really-standard-C++) __restrict qualifier appropriately might make the difference, i.e. perhaps Rust can assume that the references don’t alias?

      In theory, yes.

      In practice, no-aliasing optimizations in Rust are currently disabled because of LLVM bugs that were hit when emitting IR that uses noalias: https://github.com/rust-lang/rust/issues/54878. As it turns out, when your language design enables widespread use of an optimization that other languages on the same backend can’t easily take advantage of, you get to find all the bugs they didn’t.
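
      To make the aliasing point concrete, here is a minimal sketch in Rust. The function names are hypothetical and this is not code from the benchmark. It only shows what the type system guarantees; whether the optimizer actually exploits the guarantee depends on the compiler version and on whether noalias is being emitted.

      ```rust
      // `a` is an exclusive reference, so it cannot refer to the same memory as `b`.
      // The compiler is therefore *allowed* to load `*b` once and reuse it
      // after the first write to `*a`.
      pub fn add_twice(a: &mut i32, b: &i32) {
          *a += *b;
          *a += *b;
      }

      // Raw pointers carry no such guarantee: `a` and `b` may point to the same
      // location, so `*b` has to be reloaded after the first write.
      //
      // Safety: both pointers must be valid for reads, and `a` for writes.
      pub unsafe fn add_twice_raw(a: *mut i32, b: *const i32) {
          unsafe {
              *a += *b;
              *a += *b;
          }
      }
      ```

      Calling add_twice_raw with both pointers aimed at the same integer is allowed and gives a different result than the non-aliased case, which is exactly why the compiler cannot cache the load there.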

    2. 7

      As pointed out in the matching thread on an orange site, Rust does not have a defined calling convention, so the compiler can pass things however the heck it thinks is best. You should do it the way that expresses what you mean. If the compiler is derpy about choosing the best representation for that, that’s an implementation problem.

      1. 4

        One other benefit to sending in copies instead of references is that you can avoid having to push the struct on the stack, and simply pass it in registers. C++ compilers are usually pretty good at this (I haven’t tried Rust). This means that even without inlining (for example, across two different translation units), you could still see some performance gains.

        1. 3

          Yeah. Honestly, since structs passed in registers avoid the memory hit (granting that it’ll likely be in cache), passing by value can be significantly faster than passing by reference, even for structs somewhat larger than a mere 24 bytes.
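
          For reference, a rough sketch of the two signatures being compared. Vec3 is a hypothetical struct picked only because it comes out at 24 bytes; it is not taken from the benchmark. Since Rust’s own calling convention is unspecified, nothing here guarantees that the by-value version ends up in registers. The code just expresses the two intents.

          ```rust
          // Three f64 fields: 3 * 8 = 24 bytes.
          #[derive(Clone, Copy, Debug, PartialEq)]
          pub struct Vec3 {
              pub x: f64,
              pub y: f64,
              pub z: f64,
          }

          // By copy: the callee gets its own values; the compiler is free to
          // pass them in registers if it decides that is best.
          pub fn dot_by_value(a: Vec3, b: Vec3) -> f64 {
              a.x * b.x + a.y * b.y + a.z * b.z
          }

          // By borrow: the callee gets addresses and, unless the call is
          // inlined, reads the fields through them.
          pub fn dot_by_ref(a: &Vec3, b: &Vec3) -> f64 {
              a.x * b.x + a.y * b.y + a.z * b.z
          }
          ```

          Both functions compute the same result; any difference would only show up in the generated code, so comparing the assembly of the two is the way to see what a given compiler actually does.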