1. 19

  2. 3

    Interesting, I have some questions:

    how does large N affect compile time? Do more complicated expressions get evaluated in the same way? Is this good for real programs, or is it a form of micro benchmark gaming via special cases in the front end?

    1. 4

      The particular series the author of the blog post is summing is an example of an arithmetic series. These series have a closed form solution that can be calculated in constant time, so presumably the compiler authors have added a way to decompose for loops into the formula for the sum of an arithmetic series, allowing them to inline for loops with a constant-time calculation at compile time.

      1. 2

        Do we know what form this optimization takes? For instance, is it an AST transformation, or just a boring old C macro?

        1. 2

          Well, we’d have to check LLVM for an answer to that. This optimization isn’t so much a Rust optimization as it is an LLVM one. The Rust magic is that the iterator code becomes LLVM IR that triggers the closed-form optimization. The closed-form optimization itself is all LLVM.

          1. 2

            I haven’t looked into this myself, but it’s claimed that rustc’s frontend is the one doing it here: https://twitter.com/CryZe107/status/769497660998950912

            1. 2

              Huh! I was wrong then. I didn’t realize Rust did this sort of optimization. Good to know!

    2. 2

      I timed compilation out of curiosity about how the tradeoff of compilation speed for runtime speed would work out:

      sh-3.2$ cat ./compile
      #!/usr/bin/env sh
      for i in {1..20}
          /usr/bin/env rustc -O ~/Desktop/sum.rs
      sh-3.2$ time ./compile
      real    0m5.043s
      user    0m3.790s
      sys     0m1.083s
      sh-3.2$ time ./sum
      real    0m0.011s
      user    0m0.002s
      sys     0m0.003s

      So the computation takes a sum total of about a quarter of a second on your average modern-day Intel processor. Not bad. Makes me wonder whether the compiler itself could be more efficient at reducing the loop…