1. 48

  2. 9

    I can’t say a 10% improvement is making LLVM fast again, we would need a 10x improvement for it to deserve that label. But it’s a start…

    It’s a shame, one of the standout features of llvm/clang used to be that it was faster than GCC. Today, an optimized build with gcc is faster than a debug build with clang. I don’t know if a 10x improvement is at all feasible, though; tcc is 10-20x faster than gcc and clang, and part of the reason is that it does a lot less. The architecture of such a compiler may by necessity be too generic.

    Here’s a table listing build times for my project with and without optimizations with gcc, clang, and tcc. TCC with optimizations is shown only for completeness, since it does some; the time isn’t appreciably different. 20 runs each.

    │                             │Clang -O2 │Clang -O0 │GCC -O2   │GCC -O0  │TCC -O2     │TCC -O0     │
    │Average time (s)             │1.49 ±0.11│1.24 ±0.08│1.06 ±0.08│0.8 ±0.04│0.072 ±0.011│0.072 ±0.014│
    │Speedup compared to clang -O2│        - │     1.20 │     1.40 │    1.86 │      20.59 │      20.69 │
    │Slowdown compared to TCC     │    20.68 │    17.20 │    14.72 │   11.12 │          - │          - │
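
    As a rough check, the two ratio rows can be recomputed from the rounded averages in the table. This is only a sketch: the dictionary and the `ratio` helper are illustrative names, and the results may differ in the last digit from the table if it was computed from unrounded timings.

```python
# Average build times in seconds, copied from the table (rounded values).
avg = {
    "Clang -O2": 1.49,
    "Clang -O0": 1.24,
    "GCC -O2": 1.06,
    "GCC -O0": 0.80,
    "TCC -O2": 0.072,
    "TCC -O0": 0.072,
}


def ratio(slow, fast):
    """Return how many times longer `slow` takes than `fast`."""
    return avg[slow] / avg[fast]


for name in avg:
    speedup = ratio("Clang -O2", name)   # row "Speedup compared to clang -O2"
    slowdown = ratio(name, "TCC -O0")    # row "Slowdown compared to TCC"
    print(f"{name:10s} speedup: {speedup:6.2f}  slowdown vs tcc: {slowdown:6.2f}")
```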
    1. 4

      It makes sense that compile-time performance would converge as clang/llvm added the optimization passes and codegen needed to approach GCC’s codegen quality.

      1. 4

        I wonder how much of it is optimizations and such, and how much is true “bloat” due to trying to be a generic “compiler framework” with many frontends and backends necessitating more abstractions, just like GCC. TCC is just a C compiler, and nothing else. This might also be an important reason it’s much more nimble.

      2. 1

        That quote is a great example of why people should talk about performance improvements with ratios rather than relative multipliers. Does 10x 10% mean 100% (0 runtime), 0.9 ^ 10 (about 0.35 of the original runtime), or what?
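
        To make the ambiguity concrete, here is a small sketch of three possible readings of “10x a 10% improvement”. The function names are made up for illustration; each returns the fraction of the original runtime that would remain.

```python
def additive(step, n):
    """Remaining runtime if n improvements of size `step` simply add up."""
    return max(0.0, 1.0 - step * n)


def compounded(step, n):
    """Remaining runtime if each improvement applies to what is left."""
    return (1.0 - step) ** n


def times_faster(factor):
    """Remaining runtime if "10x" means the build is `factor` times faster."""
    return 1.0 / factor


print(additive(0.10, 10))     # additive reading
print(compounded(0.10, 10))   # compounding reading
print(times_faster(10))       # "ten times faster" reading
```

        The three readings give very different targets, which is why stating the before/after ratio directly is less ambiguous.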

      3. 14

        I have to admit – as someone who lives in the USA, the “Make ___ ___ Again” formulation is super unpleasant in basically every context.

        1. 8

          Yes please. Track compile time and memory usage. Revert offending commits. Don’t let them bloat LLVM.

          I do remember how fast clang was.

          1. 8

            By “bloat LLVM” do you mean “add optimization passes to make the generated code faster”?

            1. 7

              You might have overlooked the table with different optimization levels and their effect on compilation time.

              I’ve been following clang for a while and I assure you, the generated code hasn’t gotten fast enough to justify the exponential growth in CPU time and RAM usage during compilation, even though several hardware generations should have made compilation faster.

              1. 4

                The blog post explicitly calls out that some of the optimisations causing these slowdowns are of marginal benefit.

            2. 6

              It’s great to have this info, and I hope the LLVM project will pay attention to it.

              1. 1

                The comparisons were always a bit unfair. They were comparing clang to Apple GCC. I’ve no idea why Apple GCC was so slow, but the same version of FSF GCC in a FreeBSD VM was more than twice the speed of Apple GCC on the host. FSF GCC from MacPorts was similar to the FreeBSD version, so it wasn’t just OS X process creation overhead (though that was some of it). I always hated that page, because the baseline was unfair (“look, we hacked on GCC to make it really slow and now we have something faster!”).

                To me, the advantages of LLVM have always been related to the approachable and modular / reusable codebase (and a license that makes it easy to take advantage of this modularity), not anything to do with performance.

                1. 1

                  Apple are using the GPLv2 versions of everything from GNU (that is, forks circa 2007).

                  1. 2

                    Not really. Apple’s GCC was originally a fork and was then moved into the FSF repo, but it contained a load of changes that were never merged into the main FSF branch. FSF GCC 4.2.1 and Apple GCC 4.2.1 are very different and Apple’s GCC 4.2.1 (which was used as the baseline for these comparisons) was much slower.

                    The page that I am talking about was created in 2008 (and was removed a year or two ago), so a 2007 version of GCC was not particularly old at the time. I don’t object to using GCC 4.2.1 as a comparison, I object to using Apple’s (significantly slower) fork as the baseline.