1. 4
  1.  

  2. 7

    As a bit of a counter-argument to this, here is how long it takes to compile the latest Vim on my system*:

    $ time CC=gcc ./configure
    CC=gcc ./configure  8.18s user 4.89s system 98% cpu 13.312 total
    $ time make
    make  135.47s user 5.84s system 98% cpu 2:22.76 total
    
    $ time CC=gcc CFLAGS=-O0 ./configure
    CC=gcc CFLAGS=-O0 ./configure  7.55s user 4.82s system 98% cpu 12.599 total
    $ time make
    make  30.14s user 3.79s system 97% cpu 34.648 total
    
    $ time CC=clang CFLAGS=-O0 ./configure
    CC=clang CFLAGS=-O0 ./configure  19.21s user 8.10s system 98% cpu 27.831 total
    $ time make
    make  37.16s user 4.04s system 98% cpu 41.756 total
    
    $ time CC=tcc ./configure
    CC=tcc ./configure  2.32s user 1.97s system 97% cpu 4.391 total
    $ time make
    make  6.47s user 1.94s system 98% cpu 8.543 total
    

    So gcc with the common default of -O2 takes almost 2.5 minutes, gcc with -O0 still takes about 35 seconds, and tcc takes 8.5 seconds.

    In other words, all this inlining and whatnot is certainly nice, but it isn’t free: you pay for it in compilation time. The Go compiler was quite a bit faster a few years ago; it’s comparatively slower now because it does more optimisations (it’s still very fast compared to most other compilers, just less fast than it used to be).

    You also see this with clang: when it was first released, compiling with it was a lot faster than with gcc, but it also did fewer optimisations. Now that many more advanced optimisations have been added, it actually seems slower than gcc (at least in this one particular case).

    tcc, on the other hand, does no optimisations at all and is about 4x faster than gcc at -O0 (I’m not sure whether -O0 disables all optimisations in gcc), which is why I like using tcc for development.

    Another aspect where you “pay” for it is in codebase complexity and maintenance. See e.g. “clang-11.0.0 miscompiles SQLite” for a recent example.

    Not saying Go can’t or shouldn’t continue to improve in this regard; but I do think it’s a bit more complex than outlined here. Like many aspects of designing a language/compiler there are a lot of trade-offs to be made.


    *: Run make clean distclean between the runs if you want to reproduce this. My laptop is cheap and slow, so you’ll probably get much faster results, but the numbers still show the comparative differences quite clearly.

    1. 2

      This is my thought. Golang has explicitly chosen compile time performance over runtime performance, which is a perfectly reasonable trade-off. Golang isn’t trying to be the language that generates the most efficient binaries. I’m always surprised when people complain that Golang isn’t a fundamentally different language.

    2. 2

      If you compile the given example with “-gcflags=-m=2” you’ll see the reason why sum is not inlined:

      cannot inline sum: unhandled op RANGE
      

      Rewriting “range” with “for” yields:

      cannot inline sum: unhandled op FOR
      

      So it seems the Go compiler doesn’t inline functions that contain loops.
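
      For anyone who wants to try this themselves, here is a minimal sketch. The original example isn’t reproduced in this thread, so the sum below is my own stand-in (an assumption, not the actual code under discussion): a small function whose body is a single range loop. The exact diagnostics, and whether sum gets inlined at all, depend on the Go version you build with.

      ```go
      package main

      import "fmt"

      // sum is a hypothetical stand-in for the function from the example
      // under discussion: a small function whose body contains a range loop.
      func sum(xs []int) int {
      	total := 0
      	for _, x := range xs {
      		total += x
      	}
      	return total
      }

      func main() {
      	// The call site the compiler would have to inline sum into.
      	fmt.Println(sum([]int{1, 2, 3}))
      }
      ```

      Save it as main.go and build with “go build -gcflags=-m=2 main.go”; the compiler then prints its inlining decision for each function along with the reason.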