As a bit of a counter-argument to this, here is how long it takes to compile the latest Vim on my system*:
$ time CC=gcc ./configure
CC=gcc ./configure 8.18s user 4.89s system 98% cpu 13.312 total
$ time make
make 135.47s user 5.84s system 98% cpu 2:22.76 total
$ time CC=gcc CFLAGS=-O0 ./configure
CC=gcc CFLAGS=-O0 ./configure 7.55s user 4.82s system 98% cpu 12.599 total
$ time make
make 30.14s user 3.79s system 97% cpu 34.648 total
$ time CC=clang CFLAGS=-O0 ./configure
CC=clang CFLAGS=-O0 ./configure 19.21s user 8.10s system 98% cpu 27.831 total
$ time make
make 37.16s user 4.04s system 98% cpu 41.756 total
$ time CC=tcc ./configure
CC=tcc ./configure 2.32s user 1.97s system 97% cpu 4.391 total
$ time make
make 6.47s user 1.94s system 98% cpu 8.543 total
So gcc with the common default of -O2 takes almost 2.5 minutes, -O0 still takes about 35 seconds, and tcc takes 8.5 seconds.
In other words, all this inlining and whatnot is certainly nice, but it also isn’t free: you pay for it in compilation time. The Go compiler was quite a bit faster a few years ago; the reason it’s comparatively slower now is that it does more optimisations (it’s still very fast compared to most other compilers, just less fast than it used to be).
You also see this with clang: when it was first released, compiling with it was a lot faster than with gcc, but it also did fewer optimisations. Now that they’ve added many more advanced optimisations it actually seems slower than gcc (at least in this one particular case).
tcc, on the other hand, does no optimisations at all and is 4x faster than gcc even with all optimisations disabled (not sure if -O0 disables all optimisations in gcc?), which is why I like using tcc for development.
Another aspect where you “pay” for it is in codebase complexity and maintenance. See e.g. “clang-11.0.0 miscompiles SQLite” for a recent example.
Not saying Go can’t or shouldn’t continue to improve in this regard; but I do think it’s a bit more complex than outlined here. Like many aspects of designing a language/compiler there are a lot of trade-offs to be made.
*: run make clean distclean between the runs if you want to reproduce. Also, my laptop is cheap and slow, so you’ll probably get much faster results. It still shows the comparative differences quite clearly though.
This is my thought. Golang has explicitly chosen compile time performance over runtime performance, which is a perfectly reasonable trade-off. Golang isn’t trying to be the language that generates the most efficient binaries. I’m always surprised when people complain that Golang isn’t a fundamentally different language.
If you compile the given example with “-gcflags=-m=2” you’ll see the reason why sum is not inlined:
Rewriting “range” with “for” yields:
So it seems the Go compiler doesn’t inline functions that contain loops.
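If you want to try this yourself, here is a minimal sketch. The original example isn't reproduced in this thread, so sumRange and sumFor below are hypothetical stand-ins for the two variants of sum (one with "range", one with a plain "for"). Build it with `go build -gcflags=-m=2` and look for the "can inline" / "cannot inline" lines for each function. What you see will depend on your Go version; as far as I know, later releases relaxed the restriction on loops, so don't expect the same output everywhere.

```go
package main

import "fmt"

// sumRange adds up the elements using a range loop.
// (Hypothetical stand-in for the "range" variant of sum.)
func sumRange(xs []int) int {
	total := 0
	for _, x := range xs {
		total += x
	}
	return total
}

// sumFor adds up the elements using a three-clause for loop.
// (Hypothetical stand-in for the rewritten "for" variant.)
func sumFor(xs []int) int {
	total := 0
	for i := 0; i < len(xs); i++ {
		total += xs[i]
	}
	return total
}

func main() {
	xs := []int{1, 2, 3, 4}
	// Both variants compute the same result; only the compiler's
	// inlining decision (reported by -gcflags=-m=2) may differ.
	fmt.Println(sumRange(xs), sumFor(xs))
}
```

The two functions behave identically at runtime, so any difference in the -m=2 output comes purely from how the compiler costs each loop form.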