I never understood the people who build a reduced set of back ends for clang. The extra back ends are such a tiny portion of the total code that they make almost no difference to compile times or to final binary size, and leaving them out dramatically reduces the utility of the final binary.
The last time I measured the build time of a build with all backends against a build with only the X86 backend, the difference was rather significant.
All backends:
[4182/4182] Generating ../../bin/llvm-readelf
X86 backend only:
[3196/3196] Generating ../../bin/llvm-readelf
This was about one year ago, on Linux. More details here.
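For anyone wanting to repeat the comparison, the two builds differ only in the LLVM_TARGETS_TO_BUILD CMake option. The sketch below just assembles the two configure command lines. It is not the exact configuration used for the numbers above: the source and build directory names, the Release build type, and the choice of enabling only clang are all assumptions made for illustration.

```python
import shlex


def cmake_configure(targets, source_dir="llvm", build_dir="build"):
    """Return the argv for configuring LLVM with the given backends.

    targets is either the string "all" or a list of backend names such
    as ["X86"]. source_dir and build_dir are placeholder paths.
    """
    # CMake expects a semicolon-separated list of backend names.
    target_list = targets if isinstance(targets, str) else ";".join(targets)
    return [
        "cmake", "-G", "Ninja",
        "-S", source_dir, "-B", build_dir,
        "-DCMAKE_BUILD_TYPE=Release",
        "-DLLVM_ENABLE_PROJECTS=clang",
        f"-DLLVM_TARGETS_TO_BUILD={target_list}",
    ]


# Print both configure lines so they can be pasted into a shell.
for targets in ("all", ["X86"]):
    print(shlex.join(cmake_configure(targets)))
```

Running ninja in each build directory afterwards and timing it gives the wall-clock comparison.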
That’s not quite a fair comparison though, because now you can’t cross compile. If you need a cross-compiler for any architecture then you’re looking at another 8 minutes to build a version of LLVM that can target that one. Maybe I’m an outlier, but once I got used to a compiler that could target any architecture, I have found it incredibly limiting to go back to one that can’t. In particular, if I’m writing something performance critical, I often want to look at the generated assembly on 3-4 architectures to make sure that I’m not baking in any ISA-specific or ABI-specific assumptions.
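As a rough illustration of that workflow, here is a small Python wrapper that asks one clang binary for assembly on several targets. It only works if clang was built with those back ends. The triple list and the helper names are assumptions for the example, not anything from a real project.

```python
import subprocess

# Example target triples; any triple whose back end is built in will do.
TRIPLES = [
    "x86_64-unknown-linux-gnu",
    "aarch64-unknown-linux-gnu",
    "riscv64-unknown-linux-gnu",
    "powerpc64le-unknown-linux-gnu",
]


def clang_command(source_path, triple, opt="-O2"):
    """Return the argv that makes clang print assembly for one target."""
    # -S stops after code generation; "-o -" writes the result to stdout.
    return ["clang", f"--target={triple}", opt, "-S", "-o", "-", source_path]


def emit_assembly(source_path, triple):
    """Run clang and return the generated assembly as text."""
    result = subprocess.run(
        clang_command(source_path, triple),
        capture_output=True,
        text=True,
        check=True,  # raise if clang fails, e.g. the back end is missing
    )
    return result.stdout
```

Something like `for t in TRIPLES: print(emit_assembly("hot_loop.c", t))` then dumps all four listings, where hot_loop.c is a hypothetical file. This is easiest with a self-contained function, since code that includes system headers may also need a sysroot for each target.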
The last line of your ninja output also reminds me that it’s not just clang: all of the other utilities use the back ends too. Your X86-only build, for example, generates an objdump that can disassemble only x86 binaries. If you want to look at an Arm binary, you need to install a separate version of objdump. Again, that’s something I find incredibly annoying - I can’t just run objdump, I need to remember to invoke the specific one for the target that I’m disassembling.
Did you mean that it makes no difference to the compile time of llvm/clang itself, or to compile times when using clang (all backends vs a single backend) to build other software?
It doesn’t make a significant difference to the compile time of llvm/clang, relative to the value of the features that you get in exchange. You can get a 100% reduction in the compile time of LLVM by not building any of it and then the built clang is useful for 0% of use cases. You get a 20% (actually surprised by that, it was much less last time I tried) reduction in LLVM’s build time by removing support for all cross-compile scenarios and any support for the other tools to interact with binaries for any other architectures (so no objdump, no readelf, and so on, for any binary that isn’t native to the host architecture). I consider that to be significantly more than a 20% reduction in features. Maybe I’m an outlier here.
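For a rough sanity check, the ninja step counts quoted earlier in the thread (4182 with all backends, 3196 with X86 only) can be turned into a percentage. This is only a sketch: it compares the number of build steps, not wall-clock time, and steps are not all equally expensive, so it is a proxy at best.

```python
# Step counts taken from the two ninja output lines quoted in the thread.
ALL_BACKENDS_STEPS = 4182
X86_ONLY_STEPS = 3196


def percent_reduction(before, after):
    """Return how much smaller `after` is than `before`, in percent."""
    return 100.0 * (before - after) / before


saved = ALL_BACKENDS_STEPS - X86_ONLY_STEPS
reduction = percent_reduction(ALL_BACKENDS_STEPS, X86_ONLY_STEPS)
print(f"{saved} fewer steps, {reduction:.1f}% reduction")
# prints "986 fewer steps, 23.6% reduction"
```

So the step count lands in the same ballpark as the build-time figure, which is at least consistent.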
Some platforms don’t have powerful build machines, so what’s a small difference in compile time on a CPU-rich amd64 box may be a big difference on macppc or octeon, for example. How much utility is there in having an x86-64 backend available on an octeon machine?
Do you ever want to inspect an x86 binary on the octeon? If so, having an objdump that works there is useful. That’s a use case I hit pretty often.
If the machine is very slow, then I’d generally cross build from another machine. That means: