1. 22
    1. 5

      It’s great to see someone writing this up, but a few comments:

      • The usual extension for preprocessed C is .mii, not .tu. Using this will let syntax highlighting work correctly.
      • The first figure with the pipeline is missing the assembler step. In clang, this is integrated, but historically it is not (and I think GCC keeps it separate).
      • Having the compiler and driver in a single binary and the linker in a separate one is historical: clang needed a gcc-compatible driver and already had a load of the argument-parsing machinery. There was an effort for a while to provide a universal compiler driver in the LLVM project, which would probably have been cleaner (to drive clang, flang, and so on), but it was abandoned.
      • cc is not just convenient, it is part of POSIX.
      • GNU Binutils is available on all of the listed platforms, but it isn’t the default on more than half of them.
      • The multi-file section is very *NIX-specific. Windows build systems typically make the opposite tradeoff (invoking the compiler with multiple translation units) because process-creation costs are higher (and, I think, the Visual Studio compiler will do some transparent sharing of common include processing if you do).
      • The authors of lld and mold might be surprised to learn that linking cannot be parallelised. The authors of LINK.EXE and mold would be surprised to learn that you need a full relink from scratch if a single input file changes.
      • The language detection section contradicts itself. In this example, the compiler (not the driver) is setting up the default search paths.
      1. 4

        The usual extension for preprocessed C is .mii, not .tu.

        GCC will use .i (traditionally for C) and .ii (for C++). .mii is used for Objective-C++.

        The first figure with the pipeline is missing the assembler step. In clang, this is integrated, but historically it is not (and I think GCC keeps it separate).

        Yeah, it’s separate in GCC. GCC’s code generator outputs assembly and then invokes as. This is actually a bit of a pain sometimes because not only do you need an assembler, but there are cases you won’t know the size of an instruction during codegen because the assembler might be able to change it.

        1. 4

          GCC will use .i (traditionally for C) and .ii (for C++). .mii is used for Objective-C++.

          You’re right. I tend to default to .mii because then it doesn’t matter whether the input is C, C++, Objective-C, or Objective-C++ for syntax highlighting to work.

          GCC’s code generator outputs assembly and then invokes as. This is actually a bit of a pain sometimes because not only do you need an assembler, but there are cases you won’t know the size of an instruction during codegen because the assembler might be able to change it.

          This is even more true with the Plan 9 toolchain (which Go uses), which expands pseudos that contain relocations at link time. Depending on the distance / address of the target, you may need 1-3 instructions on modern architectures to materialise the address and the Plan 9 linker picks the shorter sequence. RISC-V tries to do the inverse and emit the inefficient sequence in the compiler and then ‘relax’ it back by deleting instructions and updating all other label addresses, which causes a huge amount more (complex) work in the linker than any other modern architecture / ABI.
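To make the relaxation cost concrete, here is a minimal sketch of the "emit the long sequence, then shrink it" approach. It is a toy model and not taken from any real linker: the `Item` type, the byte sizes, and the `SHORT_RANGE` reach are all assumptions chosen to keep the numbers small. The point it illustrates is that deleting bytes moves every later label, so one shrink can bring another call into range and the linker has to iterate.

```python
from dataclasses import dataclass

LONG_CALL = 8     # bytes: two-instruction sequence (assumed size)
SHORT_CALL = 4    # bytes: single-instruction sequence (assumed size)
SHORT_RANGE = 64  # hypothetical reach of the short form, in bytes


@dataclass
class Item:
    kind: str       # "label", "insn", or "call"
    name: str = ""  # label name, or the label a call targets
    size: int = 0   # current encoded size in bytes


def layout(items):
    """Assign addresses; return (label -> address, start address of each item)."""
    labels, starts, addr = {}, [], 0
    for it in items:
        starts.append(addr)
        if it.kind == "label":
            labels[it.name] = addr
        addr += it.size
    return labels, starts


def relax(items):
    """Shrink long calls whose target is in short range until a pass changes nothing.

    Returns the number of passes. Deleting bytes never increases a distance,
    so a decision made on a slightly stale layout stays valid.
    """
    passes, changed = 0, True
    while changed:
        changed = False
        passes += 1
        labels, starts = layout(items)
        for it, start in zip(items, starts):
            if it.kind == "call" and it.size == LONG_CALL:
                if abs(labels[it.name] - start) <= SHORT_RANGE:
                    it.size = SHORT_CALL
                    changed = True
    return passes


def demo():
    items = [
        Item("label", "start"),
        Item("call", "far", LONG_CALL),    # only in range after the call below shrinks
        Item("insn", size=52),
        Item("call", "start", LONG_CALL),  # in range immediately
        Item("label", "far"),
        Item("insn", size=4),
    ]
    before = layout(items)[0]["far"]
    passes = relax(items)
    after = layout(items)[0]["far"]
    return before, after, passes
```

In this toy input, `demo()` returns `(68, 60, 3)`: the label `far` moves from address 68 to 60, and it takes three passes because the first call only comes into range once the second has been shrunk. Picking the shortest sequence up front, as described for the Plan 9 linker, avoids having to delete bytes and renumber afterwards.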

          1. 1

            RISC-V tries to do the inverse and emit the inefficient sequence in the compiler and then ‘relax’ it back by deleting instructions and updating all other label addresses, which causes a huge amount more (complex) work in the linker than any other modern architecture / ABI.

            We have examined this kind of approach in our linker work and it really seems RISC-V made a mistake here.

            1. 3

              it really seems RISC-V made a mistake here.

              This statement works in almost any context.

        2. 1

          Yes, .i can be found in the 7th Edition cc source.

      2. 2

        Nit: c99 is POSIX, not cc.

        About parallel linking, that was in reference to using the historical linker, and lld and mold etc are mentioned later on.

        1. 2

          Ah, you’re right. I’m fairly sure cc was in POSIX 1997 but I can’t work out how to search that version.

          1. 1

            It was, but deprecated in favour of the c89 command. Putting the language revision in the command name seems like a mistake… https://pubs.opengroup.org/onlinepubs/007908799/xcu/cc.html

            1. 1

              Putting it in the name made some sense because you could detect c89 or c99 support by just checking for the file. C99 also introduced some breaking changes and so you generally didn’t want cc to compile with an unspecified dialect because that would break either old or new code.

      3. 1

        The authors of lld and mold might be surprised to learn that linking cannot be parallelised. The authors of LINK.EXE and mold would be surprised to learn that you need a full relink from scratch if a single input file changes.

        I interpreted that part to mean that the end-result of the linking operation is a single entity, as opposed to compilation, where every compilation unit can be done in parallel independently of one another.

        So it’s possible to link in parallel, but you still need to aggregate everything together in a single linked artifact. (This also seems to be strongly implied by the diagram for the linker chapter)

        1. 2

          That’s kind-of true, but there’s a lot of nuance. Most compilers do some form of separate compilation, but there’s a case to be made (especially on modern hardware) for doing whole-program compilation and some languages do. Modern C/C++ compilers have an option to do this, though they still build IR for translation units one at a time. Linking involves several logical steps that broadly fit into two categories:

          • Resolving symbols
          • Copying (the sections that contain) referenced symbols into the resulting output.

          Both of these can be done somewhat incrementally and this is actually what mold does: it starts running before all object code is available. Resolving symbols can be done as soon as the symbol definition is available and sections can be copied into the output eagerly if there is space reserved for them.
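The two categories above can be sketched as a toy linker. This is only an illustration under simplifying assumptions, not how lld or mold are implemented: the object-file layout (dicts with `sections` and `defines`), the `link` function, and the section names are all hypothetical, and no relocations are applied.

```python
def link(objects, entry):
    """Toy link: resolve symbols, then copy the sections reachable from `entry`."""
    # Step 1: resolve symbols. Map each defined name to (object index, section, offset).
    symtab = {}
    for i, obj in enumerate(objects):
        for sym, (sec, off) in obj["defines"].items():
            if sym in symtab:
                raise ValueError(f"duplicate symbol: {sym}")
            symtab[sym] = (i, sec, off)

    # Step 2: copy (the sections that contain) referenced symbols into the output.
    output = bytearray()
    placed = {}  # (object index, section name) -> offset in the output image
    worklist = [entry]
    while worklist:
        sym = worklist.pop()
        if sym not in symtab:
            raise ValueError(f"undefined symbol: {sym}")
        i, sec, _ = symtab[sym]
        if (i, sec) in placed:
            continue
        section = objects[i]["sections"][sec]
        placed[(i, sec)] = len(output)
        output += section["data"]
        worklist.extend(section["refs"])

    # Final address of every symbol whose section made it into the output.
    addresses = {
        sym: placed[(i, sec)] + off
        for sym, (i, sec, off) in symtab.items()
        if (i, sec) in placed
    }
    return bytes(output), addresses


# Two hypothetical object files; ".text.unused" is never referenced.
OBJECTS = [
    {
        "sections": {".text.main": {"data": b"MAIN", "refs": ["helper"]}},
        "defines": {"main": (".text.main", 0)},
    },
    {
        "sections": {
            ".text.helper": {"data": b"HELP", "refs": []},
            ".text.unused": {"data": b"DEAD", "refs": []},
        },
        "defines": {"helper": (".text.helper", 0), "unused": (".text.unused", 0)},
    },
]

image, addresses = link(OBJECTS, "main")
```

Here `image` is `b"MAINHELP"` and `addresses` is `{"main": 0, "helper": 4}`; the unreferenced section is left out. Both loops only need the objects seen so far, which is what makes it plausible to start work before every input is available.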

          1. 1

            but there’s a case to be made (especially on modern hardware) for doing whole-program compilation and some languages do.

            Absolutely. We are trying to get our customers to build with LTO on embedded projects because they are small enough that modern hardware can actually do the complex whole-program optimizations that were mostly dreamed about back in the 80s and 90s. (Although in practice, all you really need for the big benefit is to be able to get the whole call-graph into memory so you can inline effectively.) The largest programs for most customers will be maybe 100 compilation units with a total program size of about 1MB, with roughly 1000 functions. With current workstation hardware, that will easily fit in memory and can be analyzed quickly.

    2. 2

      Glad to see this covered – I also had the experience of picking it up the hard way, over many years!

      My own tips after writing a custom build system for C++

      • the order of objects and -l flags to the linker matters! I remember being very surprised / frustrated by this.

      • Sanitizers are built into compilers and trivial to use. Learn to use AddressSanitizer simply with -fsanitize=address! Ironically I think many people don’t use it because their build system doesn’t have good build variants (dbg, opt, asan), or they don’t know how to configure the build system. Plain make generally isn’t good enough.

      • Some flags have to be passed to both the compiler and link steps, and others don’t. I mostly figured this out by trial and error, and the error messages aren’t great.

      • You can compile and link in one driver invocation (c++ -o), or you can build each object separately and link (c++ -c).

      I thought the former might be faster, and it seems simpler, but there doesn’t seem to be any real advantage (edit: this post explains why – it’s literally subprocessing, which you can do better from a shell or build system). The latter is more common because it supports parallel and incremental builds.

      Some options (I think -ftime-trace for Clang, which outputs JSON compile-time traces) don’t even respect the first style of building.

      • Spending some time with a plain shell script and the compiler isn’t a bad way to learn. Now I can finally read all those crappy long error commands from big build systems. The most common and useful flags are -I to add to the #include path and -D to define a preprocessor symbol.
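On the first tip above (the order of objects and -l flags), here is a small simulation of why order matters with a traditional single-pass linker. It is a sketch under assumptions: the `unresolved` function and the dict-based inputs are hypothetical, and real linkers differ in detail (some newer ones are more forgiving about archive order).

```python
def unresolved(inputs):
    """Simulate a left-to-right, single-pass link; return leftover undefined symbols."""
    defined, undefined = set(), set()

    def add_object(obj):
        defined.update(obj["defines"])
        undefined.difference_update(obj["defines"])
        undefined.update(set(obj["refs"]) - defined)

    for kind, payload in inputs:
        if kind == "object":
            # Plain object files are always linked in.
            add_object(payload)
        else:
            # Archive: pull in only members that define a symbol that is
            # undefined *right now*, repeating until nothing more is pulled.
            taken = set()
            progress = True
            while progress:
                progress = False
                for idx, member in enumerate(payload):
                    if idx not in taken and undefined & set(member["defines"]):
                        add_object(member)
                        taken.add(idx)
                        progress = True
    return sorted(undefined)


# Illustrative inputs: an object that calls compress(), and a library that defines it.
MAIN = {"defines": ["main"], "refs": ["compress"]}
LIBZ = [{"defines": ["compress"], "refs": []}]

good = unresolved([("object", MAIN), ("archive", LIBZ)])  # like: cc main.o -lz
bad = unresolved([("archive", LIBZ), ("object", MAIN)])   # like: cc -lz main.o
```

`good` is `[]` and `bad` is `["compress"]`: when the library comes first, nothing is undefined yet, so no member is pulled in, and the reference from main.o is never satisfied.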

      edit after skimming the whole thing: This is really excellent, should be titled “Compilers: The Missing Manual”.

      I have actually looked at the manuals, e.g. https://gcc.gnu.org/onlinedocs/gcc-13.1.0/gcc/Invoking-GCC.h… but they seem to be missing the high level conceptual overview.

      They also seem to be missing the name “driver”, which is important, even though I have encountered a page about that before. It seems to be in a separate “GCC Internals” doc:

      https://gcc.gnu.org/onlinedocs/gcc-4.3.2/gccint/Driver.html#

      Other notes: I should have been using the -v flag to the driver all along! That’s a little embarrassing.

      Also it’s good to realize that g++ and gcc are both drivers, and the former sets the -I path to the location of the C++ stdlib and so forth.

      Looks like lots of great examples of ‘readelf’ as well, which I’ll go over again.

      It is kind of crazy how people generally pick this up piecemeal over so many years … A big problem in my mind is that it’s usually wrapped in GNU make or CMake or IDE configs, which add their own line noise on top of the raw driver invocations. Which in turn have a ton of logic before the actual tools are invoked.

      (copy of HN comment - https://news.ycombinator.com/item?id=35806237)