1. 2

  2. 1

    I can add another use case for this. In our CHERI embedded platform, we use a slightly modified version of -r that also does COMDAT merging for linking individual compartments. If the LLD code were a bit more cleanly structured, we’d also like to resolve a load of relocations, because any relocation whose value is the delta of two symbols in the same section can already be resolved at this point; unfortunately, the current code structure makes this hard. At the same time, we make all symbols hidden except for cross-compartment exports. This lets us validate that the only symbols visible from a compartment binary are those in the export table, and we can then audit compartment isolation at static link time for a firmware image.
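
    A minimal sketch (hypothetical names, not the CHERI toolchain code) of why such a delta relocation is already a constant during a relocatable link: the final base address of the section is unknown at -r time, but it cancels out of the difference between two offsets into the same section.

    ```cpp
    #include <cassert>
    #include <cstdint>

    // Hypothetical, simplified model: a symbol is a (section, offset) pair.
    struct Sym {
      int section;      // index of the section the symbol is defined in
      uint64_t offset;  // offset of the symbol within that section
    };

    // A delta relocation S_a - S_b can be resolved during a relocatable (-r)
    // link iff both symbols live in the same (merged) section: the unknown
    // final base address appears in both terms and cancels out.
    bool resolvableAtPartialLink(const Sym &a, const Sym &b) {
      return a.section == b.section;
    }

    int64_t resolveDelta(const Sym &a, const Sym &b) {
      assert(resolvableAtPartialLink(a, b));
      return (int64_t)a.offset - (int64_t)b.offset; // base address cancels
    }

    int main() {
      Sym a{0, 0x40}, b{0, 0x10}, c{1, 0x8};
      assert(resolveDelta(a, b) == 0x30);     // same section: constant now
      assert(!resolvableAtPartialLink(a, c)); // different sections: defer
      return 0;
    }
    ```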

    Doing this made me really want to aggressively refactor LLD. The code is nowhere near the quality of the rest of LLVM. All state is held in one massive global structure, and most of the control flow is if statements on booleans in that structure. It’s incredibly difficult to make even fairly small changes.

    1. 1

      CHERI

      The current code structure, which interleaves if (!config->relocatable) checks, makes a large number of misc features work with -r, e.g. --wrap, --compress-debug-sections, --build-id, -Map, etc. mold uses a side implementation for -r, and these features are all missing there. Yesterday I reported https://sourceware.org/bugzilla/show_bug.cgi?id=29820 (TLS_MODULE_BASE for x86) and noticed a similar issue with ld.lld’s RISC-V __global_pointer$. Using a separate implementation will easily miss these corner cases. (It’s true that adding any feature forces the developer to think about whether it works with -r. I believe most people don’t, but I’ll try to catch this in reviews.)

      1. 1

        Doing this made me really want to aggressively refactor LLD. The code is nowhere near the quality of the rest of LLVM. All state is held in one massive global structure, and most of the control flow is if statements on booleans in that structure. It’s incredibly difficult to make even fairly small changes.

        Hmm, I think the quality of LLD is actually higher than that of most parts of LLVM (bug rate, readability, test coverage (more than 90%), etc.), as I often read random pieces of LLVM for learning purposes.

        It would be nice if LLD did not use the global config and target variables. Now that it does, it certainly makes some people frown, but I don’t know how much that fact negatively impacts users. Linking is a slow process and requires careful scheduling. I haven’t been convinced by a main use case where calling library code (a side product of LLVM’s modularized design is that nearly everything is in a public header, as long as there are cross-LLVM-component references) is better than spawning processes. I think Windows users have mentioned that spawning processes can be slow, but that benefit has, IMHO, a very low priority, especially in a non-Windows environment.

        Structural improvement is still appreciated. I cannot think of major ways to improve it, though ;-)

        1. 1

          The big problem that I have with LLD’s code is the lack of modularity. Clang is a compiler, but I can easily reuse the parser for static analysis or for syntax highlighting, indexing, and autocompletion. Hal has modified it to JIT C++ templates with very few code changes. LLVM’s optimisation infrastructure is incredibly modular and allows creating very different transform and analysis pipelines. In the past, we’ve used it to do dynamic analysis and simulation of sandboxing policies, others have used it for a myriad of different things. The LLVM back ends are intended for code generation but folks have been able to reuse them for binary translation, formal verification, and so on.

          In contrast, LLD’s code is usable only as a linker. It duplicates code in the back ends. The back ends statically apply relocations that are computable at compile time (e.g. the difference between symbol addresses in the same section). There’s no reason that this couldn’t be a reusable library that both the linker and the compiler use. In addition, the way that LLD applies relocations is incredibly error prone. Every relocation has the same structure: write some value into either a full word of some length or into the immediate field of some instruction. The back ends all provide descriptions of the structures of instructions but, rather than use those, the relocation processing in LLD does manual bit twiddling. It doesn’t even have template helpers for ‘write an immediate encoded in these bit ranges’, which would significantly simplify the code and make it easier to review. Instead, everything is masks and shifts with magic numbers.
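
          To make the complaint concrete, here is a sketch of the kind of helper being asked for (the name writeImmBits is hypothetical, not LLD code): write an immediate into a named bit range of an instruction word, with the mask and shift derived from the bit positions rather than spelled out as magic numbers, so a reviewer checks “bits 21..10” against the ISA manual instead of 0x3FFC00.

          ```cpp
          #include <cassert>
          #include <cstdint>

          // Hypothetical helper: OR `imm` into bits [Hi:Lo] of a 32-bit
          // instruction word. Mask and shift are computed from the bit range.
          template <unsigned Hi, unsigned Lo>
          void writeImmBits(uint32_t &insn, uint32_t imm) {
            static_assert(Hi < 32 && Lo <= Hi, "invalid bit range");
            constexpr uint32_t width = Hi - Lo + 1;
            constexpr uint32_t mask = width == 32 ? ~0u : (1u << width) - 1;
            assert((imm & ~mask) == 0 && "immediate does not fit in the field");
            insn |= (imm & mask) << Lo;
          }

          int main() {
            // e.g. the 12-bit immediate of an AArch64 ADD (immediate)
            // instruction lives in bits [21:10].
            uint32_t insn = 0x91000000; // ADD Xd, Xn, #0 (opcode bits only)
            writeImmBits<21, 10>(insn, 0x123);
            assert(insn == (0x91000000u | (0x123u << 10)));
            return 0;
          }
          ```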

          The comment ratio is spectacularly low. Most of LLVM has doc comments on each class and field. Most of LLD is completely undocumented. For example, when I added a new relocation, I had to add it to three separate enum definitions. Why? There’s some abstraction in LLD that might make sense, but since it isn’t documented anywhere I doubt anyone except you or Rui knows why.

          There is no separation of concerns. Everything assumes a specific linker flow. Trying to modify it to do anything else involves touching a load of different places. BFD ld provides built-in linker scripts to define the default behaviour for a platform. LLD doesn’t; it hard-codes this behaviour instead, with no target abstraction layer for defining it, just a load of if statements scattered through the code with things like ‘if this is MIPS, do this, unless it’s FreeBSD, then do that’. This makes adding a target that wants to do something that isn’t almost identical to an existing one very hard.

          To give a concrete example of where I’ve given up trying to use LLD: FreeBSD kernel modules are .o files, and the loader does static linking, because kernel modules are loaded zero or one times on any running system and so don’t need to pay the overheads of shared libraries. I’d like to use C++ in the kernel, but the loader can’t handle things like COMDAT merging (and I don’t want to add that code to ring 0). I wanted to write a tool that would take a load of .o files, identify common COMDATs, emit those as separate kernel modules, and add them to the dependency list of the original set. 90% of the code that I would need to write such a tool exists in LLD, but not in a state that is in any way extractable.
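
          The core of the tool described above is just counting COMDAT group signatures across objects. A sketch of that core over an in-memory model (hypothetical types, no real ELF parsing) might look like:

          ```cpp
          #include <cassert>
          #include <map>
          #include <set>
          #include <string>
          #include <vector>

          // Hypothetical model: each kernel module (.o) is reduced to the set
          // of COMDAT group signatures it defines.
          using Module = std::set<std::string>;

          // Find signatures defined by more than one module: these are the
          // COMDATs worth hoisting into a shared "common" module that the
          // originals then depend on, so the in-kernel loader never has to
          // deduplicate them itself.
          std::set<std::string> commonComdats(const std::vector<Module> &mods) {
            std::map<std::string, int> count;
            for (const Module &m : mods)
              for (const std::string &sig : m)
                ++count[sig];
            std::set<std::string> shared;
            for (const auto &[sig, n] : count)
              if (n > 1)
                shared.insert(sig);
            return shared;
          }

          int main() {
            std::vector<Module> mods = {
                {"_ZTS3Foo", "_ZTV3Foo", "only_in_a"},
                {"_ZTS3Foo", "only_in_b"},
            };
            std::set<std::string> shared = commonComdats(mods);
            assert(shared == (std::set<std::string>{"_ZTS3Foo"}));
            return 0;
          }
          ```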

          It’s not as bad as the MIPS back end, but having worked on clang, LLVM optimisations, target-agnostic code generation, a couple of in-tree and a couple of out-of-tree back ends, libc++, compiler-rt, and LLD, LLD is by far my least favourite bit of the codebase to touch. Readability is subjective, but having dropped into most of the LLVM code, the only place where it’s taken me longer to figure out what’s going on than LLD was LLDB, and it has the excuse that it was an Apple project written with Apple coding conventions and then gradually moved to LLVM conventions.

          1. 1

            Thanks for the genuine comments.

            The big problem that I have with LLD’s code is the lack of modularity. Clang is a compiler, but I can easily reuse the parser for static analysis or for syntax highlighting, indexing, and autocompletion. Hal has modified it to JIT C++ templates with very few code changes. LLVM’s optimisation infrastructure is incredibly modular and allows creating very different transform and analysis pipelines. In the past, we’ve used it to do dynamic analysis and simulation of sandboxing policies, others have used it for a myriad of different things. The LLVM back ends are intended for code generation but folks have been able to reuse them for binary translation, formal verification, and so on.

            I perceive this differently in practice… I have used Clang for syntax highlighting, indexing, and auto-completion in my language server (ccls; my involvement with llvm-project started with improving a good cross-reference tool to facilitate code reading). These are mostly end-to-end tasks, except that they skip code generation. If we look at the individual modules involved, the lexing, parsing, and semantic analysis components of Clang are actually difficult to plug into a downstream project. Their internal states are just difficult to serialize. A fair comparison for the linker is probably to feed it input and get a linker map, or a random statistics file, which can easily be done with lld. (There is a reason that the C interface libclang existed and was, for a long time, used by a lot of IDE helpers before modern LSP servers took over.)

            Does lld provide early exits for these specific tasks? No, but it’s rather straightforward to add one if someone wants to. I don’t think these uses have demonstrated high enough priority to affect the linker’s architectural design. For sure, such work must not sabotage the linker’s performance goal, as performance appears to be the most critical metric for most people (after the other critical goals, such as robustness, have been achieved).

            In contrast, LLD code is usable as a linker. It duplicates code in back ends. The back ends statically apply relocations that are computable at compile time (e.g. the difference between symbol addresses in the same section). There’s no reason that this couldn’t be a reusable library that both the linker and compiler could use. In addition, the way that LLD applies relocations is incredibly error prone. Every relocation is of the same structure: write some value into either a full word of some length or into the immediate field of some instruction. The back ends all provide descriptions of the structures of instructions but, rather than use those, the relocation processing in LLD does manual bit twiddling. It doesn’t even have template helpers for ‘write an immediate encoded in these bit ranges’, which would significantly simplify the code and make it easier to review. Instead, everything is masks and shifts with magic numbers.

            To give a concrete example of where I’ve given up trying to use LLD: FreeBSD kernel modules are .o files and the loader does static linking because kernel modules are loaded zero or one times on any running system and so they don’t need to pay the overheads for shared libraries. […]

            Err, I hadn’t heard of such a complaint, but I’d say the code duplication is minimal and comparable to mold’s. In comparison, Apple ld64 and binutils (bfd/elf* and gold/{powerpc,s390,x86_64,…}.cc) have much more duplication.

            I know that many may argue that many tools can share code for object file format manipulation, but in practice the code duplication among llvm-project’s most commonly used ELF tools is of a reasonable scale, and further code sharing is extremely difficult without hurting other properties (see my and others’ replies to https://discourse.llvm.org/t/object-file-modification-writing/65954).

            For relocation resolving, different ports do have different handling, and there is a little, but not too much, code that can be shared, e.g. or32AArch64Imm in lld/ELF/Arch/AArch64.cpp and similar routines in other ports. Defining these helpers as static functions in the arch-specific file strikes a good balance to me, as they can also be annotated with appropriate comments.
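
            For readers who haven’t seen it, a helper of that kind is only a few lines long. A simplified standalone sketch (not the actual lld source, which goes through an endian-aware or32le() routine) of ORing an immediate into an AArch64-style 12-bit field looks roughly like:

            ```cpp
            #include <cassert>
            #include <cstdint>
            #include <cstring>

            // Simplified sketch: OR the low 12 bits of `imm` into bits [21:10]
            // of the 32-bit instruction at `loc` (the immediate field used by
            // AArch64 ADD/LDR-immediate encodings). For simplicity this assumes
            // a little-endian host; the real lld helper abstracts endianness.
            static void or32AArch64ImmSketch(uint8_t *loc, uint64_t imm) {
              uint32_t insn;
              std::memcpy(&insn, loc, 4);
              insn |= (uint32_t)(imm & 0xFFF) << 10; // 12-bit field at [21:10]
              std::memcpy(loc, &insn, 4);
            }

            int main() {
              uint8_t buf[4] = {0, 0, 0, 0};
              or32AArch64ImmSketch(buf, 0x7FF);
              uint32_t out;
              std::memcpy(&out, buf, 4);
              assert(out == (0x7FFu << 10));
              return 0;
            }
            ```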

            The foremost task of lld/ELF is to do end-to-end work like GNU ld’s ELF port, to be an almost drop-in replacement. There are many passes which don’t have a very rigid execution order. Each pass computes a portion of the final output or refines some previous computation. A pass has many inherent assumptions about what its input is, and (a) the input is difficult to serialize, (b) serialization would likely greatly harm performance, so it just cannot be as flexible as transformations on LLVM IR. With that said, if a downstream project doesn’t mind copying some code, it is not too difficult to adapt some code to do a specific task, but it’s probably not upstream’s responsibility to expose every pass as a public API (which would certainly be unstable, as I am still experimenting with ways to make lld/ELF faster).

            The comment ratio is spectacularly low. Most of LLVM has doc comments on each class and field. Most of LLD is completely undocumented. For example, when I added a new relocation, I had to add it to three separate enum definitions. Why? There’s some abstraction in LLD that might make sense, but since it isn’t documented anywhere I doubt anyone except you or Rui knows why.

            I am not following, either… Running tokei on directories such as lld/ELF and llvm/lib/Transforms shows comparable comment ratios. Speaking of difficult-to-understand code, I think Clang Sema and llvm/lib/CodeGen are much worse to deal with than lld…

            There is no separation of concerns. Everything assumes a specific linker flow. Trying to modify it to do anything else involves touching a load of different places. BFD ld provides built-in linker scripts to define the default behaviour for a platform. LLD doesn’t, and hard-codes this behaviour instead, but there is no target abstraction layer for defining this behaviour, just a load of if statements scattered through the code with things like ‘if this is MIPS, do this, unless it’s FreeBSD, then do this’. This makes adding a target that wants to do something that isn’t almost identical to an existing one very hard.

            As someone who has reported many GNU ld bugs, I actually think there is a ton of GNU ld logic which is not representable in its internal linker scripts. It does print an internal linker script, but that is far from all of its logic. Some constructs in the linker script (e.g. .ctors in .init_array, RELRO) have special customization in its C code. More is not representable in a linker script at all: e.g. it special-cases many section names (special_sections_d in bfd/elf.c) even if one uses a linker script without specifying these sections. I noticed this when I found that special section names may magically affect orphan section placement.