1. 33
  1.  

  2. 27

    This is a really great article. The root cause of a lot of these is that Chris originally couldn’t decide how much of the C abstract machine should end up in LLVM IR.

    As the article says, the ‘opaque pointer’ work has picked up recently. This would significantly reduce the diff we have for the CHERI LLVM port. There are a load of places where the lack of opaque pointers forces people to insert casts to i8*. This is normally fine, but with CHERI we use address space 0 to indicate 64-bit addresses that are protected by the default data capability and address space 200 to indicate capabilities. If you cast a capability to i8* (which is implicitly in address space 0) then you get an assertion failure in debug builds and nonsense in release builds. We have a load of patches that just get the address space and propagate it. With opaque pointers, these bitcasts go away and so do our patches.
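
    A minimal sketch of the failure mode, in LLVM IR with today’s typed pointers (the function name is made up; address spaces 0 and 200 follow CHERI’s convention):

    ```llvm
    ; A capability pointer lives in address space 200.
    define void @zero_byte(i32 addrspace(200)* %cap) {
      ; Invalid: a bitcast cannot change the address space, so casting a
      ; capability straight to i8* (implicitly address space 0) trips an
      ; assertion in debug builds:
      ;   %p = bitcast i32 addrspace(200)* %cap to i8*
      ; The cast has to carry the address space along instead:
      %p = bitcast i32 addrspace(200)* %cap to i8 addrspace(200)*
      store i8 0, i8 addrspace(200)* %p
      ret void
    }
    ```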

    The constant expression thing was fixed in MLIR. MLIR doesn’t differentiate between instructions and constant expressions: constant expressions are just instructions that have constant inputs and no side effects. This makes passes over them a lot more uniform. I’d love to see that brought back to LLVM IR.

    The GEP observation is very interesting. Again, in the CHERI port, we have added a PTRADD node in SelectionDAG that has the same semantics as the author’s desired IR-level instruction. I think the motivation for GEP originally was twofold: it maps cleanly from source-level C constructs and it also lowers cleanly to complex addressing modes. Note, however, that neither of these is actually true anymore. For C / C++, clang can often use GEPs and LLVM IR structures, but it’s about the only front end that can. For Objective-C, we bitcast pointers to i8* and then do a GEP with an offset that’s calculated independently and never lower the type to an LLVM IR structure. Pony does the same thing (even though it knows the compile-time constant offsets). Alias analysis doesn’t get much help from the structure in a GEP: it just ends up calculating the offsets of GEPs and seeing if they alias. In the back end, GEPs aren’t used at all, they’re lowered to arithmetic and the back end pattern matches on DAGs that have arithmetic structures as the address operands to load/store nodes that match whatever addressing modes it has.
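
    The “GEPs are just byte arithmetic” point is visible from the C side too. A sketch (the struct and helper here are made up for illustration): the address a back end actually computes for a field access is just the base pointer plus a constant byte offset, which is all the information a GEP ultimately carries.

    ```c
    #include <stddef.h>
    #include <stdint.h>

    /* Illustrative struct; any layout works. */
    struct Pair { int32_t a; int64_t b; };

    /* The address of p->b, computed the way a back end sees a lowered GEP:
       base pointer plus a byte offset. */
    static inline int64_t *addr_of_b_via_arith(struct Pair *p) {
        return (int64_t *)((char *)p + offsetof(struct Pair, b));
    }
    ```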

    I wonder if you can take this further and entirely remove array and structure types from LLVM IR without losing anything of value. The only place I can see where you actually need them is a side effect of my personal least-favourite aspect of LLVM IR: it doesn’t include the C type system and platform ABIs are defined in terms of the C type system, so instead there is an informal contract between the front- and back-ends on how certain C constructs are lowered. For example, on some ABIs struct { float x; float y;} is passed and returned differently to _Complex float and, because _Complex is not an abstraction at the IR layer, this difference is encoded in some ad-hoc ways, often involving struct types[1]. Most of the first-class aggregate work in LLVM is messy and generalising arrays, structs, and vectors with a single insert/extract pair of operations for manipulating them in registers would probably be cleaner.

    No one has yet proposed a good approach to the calling-convention lowering problem. I think the best idea would be to add explicit register arguments. If I want C ABI compatibility, it’s far easier to read the platform ABI doc and specify the registers it lists than to compile things with clang, see what it generates, and try to replicate that. It’s not clear how this would work with stack slots though (especially for mutable stack slots).

    [1] Some of these are insane. For example, on x86-32 systems that return small structs in registers, the return type is i64 and you need to insert shifts and masks to insert and extract the values.
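
    As a sketch of what that lowering forces front ends to emit (the helper names are made up; the layout assumes the first field lands in the low 32 bits, as on little-endian x86-32):

    ```c
    #include <stdint.h>

    /* The callee packs struct { int32_t x; int32_t y; } into the i64 return
       value; the caller undoes it with shifts and masks. */
    static inline uint64_t pack_small_struct(int32_t x, int32_t y) {
        return (uint64_t)(uint32_t)x | ((uint64_t)(uint32_t)y << 32);
    }

    static inline int32_t unpack_x(uint64_t packed) {
        return (int32_t)(uint32_t)(packed & 0xffffffffu);
    }

    static inline int32_t unpack_y(uint64_t packed) {
        return (int32_t)(uint32_t)(packed >> 32);
    }
    ```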

    1. 4

      The root cause of a lot of these is that Chris originally couldn’t decide how much of the C abstract machine should end up in LLVM IR.

      I have a lot of sympathy for this kind of problem. You can’t tell what the ideal design would have been until after you’ve not only written the program but also used it for years and built an entire ecosystem.

      fixed in MLIR

      Is MLIR a CHERI specific thing?

      Instructions with constant operands make more sense than a separate construct for constants because the system needs to know how to fold those away anyway?

      I wonder if you can take this further and entirely remove array and structure types from LLVM IR without losing anything of value

      That would be aesthetically neat, you’d end up with a language kinda like BCPL but scarier. ;)

      1. 8

        I have a lot of sympathy for this kind of problem. You can’t tell what the ideal design would have been until after you’ve not only written the program but also used it for years and built an entire ecosystem.

        I don’t mean this as a criticism (well, not much of one: WHIRL and Phoenix both solved a few of these problems better, and Chris could have taken more from both); it’s not obvious that some of these design choices are wrong until after the third or fourth front end and back end.

        Is MLIR a CHERI specific thing?

        No, MLIR is the new IR in the LLVM project that’s intended to be able to represent LLVM IR and higher- (and lower-)level constructs. It’s mostly used by machine learning things (it has really good support for abstract operations on matrices) but it’s intended to be a flexible representation.

        Instructions with constant operands make more sense than a separate construct for constants because the system needs to know how to fold those away anyway?

        Yup, and you end up with a lot of complexity in places like InstCombine because they have to handle instructions where one of the operands is a constant and the other is a constant expression. With a single representation, you’d just fold these together.
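
        For illustration, the same address computation can appear in today’s IR either folded into an operand as a constant expression or as a standalone instruction, and passes have to recognise both spellings (the global and functions here are made up):

        ```llvm
        @buf = global [4 x i32] zeroinitializer

        ; Spelling 1: the GEP is a constant expression inside the store's operand.
        define void @f() {
          store i32 1, i32* getelementptr inbounds ([4 x i32], [4 x i32]* @buf, i64 0, i64 2)
          ret void
        }

        ; Spelling 2: the same GEP as an instruction.
        define void @g() {
          %p = getelementptr inbounds [4 x i32], [4 x i32]* @buf, i64 0, i64 2
          store i32 1, i32* %p
          ret void
        }
        ```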

        WHIRL actually has a really nice model for this in its Hashed Static Single Assignment (HSSA) form. The patent on HSSA only expired a couple of years ago, though, which was far too late to make it a core part of LLVM IR. In HSSA, you effectively get constant folding and common subexpression elimination for free.

        That would be aesthetically neat, you’d end up with a language kinda like BCPL but scarier. ;)

        Not quite. You’d still have a separate pointer type and separate integer and floating-point types. You’d lower something like this:

        int someInts[40];
        someInts[1] = 12;
        foo(someInts[1]);
        

        To something like this:

        %someInts = alloca 160, align 4
        %0 = getelementptr inbounds %someInts, 4
        store i32 12, %0
        %1 = load i32, %0
        call foo(%1)
        

        This captures all of the type information that you need for pointer provenance and alias information, but none of the extraneous info. You know that %0 and %someInts point to the same object because both are type ptr and one is derived from an in-bounds GEP of the other. Because the load and store are to the same pointer, you know that they alias and so it’s safe to constant-propagate the i32 12 from the store to the use of the load and eliminate the load.

        A lot of front ends actually generate LLVM IR that looks a lot like this already. They use i8* for all pointers and address calculation, and specify types only on loads and stores. Once the opaque pointer work is finished, they’ll use ptr instead of i8* and their GEPs will be of i8, so that they’re just byte arithmetic.
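
        In the current opaque-pointer syntax, the earlier example would come out roughly like this (a sketch; note that the GEP is of i8, i.e. plain byte arithmetic):

        ```llvm
        %someInts = alloca i8, i64 160, align 4   ; 40 four-byte ints, no array type
        %elt = getelementptr inbounds i8, ptr %someInts, i64 4
        store i32 12, ptr %elt
        %v = load i32, ptr %elt
        call void @foo(i32 %v)
        ```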

        1. 2

          I don’t mean this as a criticism (well, not much of one: WHIRL and Phoenix both solved a few of these problems better, and Chris could have taken more from both); it’s not obvious that some of these design choices are wrong until after the third or fourth front end and back end.

          For what it’s worth, your writing didn’t imply that you did. I wasn’t disagreeing with you, just pulling a “:/” face.

          Not quite. You’d still have a separate pointer type and separate integer and…

          I definitely misunderstood you because I thought you were describing a scheme where GEP wouldn’t exist either, just pointer arithmetic.

          1. 2

            Not quite. You’d still have a separate pointer type and separate integer and…

            I definitely misunderstood you because I thought you were describing a scheme where GEP wouldn’t exist either, just pointer arithmetic.

            GEP really is just pointer arithmetic. Nothing actually cares about the type information that the GEP uses. It’s completely valid to replace one GEP with another that computes the same offset (and some front ends do). I think there was a hope early on that the LLVM type info would be useful for alias analysis, but it isn’t actually used for that now. There’s separate metadata that the front end can emit to provide extra aliasing information (either for TBAA or restrict-like semantics). The GEP analysis logic just checks that two GEPs have the same base but different offsets to determine aliasing relationships.
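
            For example, both of these functions compute the same address, and it’s legal to replace one form with the other (the struct type is made up; field 1 of { i32, i32 } sits at byte offset 4):

            ```llvm
            %struct.Pair = type { i32, i32 }

            define i32* @field_y(%struct.Pair* %p) {
              ; Struct-typed GEP: field index 1, i.e. byte offset 4.
              %a = getelementptr inbounds %struct.Pair, %struct.Pair* %p, i64 0, i32 1
              ret i32* %a
            }

            define i32* @field_y_bytes(%struct.Pair* %p) {
              ; The same offset written as raw byte arithmetic on i8*.
              %b0 = bitcast %struct.Pair* %p to i8*
              %b1 = getelementptr inbounds i8, i8* %b0, i64 4
              %b2 = bitcast i8* %b1 to i32*
              ret i32* %b2
            }
            ```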

            In BCPL, the only type that you had was a word: a pointer was just a word that you used as an address. The article explains why it’s a good idea to have a separate i64 and ptr type (for 64-bit systems). We recently updated the opaque pointers doc to contain very similar text.

      2. 2

        And this is a really great comment, that’s why I come here.