1. 47
  1. 30

    I think I enjoyed reading that more than anything technical that I’ve read for a long time. First, it tells a lot of lies but they’re useful lies and it points them out as it goes. For example:

    (I cannot emphasize enough how shorthanded all of this is, the devil is extremely in the details and formally specifying these things is this subject of untold numbers of PhD theses. I am not trying to write a PhD thesis right now. Unless you literally work on a C/C++ Standard Committee or are named Ralf Jung I will not be accepting your Umm Actually’s on these definitions and terms.)

    I objected in the article a couple of days ago to the introduction of pointers as ‘variables storing an address’. This article goes into far more detail than I would about why this is a bad mental model but it also doesn’t give so much detail that you can’t possibly follow the point. I’ve worked with quite a few folks defining provenance models so I probably am on the list of people who can give ‘Umm Actually’s’ accurately but I don’t want to because everything that she said is a useful approximation and is sufficiently close that the difference would require a much longer essay and convey nothing of value to most readers.

    Secondly, this was a very coherent description of CHERI from someone who, as far as I’m aware, didn’t work on the project. I led the language / compiler strand of the CHERI research from the point where we had a working assembler until the point where we could compile large C/C++ codebases and sandbox native code with JNI on CHERI and her proposal for changing Rust is exactly what I’d advocate (from the perspective of knowing far more about CHERI and far less about Rust than Aria, so the fact that we both think it’s the right approach gives me a lot more confidence that I’m not talking nonsense).

    CHERI C benefitted a lot from the fact that C was designed to be able to support segmented architectures. Given this comment in the article, maybe a bit more detail would help:

    (A lot of the issues we’ll see with integrating Rust with CHERI will actually look a lot like issues with segmented architectures, but I have never used those so I will just be vaguely gesturing at them and handwaving. Just keep in mind that whenever I refer to CHERI, a similar argument also probably applies to segmenting. So if you care about segmented architectures, you might care about CHERI too!)

    Capability architectures and segmented architectures have a lot of overlap. A lot of the historical capability architectures were also segmented architectures but not all segmented architectures are capability architectures. Often in a segmented architecture, object pointers are identified by a segment ID and interior pointers are identified by a segment ID and an offset. This model is also fairly common in managed-language VMs because it makes GC a bit easier if you never need to map from an interior pointer to an object. This doesn’t require provenance for pointers because you can take an arbitrary integer and convert it into a segment ID (segment IDs are just integers).

    CHERI benefitted a lot from the fact that C was designed to be able to support these architectures. In C, intptr_t is required to be able to hold a pointer (actually, it’s not required to exist, but if it does then it’s required to be able to hold a pointer). size_t is required to hold the size of any object. ptrdiff_t is required to hold the distance between any two pointers into the same object. This is why you can’t actually have a 64-bit address space in a conformant C implementation: if you have a 2^64 byte object then a signed 64-bit integer can’t hold the displacement between its two ends, and signed overflow is UB in C. Practically, this doesn’t matter because most ‘64-bit’ architectures only have 48-57 bits of address space and so size_t is actually a 48-bit unsigned integer that is zero-extended to 64 bits and ptrdiff_t is a 49-bit signed integer that is sign-extended to 64 bits.
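
    A minimal sketch of those three guarantees, assuming a hosted C11 implementation that provides intptr_t and where size_t and ptrdiff_t have the same width (true on mainstream targets, not promised by the standard). The helper names are made up for illustration.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Holds on mainstream targets; on CHERI, intptr_t is capability-sized. */
    static_assert(sizeof(intptr_t) >= sizeof(void *),
                  "intptr_t should be able to hold a pointer");

    /* A void* converted to intptr_t and back compares equal to the original. */
    static int roundtrips_through_intptr(void *p)
    {
        intptr_t bits = (intptr_t)p;
        return (void *)bits == p;
    }

    /* ptrdiff_t holds the distance (in elements) between two pointers
     * into the same array. */
    static ptrdiff_t element_distance(const int *first, const int *last)
    {
        return last - first;
    }

    /* An object whose size exceeds PTRDIFF_MAX has end-to-end displacements
     * that ptrdiff_t cannot represent. Assumes size_t and ptrdiff_t have
     * the same width. */
    static int size_fits_in_ptrdiff(size_t n)
    {
        return n <= (size_t)PTRDIFF_MAX;
    }
    ```

    With an `int a[8]`, `element_distance(&a[1], &a[6])` is 5 and `size_fits_in_ptrdiff(sizeof a)` is true, while `size_fits_in_ptrdiff(SIZE_MAX)` is false under the stated assumption.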

    CHERI’s vaddr_t is added because there is no guarantee that size_t can hold an address. On a segmented architecture, you may have a 16-bit segment ID pointing to a segment descriptor containing a 24-bit address and a 16-bit length. size_t would be 16 bits but void* and intptr_t would both be 32 bits (16-bit segment ID + 16-bit offset). Fixing this in Rust would be more invasive but is probably not worth the effort: architectures like this have largely died out because they were so difficult to support well in C (PL/M, the first low-level language that I learned, had much richer pointer support and made it easier to support this kind of thing but C won). Given the amount of legacy C/C++ code in the world, I don’t think it’s a serious limitation for a language to support only hardware that can be an efficient C target.
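
    To make the size mismatch concrete, here is a rough software model of that hypothetical segmented machine. It is not any real architecture: the type names, the packing of the segment ID into the high 16 bits, and the -1 error return are all assumptions made for the sketch.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Segment descriptor: 24-bit linear base address plus 16-bit length. */
    typedef struct {
        uint32_t base;   /* only the low 24 bits are meaningful */
        uint16_t length; /* segment size in bytes */
    } seg_descriptor;

    typedef uint16_t seg_size; /* what size_t would be: one segment's extent */
    typedef uint32_t seg_ptr;  /* what void* / intptr_t would be */

    /* The pointer is twice as wide as the size type. */
    static_assert(sizeof(seg_ptr) == 2 * sizeof(seg_size),
                  "pointer = 16-bit segment ID + 16-bit offset");

    static seg_ptr seg_ptr_make(uint16_t segment, uint16_t offset)
    {
        return ((uint32_t)segment << 16) | offset;
    }

    static uint16_t seg_ptr_segment(seg_ptr p) { return (uint16_t)(p >> 16); }
    static uint16_t seg_ptr_offset(seg_ptr p)  { return (uint16_t)(p & 0xFFFFu); }

    /* Translate to a linear address, or return -1 if the segment ID is
     * unknown or the offset is outside the segment. */
    static int32_t seg_translate(const seg_descriptor *table, size_t count,
                                 seg_ptr p)
    {
        uint16_t segment = seg_ptr_segment(p);
        uint16_t offset = seg_ptr_offset(p);
        if (segment >= count || offset >= table[segment].length)
            return -1;
        return (int32_t)(table[segment].base + offset);
    }
    ```

    Note that a seg_size value can describe any offset within a segment but cannot name a location on its own, which is the gap that a separate address type fills.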

    In CHERI, Graeme Barnes at Arm proposed that we should make addresses signed as well. This has a few nice properties. It means that you don’t need to special case the capability that you get at boot that spans the whole address space (a 48-bit address space is two 47-bit address spaces, one at the top and one at the bottom of the address range). Purely aesthetically, it means that the kernel is in the bottom half of the address space (negative addresses) and userspace is in the top (positive addresses), with null in the middle, and so all of the diagrams that show the kernel under userspace are correct. Note that, in CHERI, nullptr is an untagged capability, and so you can have an object that spans the zero address and no valid pointer inside it will compare equal to null.
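
    As a sketch of the signed view, assuming a 48-bit address space: sign-extending the low 48 bits turns the upper half of the range into negative addresses and leaves the lower half positive. The function names and the ADDR_BITS constant are illustrative, not taken from any CHERI header.

    ```c
    #include <assert.h>
    #include <stdint.h>

    #define ADDR_BITS 48 /* assumed address width for this sketch */

    /* Interpret the low ADDR_BITS bits of raw as a signed address. */
    static int64_t sign_extend_address(uint64_t raw)
    {
        const uint64_t mask = (UINT64_C(1) << ADDR_BITS) - 1;
        const uint64_t sign = UINT64_C(1) << (ADDR_BITS - 1);
        uint64_t low = raw & mask;
        /* Flipping the sign bit and subtracting it back sign-extends
         * without relying on implementation-defined shifts. */
        int64_t value = (int64_t)(low ^ sign) - (int64_t)sign;
        assert(value >= -(int64_t)sign && value < (int64_t)sign);
        return value;
    }

    /* In the signed view the kernel half is simply the negative addresses. */
    static int is_kernel_address(int64_t addr)
    {
        return addr < 0;
    }
    ```

    The whole address space is then the single contiguous range from -2^47 to 2^47 - 1, with zero in the middle.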

    1. 3

      we should make addresses signed as well.

      But then you can’t take the square root of a pointer!

      Oh wait, you can, since C99 has a complex type. And I guess only the kernel would run into this…

    2. 2

      with_addr (…) It lets us reconstitute provenance for the purposes of memory models / alias analysis

      This makes my head hurt a bit, but… how does this work with custom allocators? For example what’s the provenance of a pointer to an arena-allocated object? Let’s say an arena allocated object was passed to some FFI and another function received that address back from FFI.

      What is it supposed to do? arena_base.ptr.with_addr(the_object) ? Does that mess up any chance of optimising access to it?

      Edit: I’m getting this wrong, aren’t I? When receiving a pointer from FFI, I’d get the full thing, not just the address…

      1. 1

        Yeah, FFI should be using full pointers that are in the form of (provenance, addr). The bare integer addresses are meant to exist only temporarily while you do pointer arithmetic.

        The arena case is probably even trickier, because I’d expect it to use split_at_mut() when “allocating” a new object, and this gives two independent un-aliased pointers. These would be two independent objects from the perspective of aliasing analysis. I don’t know if they’d be separate from CHERI’s perspective.

        1. 2

          These would be two independent objects from perspective of aliasing analysis. I don’t know if they’d be separate from CHERI perspective.

          CHERI doesn’t provide a mechanism for combining two capabilities. If you take a capability that covers a large range, you can subdivide it into two sub-ranges but if you want to be able to access the entire range then you must do so from a capability derived from the original, not from either of the sub-ranges.
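
          A toy software model of that monotonicity, purely for illustration: it is not the CHERI instruction set or any real CHERI C API, and the model_cap type and function names are invented. Bounds can only shrink, and the model deliberately has no merge operation.

          ```c
          #include <assert.h>
          #include <stdbool.h>
          #include <stdint.h>

          /* Toy capability: a range plus a validity tag. */
          typedef struct {
              uint64_t base;
              uint64_t length;
              bool tag;
          } model_cap;

          /* Derive a narrower capability. The result is untagged (invalid)
           * if the requested range is not entirely inside the parent. */
          static model_cap model_cap_restrict(model_cap parent, uint64_t base,
                                              uint64_t length)
          {
              model_cap child = { base, length, false };
              if (!parent.tag || base < parent.base)
                  return child;
              uint64_t skip = base - parent.base;
              if (skip > parent.length || length > parent.length - skip)
                  return child;
              assert(child.base >= parent.base); /* never grows past the parent */
              child.tag = true;
              return child;
          }

          /* Check whether an access of size bytes at addr is permitted. */
          static bool model_cap_allows(model_cap cap, uint64_t addr, uint64_t size)
          {
              if (!cap.tag || addr < cap.base)
                  return false;
              uint64_t skip = addr - cap.base;
              return skip <= cap.length && size <= cap.length - skip;
          }
          ```

          Splitting an arena capability into left and right halves yields two valid capabilities, but asking either half for the full arena range returns an untagged result; only the original arena capability covers both.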

          1. 1

            (I just learned about CHERI from this article and reading bits of this introduction (PDF) and the CHERI Rust dissertation (PDF).) I think you can always derive a valid pointer covering a smaller range from a valid larger one, so when the split is done (and the parent slice no longer exists) those slices could be constructed with pointers such that no aliasing between them could be possible. The dissertation mentions this in section 5.4.3.