1. 53
    1. 33

      I don’t have any single C pointer in kernel, DRM, Panfrost, Wayland, Qt, KDE, entire user space graphical stack.

      To clarify: every C pointer in these components is lowered to a CHERI capability and so is unforgeable and has hardware-enforced bounds checks. The userspace ones also support revocation and so are immune to heap temporal-safety errors such as use-after-free.

      The code is still all C. If you have a bounds error in your kernel code, then you may still panic the kernel and crash, but you won’t provide an arbitrary-code execution vulnerability for an attacker to use.
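
      To make that concrete, here is a minimal sketch. It is plain C with nothing CHERI-specific in the source; the trap behaviour described in the comments is how a pure-capability (purecap) build would treat it, and the exact signal name is from memory, so treat it as illustrative:

      ```c
      #include <stdlib.h>
      #include <string.h>

      int main(void) {
          char *buf = malloc(16);

          /* On a conventional machine this out-of-bounds write silently
           * corrupts whatever happens to follow the allocation. Compiled as
           * a pure-capability binary, the pointer returned by malloc is a
           * capability whose bounds cover (approximately) only the
           * allocation, so the store past the end raises a capability fault
           * (delivered as a signal, e.g. SIGPROT on CheriBSD) instead of
           * becoming exploitable memory corruption. */
          memset(buf, 'A', 64);   /* overflows the 16-byte buffer */

          free(buf);
          return 0;
      }
      ```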

      Really looking forward to this version of CheriBSD. My Morello box is connected to my 40” monitor at work and being able to switch to using it locally will be very fun.

    2. 17

      This is a super big deal and I’m honestly amazed that it’s possible. Back when I first heard about CHERI (via Jonathan Woodruff’s really cool technical report, I think?) I thought well yeah this is cool but going from a good idea to running real-world applications is a world of hurt. (Also I probably thought eww, MIPS, but I digress).

      I’m sure this took a lot of effort but it also looks like CheriBSD emerged with a working implementation from the world of hurt. That’s an important validation for a project like this. The nineties and early 00s are littered with good ideas that turned out to be too convoluted, or not useful enough for hosting or developing useful applications, even when handled by good developers.

      I may not be entirely objective in my enthusiasm because I also secretly and uncomfortably harbor the idea that “accept we might crash but ensure it’s safe and recoverable” is not just a more attainable, but also a more practical approach to security in general and to memory safety in particular, and that it’s a model that ultimately results in better programs by pretty much any metric except single instance uptime. Shh!

      1. 18

        I thought well yeah this is cool but going from a good idea to running real-world applications is a world of hurt.

        It was. One of the biggest changes that I made to CHERI was to add an address field in the capability. The original version had a base and a bounds and required you to specify an offset in the load / store instructions. I ported clang / LLVM to target this model, but we ended up needing to move the base up to carry the current address and that meant that pointer subtraction didn’t work. Fine for things like iterating over an array, but the biggest problem was the container-of idiom (originally from 4BSD, copied everywhere), where structures are embedded in others to give intrusive data structures and offsetof is used to cast from the inner structure to the outer.
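
        For readers who haven’t met the idiom, a minimal sketch (the structure and function names are illustrative, not taken from any particular codebase):

        ```c
        #include <stddef.h>

        /* A generic list node embedded in a larger structure (an intrusive
         * data structure). */
        struct list_node {
            struct list_node *next;
        };

        struct widget {
            int id;
            struct list_node link;   /* embedded member */
        };

        /* container-of: recover a pointer to the outer structure from a
         * pointer to the embedded member by subtracting the member's offset.
         * The subtraction moves the pointer below the start of the inner
         * object, which is exactly the arithmetic that broke when a
         * capability carried only a base and bounds. */
        #define container_of(ptr, type, member) \
            ((type *)((char *)(ptr) - offsetof(type, member)))

        struct widget *widget_from_link(struct list_node *n) {
            return container_of(n, struct widget, link);
        }
        ```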

        To fix that, I extended the capability with a cursor that showed where inside the object you were. For a while, I hoped that we’d be able to support moving GC in C and so exposed this as an offset to software, such that normal C code was always safe to preempt and update pointers with new base addresses. Alex showed that this really, really wasn’t going to work and that compat was a lot better if you exposed the address directly.
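
        To show how the different views relate, a small sketch using the CHERI C API; the header and accessor names are as I recall them from cheriintrin.h and may differ between toolchain versions:

        ```c
        #include <cheriintrin.h>   /* CHERI C/C++ API, assumed available under CHERI Clang */
        #include <stdio.h>

        int main(void) {
            int arr[8];
            int *p = &arr[3];

            /* A capability carries bounds plus a current address; the
             * "offset" that was originally exposed to software is simply
             * address - base. */
            printf("base    = %#lx\n", (unsigned long)cheri_base_get(p));
            printf("address = %#lx\n", (unsigned long)cheri_address_get(p));
            printf("offset  = %#lx\n", (unsigned long)cheri_offset_get(p));
            printf("length  = %zu\n", (size_t)cheri_length_get(p));
            return 0;
        }
        ```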

        Jon did some amazing work, based on the Low-fat pointer encoding, to trim capabilities down from 256 bits to 128, including a full base, top, 64-bit address, permissions, and an otype for sealing.

        (Also I probably thought eww, MIPS, but I digress).

        MIPS was chosen because it was the only 64-bit ISA that had a mature software stack and was 20 years old (and so a non-patented implementation was possible). I have no idea whether Jon’s prototype managed to avoid patents. I had some improvements to the branch predictor that we never merged because I accidentally learned that Oracle owned a patent on the techniques that I’d used.

        I’m sure this took a lot of effort but it also looks like CheriBSD emerged with a working implementation from the world of hurt.

        Ruslan and others (especially Alfredo) have done some amazing work to get the kernel working as a pure-capability binary. Brooks did a lot of the heavy lifting before then to get the system call ABI to be memory safe but the kernel internally does a lot of things with virtual addresses and so getting that to be a fully CHERI-aware system has been a big effort.

        The nineties and early 00s are littered with good ideas that turned out to be too convoluted, or not useful enough for hosting or developing useful applications, even when handled by good developers.

        It’s possible that CHERI will be in this category, but I think the DSbD work is providing a lot of interesting demonstrators. The Linux port is still a few years behind the FreeBSD one, unfortunately, and the (lack of sane) abstractions in Linux mean that it’s probably an order of magnitude more engineering work.

        1. 4

          MIPS was chosen because it was the only 64-bit ISA that had a mature software stack and was 20 years old (and so a non-patented implementation was possible). I have no idea whether Jon’s prototype managed to avoid patents. I had some improvements to the branch predictor that we never merged because I accidentally learned that Oracle owned a patent on the techniques that I’d used.

          Oh, yeah, I don’t think there would’ve been a better choice at the time. I’m not sure there’s a better choice today, at least in the early research stage. RISC-V, maybe, and I remember reading something about RISC-V and CHERI support recently. I’m mostly frustrated that we don’t have a better, unencumbered architecture with stable tooling, not that MIPS is it (or was – things have changed a bit, I imagine). In a world where the 8051 instruction set was a thing, it could’ve been much worse :-).

          It’s possible that CHERI will be in this category, but I think the DSbD work is providing a lot of interesting demonstrators.

          Well, only time can tell, but I think CHERI has made it really far, and IMHO in good part because of sensible architecture and implementation choices. Admittedly, fine-grained memory protection wasn’t that hot a topic in the ’90s and early ’00s IIRC, so maybe there just wasn’t enough interest in it to produce high quality research for a while. Barring some cool ideas like M-Machine and SPIN, a lot of hardware (and software) I remember reading about in this space was a little, erm, esoteric, I guess?

          1. 8

            Oh, yeah, I don’t think there would’ve been a better choice at the time. I’m not sure there’s a better choice today, at least in the early research stage. RISC-V, maybe, and I remember reading something about RISC-V and CHERI support recently. I’m mostly frustrated that we don’t have a better, unencumbered architecture with stable tooling, not that MIPS is it (or was – things have changed a bit, I imagine). In a world where the 8051 instruction set was a thing, it could’ve been much worse :-).

            RISC-V is definitely a better choice now (in part because MIPS support has become a lot worse: CheriBSD was the only surviving user of MIPS64 in FreeBSD and so that’s now going away upstream). We moved to RISC-V as the experimental platform over the last few years (and, of course, Morello is AArch64).

            Well, only time can tell, but I think CHERI has made it really far, and IMHO in good part because of sensible architecture and implementation choices. Admittedly, fine-grained memory protection wasn’t that hot a topic in the ’90s and early ’00s IIRC, so maybe there just wasn’t enough interest in it to produce high quality research for a while. Barring some cool ideas like M-Machine and SPIN, a lot of hardware (and software) I remember reading about in this space was a little, erm, esoteric, I guess?

            We adapted ideas from a lot of places, from the Burroughs Large Systems architecture onwards. The M-Machine’s limitation of power-of-two-sized objects was a showstopper for a lot of real-world things (at least 25% memory overhead, close to 50% in some cases). There have also been commercial things such as Intel’s MPX (Intel: doing in hardware what can be done more efficiently in software, since 1981) and Arm’s MTE (which builds on top of the tag controller design that we created for CHERI). MPX came out just before our first CHERI paper (ISCA 2014) and the timing was fantastic because the overheads for CHERI were vastly lower than MPX’s and Intel was willing to pretend MPX was practical.

            The ’90s was not a great time for this because languages like Java were taking off and CPU speeds were increasing. Java code on a mid-’90s CPU ran faster than C code on an early-’90s CPU, and so it looked as if the long-term path would be to run everything in managed languages and rely on the fact that computers are really fast. The end of Dennard scaling killed that dream, and the subsequent multicore explosion took out a lot of SFI dreams (bounds checks without TOCTOU errors are really hard with actively malicious code in a system with multiple parallel writers to shared memory).

            It’s worth noting that CHERI started (two years before I joined the project) as an in-address-space compartmentalisation system, not a C memory safety thing. Robert had a vague idea that we could get some C memory safety from it, but that was a secondary goal. The starting point was that some things are very hard to do with Capsicum and other OS-based sandboxing technologies; wouldn’t it be nice if we could have some hardware support for them? The idea that we could use it for protecting every pointer in the system was something that didn’t really develop until around 2015 (I remember thinking that 9 authors on a paper was a lot back then; now that title page looks very short in comparison to the more recent CHERI papers!). Arm got involved very soon after that (they pushed hard for the compressed encoding; they weren’t sure if they could sell 128-bit pointers, but they knew that they couldn’t sell 256-bit ones, especially not to the hardware partners who would need 256-bit data paths in performance-critical components).

            1. 5

              MPX came out just before our first CHERI paper (ISCA 2014) and was fantastic timing because the overheads for CHERI were vastly lower than MPX and Intel was willing to pretend MPX was practical.

              Ooh, I remember that one. I was no longer anywhere near academia by that time but I have a colleague who narrowly avoided doing a PhD on some MPX-related software at the time, based on nothing but a gut feeling, and chose some boring IoT topic instead. Everyone ridiculed him, then proceeded to kick themselves for 3+ years as they gradually realised how awful MPX turned out to be.

              It’s worth noting that CHERI started (two years before I joined the project) as an in-address-space compartmentalisation system, not a C memory safety thing. Robert had a vague idea that we could get some C memory safety from it but that was a secondary goal.

              I’m a bit out of my depth here (the little academic research I did was barely about computers per se) but I’m not really surprised that this is the case. Good ISA- and architecture-level ideas tend to spawn many practical solutions to closely related but often orthogonal problems. The M-Machine is a good (though probably not the best) example. Folks at my university derided it as an esoteric design that never led to practical implementations good for anything. But it was practical enough (despite having modest aims in this regard in the first place), and wide-reaching and ambitious enough, that it moved the progress needle in several areas, from thread-level parallelism and scheduling to memory architecture.

              (Edit: honestly, my biggest takeaway from this, and the main reason why I’m trying to follow CHERI to whatever extent I can, is that, to paraphrase Pike, systems research is no longer irrelevant. I have the feeling that the industry is slowly starting to extricate itself from that awkward corner it had painted itself into back when it was in denial about MOSFET scaling, when so many research projects in this area inherently got filed under “yeah but why do we need this?” and “oh, that, well, the compiler will take care of it”).

            2. 2

              Beginner question on this stuff. I really like Capsicum because it pushes people to change the programming model for their software, to reduce ambient privilege. My intuition is that this impact on the code design is more important than the direct security benefits of kernel-supported capabilities in file descriptors. (If you design your software with Capsicum in mind, you will get a result that is also easier to adapt to different security mechanisms, for example multiple processes or mini-hypervisors with RPC, as IIRC Chromium does on some OSes.) Of course, getting these benefits for existing software requires refactoring/recompartmentalizing them, which is rarely done in practice.

              As someone unfamiliar with CHERI, my impression is that while you get the same sort of “rethink the design” benefits in the lower layers of the system (“how do you implement a CHERI-nice memory allocator?”), these sorts of porting marathons where you get a lot of code (here KDE) to run on this low-level substrate do not actually give you that much. Granted, several low-level attacks (a majority of C exploitation techniques) are impossible now, but you still have the same big monolithic design in the software stack that lets you attack things by abusing intra-process bugs.

              So: does making KDE run on CHERI bring actual design or security benefits to KDE, including for non-CHERI users, or does it rather mean that the KDE codebase ended up not doing too many low-level hacks and that the lower building blocks are now CHERI-compatible?

              1. 3

                As someone unfamiliar with CHERI, my impression is that while you get the same sort of “rethink the design” benefits in the lower layers of the system

                Some, sure, but I think most of the sort of thinking that Capsicum makes you do for the global namespace is done for most programs in high-level languages (a category in which, today, I am including C): you allocate objects and you store data in objects. You think about pointers as references that each give you access to a specific object (possibly a field in the object). The problem is that all of that intentionality is lost during the compilation process and any integer value in the wrong place can end up giving you access to an arbitrary object.
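
                As a hedged illustration of that lost intentionality (plain C; the behaviour described in the comment is what a pure-capability build would do):

                ```c
                #include <stdint.h>

                int read_via_integer(uint64_t guessed_address) {
                    /* On a conventional machine any integer can be turned back
                     * into a usable pointer, so a corrupted or attacker-controlled
                     * value can reach any mapped object. In a pure-capability
                     * build, a pointer materialised from a plain integer was never
                     * derived from a valid capability and so carries no tag; the
                     * load below therefore faults rather than leaking whatever
                     * lives at that address. (CHERI compilers typically warn about
                     * this kind of cast.) */
                    int *forged = (int *)guessed_address;
                    return *forged;
                }
                ```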

                So: does making KDE run on CHERI bring actual design or security benefits to KDE, including for non-CHERI users, or does it rather mean that the KDE codebase ended up not doing too many low-level hacks and that the lower building blocks are now CHERI-compatible?

                We’ve found it very easy to upstream CHERI fixes to most projects because they are almost always cleaner code than was there originally and so upstreams are happy to take them.

                Note that some things are a bit harder. Kernels tend to think about addresses in some places and pointers in others. Even there, being more explicit about which is which is usually a good thing, though.

                1. 1

                  I think most of the sort of thinking that Capsicum makes you do for the global namespace is done for most programs in high-level languages

                  With Capsicum, if I understand correctly, you can give up on filesystem access (except for specific descriptors passed in from the outside) or generally run with much reduced access to OS-provided resources, with no way to regain them from a portion of code that is not explicitly given access to them. While it’s possible to design APIs in high-level languages that make the same things possible, I believe that the vast majority of projects instead have much more ambient capabilities, for example I would suppose that all KDE software can access the filesystem through KDE/Qt APIs (Singleton objects or what not… in addition to the ability to just call POSIX stuff directly). So in practice, in terms of “exploitable code has full filesystem access”, Capsicum can bring a lot to KDE components – not only by reducing the surface for malicious exploitation, but also by enforcing software design decisions through runtime behavior rather than just documentation.

                  1. 1

                    With Capsicum, if I understand correctly, you can give up on filesystem access

                    More accurately, you give up on access to any global namespace. You can still access the filesystem, the network, or any other external resource, but only via a capability that is explicitly passed to you. You can’t just take a name and use it to materialise a token (file descriptor) that grants you the right to access the object. Similarly, with CHERI, you don’t give up the ability to access memory, you give up the ability to take an integer that identifies a virtual address and transform it into something that can access that address.
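
                    A minimal sketch of that model on FreeBSD (cap_enter and openat are the real interfaces; the directory path and file name are made up):

                    ```c
                    #include <sys/capsicum.h>
                    #include <fcntl.h>
                    #include <unistd.h>
                    #include <err.h>

                    int main(void) {
                        /* Acquire a directory descriptor before entering capability
                         * mode; it is the only filesystem authority this process
                         * will retain. */
                        int dirfd = open("/var/db/myapp", O_DIRECTORY);
                        if (dirfd < 0)
                            err(1, "open");

                        /* Enter capability mode: global namespaces (open(2) by path,
                         * connect(2) by address, and so on) are no longer reachable. */
                        if (cap_enter() != 0)
                            err(1, "cap_enter");

                        /* open("/etc/passwd", O_RDONLY) would now fail with ECAPMODE,
                         * but access relative to the pre-opened directory still works. */
                        int fd = openat(dirfd, "data.txt", O_RDONLY);
                        if (fd < 0)
                            err(1, "openat");

                        char buf[128];
                        (void)read(fd, buf, sizeof(buf));
                        close(fd);
                        close(dirfd);
                        return 0;
                    }
                    ```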

                    While it’s possible to design APIs in high-level languages that make the same things possible, I believe that the vast majority of projects instead have much more ambient capabilities, for example I would suppose that all KDE software can access the filesystem through KDE/Qt APIs (Singleton objects or what not… in addition to the ability to just call POSIX stuff directly).

                    Alex Richardson (who did a lot of the CHERI compiler work) did an undergraduate project extending some KDE bits to use Capsicum (I think it was him, he definitely did Okular, I can’t remember if he did the core KDE parts). His approach was to provide a string object that encapsulated a file descriptor as well as a path. When you invoked the KDE standard file dialog boxes, it would run a powerbox (a more privileged process that displayed the open / save UI) and would return a thing that looked like a string. You could then modify this using the normal path APIs and pass it to the file I/O APIs, where it would be transformed into openat with the embedded file descriptor.
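
                    Sketched very loosely, and with hypothetical names (this is just to illustrate the descriptor-plus-relative-path idea described above, not the actual API from that project):

                    ```c
                    #include <fcntl.h>

                    /* Hypothetical "string that is really a capability": a directory
                     * descriptor handed out by the powerbox plus a path relative to
                     * it, which the application can edit like any other path. */
                    struct capable_path {
                        int  dirfd;          /* descriptor returned by the powerbox */
                        char relpath[256];   /* application-editable relative path */
                    };

                    /* File-I/O wrapper: the application never names a global path;
                     * the open is always performed relative to the embedded
                     * descriptor, so it keeps working inside capability mode. */
                    int capable_open(const struct capable_path *cp, int flags) {
                        return openat(cp->dirfd, cp->relpath, flags);
                    }
                    ```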