
Hi everyone! Long ago, Lobsters used to run interviews with active users. I thought that was a great idea but noticed it hadn’t happened for some time. I thought about which lobster I’d most like to see interviewed and immediately thought of David Chisnall. Enjoy!

Introduce yourself, describe what you do for work and how long you’ve been at it.

Professionally, I am currently the Director of System Architecture at SCI Semiconductor. I’ve been at that for a couple of months. About the time I was thinking about leaving Microsoft, I was introduced to someone who was putting together a startup aiming to commercialise the CHERIoT platform (which I’d open sourced at Microsoft, but which Microsoft didn’t want to sell, just to use in other products). We got on well and so I joined his team to lead the evolution of the CHERIoT ISA and software stack. We aim to ship CHERIoT chips some time this year, at which point I may finally be able to buy IoT devices that I trust to connect to my network and the Internet, for the first time. The company is still starting up, but I expect to be hiring very soon.

That covers the very recent bits, but my career has not gone in a straight line at all. I went to Swansea University because I did very badly in my A-level exams (the ones you take at 18 in the UK) and so didn’t get into any of the universities that I’d applied to. Swansea was best known as the UK university closest to the sea. It was also home to the Swansea University Computer Society (yes, we know the acronym sucks), which popped up in the boot screen for Linux at the time because they’d written the original Linux TCP/IP stack. Folks like Alan Cox still hung around society events. I found that the balance between a very theoretical course (Computer Science in Swansea acknowledged that there were machines that approximated universal models of computation, but wasn’t really in favour of touching them) and a very hands-on computer society gave a usefully broad education.

I ended up staying there for a PhD, where I met Nicolas Roard (funded on the same grant), who introduced me to Objective-C and OpenStep. I had a Mac, but hadn’t really planned on doing anything with Cocoa (which implements the OpenStep specification) because I wanted to write cross-platform things. Nicolas was a GNUstep contributor and persuaded me that it was worth writing Mac software and improving GNUstep so that it had the features to run it.

Around this time, Apple was rapidly adding Objective-C features and FSF GCC was not keeping up. I ended up being persuaded to write a new Objective-C runtime and add support for it to clang and discovered that I really enjoyed language implementation. The runtime I wrote was explicitly designed to support multiple languages and that led to a much deeper rabbit hole.

In parallel with this, I’d been actively involved in Jabber / XMPP and written a couple of clients. Peter Saint-Andre, who was driving the effort, posted to the mailing list that he had a contract for a book about the protocol but no time to write it, and asked if anyone else was interested. When I replied, he put me in touch with his editor. It turned out that another publisher had just released a Jabber book and it had not sold well, so the publisher was happy to let the project die, but they gave me some work-for-hire on a Linux book that was behind schedule. This went well and led to some regular writing for InformIT. They also asked me if I knew anyone who would be qualified to write a book on the Xen internals. I jokingly said ‘I could do it if you gave me six months to study the source code’. Six months later they came back with ‘so, that Xen book you’re going to write…’ and I ended up writing it. This was very much active procrastination: I wrote the book while avoiding working on my PhD thesis and the thesis while avoiding the book. I subsequently wrote three more books for them.

I spent the next five years working freelance, writing some code and some more books (and a lot of articles) before I came to Cambridge to work on CHERI.

What is your work/computing environment like?

I run macOS locally and use a mix of native, FreeBSD, and Linux VMs (some managed by Docker and Podman). Most of the remote machines I use are FreeBSD, with some Linux. On the floor next to me there’s a Morello desktop (CheriBSD), which I mostly use remotely. The M2 Max is fast enough that I don’t need to do much remote work anymore.

What tools do you use to write software?

I keep finding myself going back to vim, not because it’s great, but because nothing else meets my requirements the same way. I want an editor that I can use with local and remote projects. VS Code has a remote extension but, because it works by running a headless instance of most of VS Code remotely, there’s a long tail of places where it doesn’t work, so I end up needing a local editor and a remote one and it’s easier if they’re the same.

Within vim, I use ALE for talking to clangd and other LSP servers (this lets me use a CHERI-aware clangd easily, so all of the language extensions are supported), and I’m trying the Copilot plugin. I’m still unconvinced by Copilot but, at worst, it’s only a small net productivity drop to use it, and it’s important to keep abreast of new developments in tooling.

I generally use clang for compiling, lldb for debugging. I test things with GCC if GCC supports the target and source language. Outside of Smalltalk, I’ve not found an IDE that I felt made me more productive.

How did you get started with CHERI?

I knew Robert Watson through FreeBSD. I’d ported libc++ to FreeBSD (which mostly involved implementing the POSIX 2008 xlocale APIs in libc) under contract with the FreeBSD Foundation and he’d been involved in organising that. At the time, I was still living in Swansea, which has a very low cost of living (it still does). This let me do a few days of paid work per month and spend the rest of the time on fun projects. His pitch was that, if I came to the University of Cambridge, I could work on the fun stuff full time and be paid for it.

I’d spent a lot of the previous years on Étoilé, which was a project to build a user-focused desktop environment that was built out of composable components with end-user programming as a key focus. We were inspired by the STEPS project at VPRI, which tried to build an entire system in under 20,000 lines of code. Our rule was simpler: we aimed to keep individual components to under 10 KLoC, which is small enough that a single person can understand it. This meant that we needed to be able to both use expressive languages and build expressive DSLs. We were starting from an Objective-C base, which gave us a nice model for late-bound components but brought along a lot of C baggage.

Unlike STEPS, I didn’t want to rewrite the world in high-level languages. I wanted to use things like libavcodec and libavformat as-is, but without bugs in them being able to destroy the invariants that higher-level software depended on. I’d tried building isolation mechanisms with the MMU and found it severely limiting. I’d also looked at Mondrian Memory Protection, but the table-based approach didn’t compose well with language-level abstractions. Early CHERI wasn’t the right thing either, but it was close enough that I felt I could evolve it into the right shape.

Most of my fingerprints in CHERI ISAs are with that goal in mind. I want to be able to compile existing C/C++ libraries with a CHERI compiler and use them directly, and safely, from higher-level languages. I’ve written a bit about this before:

https://www.linkedin.com/pulse/i-dont-care-memory-safety-david-chisnall

I want to be able to have documents embed scripting-language programs that can directly call large native libraries and still have strong guarantees that my system won’t be compromised. The key point is this observation:

Isolation is easy, (safe) sharing is hard.

It’s trivial to fully isolate two components. Separate cores, sandboxed processes, or WebAssembly sandboxes can give this kind of isolation, depending on the degree of isolation that you need. Most interesting things are built from communicating components and keeping things mostly isolated, but able to communicate safely, is much harder. For example, Rust says FFI is unsafe, but if you wanted it to be safe except that objects passed from Rust to C may contain arbitrary bit patterns after the call, that’s harder. You can do it with deep copies, but that’s a lot of overhead and very hard to do in the general case. You can do it with CHERI fairly easily, including richer things like deep immutability (in CHERIoT, we can also provide shallow and deep no-capture guarantees).
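
To make the last point concrete, here is a minimal sketch (mine, not from the interview) of handing a read-only view of a buffer to an untrusted library in CHERI C/C++. It assumes the cheriintrin.h helpers from a CHERI-aware Clang toolchain; untrusted_parse() is a hypothetical foreign-library function:

    // Sketch only: helper names are those documented for cheriintrin.h in the
    // CHERI LLVM toolchain; untrusted_parse() is made up for illustration.
    #include <cheriintrin.h>
    #include <stddef.h>

    void untrusted_parse(const char *buf, size_t len); // foreign library code

    void call_library(char *data, size_t len) {
      // Narrow the capability so it grants access to this buffer only.
      char *view = (char *)cheri_bounds_set(data, len);
      // Drop the store permission: the callee can read through the capability,
      // but a stray write traps instead of corrupting the caller's invariants.
      // (CHERIoT layers deep-immutability and no-capture guarantees on top of
      // the same mechanism.)
      view = (char *)cheri_perms_and(view, ~(size_t)CHERI_PERM_STORE);
      untrusted_parse(view, len);
    }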

What was the hardest technical problem you’ve ever had to address?

That’s difficult to answer. Any problem seems easy when you know how to solve it and it’s easy to forget how much effort it took. Teaching a C/C++ toolchain that pointers and integers were, in fact, not interchangeable types might be in that space.

The one with the most evil solution is probably related to C++ exception interoperability with Objective-C. There are three C++ runtimes in wide use: libsupc++ (GNU), libc++abi (LLVM), and libcxxrt (which I wrote and is shipped by FreeBSD, Sony PlayStations, OpenEnclave, and a bunch of other places). The story there is slightly annoying: I wrote libcxxrt for a company that agreed to open source it if other organisations would split the development cost. They approached the FreeBSD and NetBSD Foundations, who agreed to pay for some of it, along with Apple (who, at the time, were shipping libc++ but without a permissively licensed runtime). A few days before the public release, we sent Apple a polite heads up that we were going to be doing the release. They responded by creating the libc++abi project in LLVM, with a demangler that was much too large and generic for the runtime and none of the code that’s actually necessary.

History aside, all of these have very subtly different layouts of the structure that encapsulates an exception. This is not part of their public APIs (or ABIs) and has, in the past, changed between versions. We did have a load of compile-time detection for this in the GNUstep Objective-C runtime, but it proved somewhat fragile. Eventually, I rewrote it with some run-time detection that works by throwing a C++ exception through a stack frame with a custom personality function. The C++ exception is a custom type and contains a specific bit pattern as its value. We just poke at the memory for the object until we find the pointer to the typeid object for our type and until we find the bit pattern of the thrown object. This lets us find the two fields that we care about.
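
To give a flavour of the trick, here is a much-simplified sketch (mine, not the GNUstep runtime’s actual code, which also uses a custom personality function and additionally locates the thrown value’s bit pattern). It relies on the Itanium C++ ABI detail that the runtime’s exception header is allocated immediately before the thrown object:

    #include <cstddef>
    #include <cstdint>
    #include <cstdio>
    #include <typeinfo>

    namespace {

    struct MagicException {
      std::uint64_t pattern = 0x0123456789abcdefULL; // unlikely to occur in the header
    };

    // Pointer-sized words between the thrown object and the header field that
    // holds the std::type_info pointer, discovered at run time.
    std::ptrdiff_t type_info_offset = 0;

    void probe_layout() {
      try {
        throw MagicException{};
      } catch (MagicException &e) {
        // The exception header sits just before the object, so poke backwards
        // from the object until we see the pointer to our exception's type_info.
        // (Deliberately reading outside the object's bounds: this is the hack.)
        void **object = reinterpret_cast<void **>(&e);
        const void *wanted = &typeid(MagicException);
        for (std::ptrdiff_t i = 1; i < 32; i++) {
          if (object[-i] == wanted) {
            type_info_offset = i;
            break;
          }
        }
      }
    }

    } // namespace

    int main() {
      probe_layout();
      std::printf("type_info pointer is %td words before the thrown object\n",
                  type_info_offset);
    }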

Do you have a long-term software pipe dream?

I still think of myself as working on Étoilé. I simply found that I can’t build it with existing software and programming languages and so took a little diversion to build the lower layers that I need. I need CHERI to be able to do fine-grained sharing with legacy libraries (i.e. the billions of lines of code that have been written already that I don’t want to rewrite). With Verona, we have a language that has sandboxing as a first-class abstraction so that we can reason about things like two independent instances of libavcodec for different movies in the same document, at the source level. We also have a deadlock-free and data-race-free concurrency model that we can use as a coordination language, which lets us build DSLs and end-user-programming systems that give strong local reasoning guarantees for individual components.

When those two things are mature, the things that we wanted to build in Étoilé are feasible and I can get back to building them.

Do you have any suggestions on how one may turn their open source interests into a fulfilling career?

Any form of success requires some luck. Remember that any time someone is telling you that an approach will lead to success, they’re working from a sample size of one and have no idea how many people tried the same thing and failed.

When I was a PhD student, someone pointed me at a study which showed that the main difference between people who self-identified as lucky and those who didn’t was that the former were better at recognising and taking opportunities that were presented. Some people don’t have the opportunities at all, but a lot more do and miss them. A couple of examples:

When I saw Peter’s email about writing a book, I followed up and that led to an entire writing career for several years. I don’t write professionally anymore, but that experience has been useful in many of the things that I’ve done subsequently. I’d been pondering writing as a career option but really had no idea how to do it. I could easily have not had this opportunity (I was lucky to be working on XMPP at the time) and I could easily have not taken advantage of it.

When I started working with each of GNUstep, FreeBSD, and LLVM, I joined the mailing lists and IRC channels and helped people who had problems. There’s always a spectrum of people that want help. Some are individuals doing things for a hobby, some are working for companies and really need the help. Some of those companies have a decent contracting budget because they know that it’s much faster to pay someone who knows how to solve a problem now than it is to build that in-house expertise. Several conversations I had led to people offering to pay me to do the thing that they were trying to do, rather than finish explaining it to them (a few also led to requests to do training for their company).

A big part of the reason that this worked for me was that most of the things I worked on were used by companies that understood software development. It’s much easier to sell software development services to a company that employs software developers than it is to a company that doesn’t really know what that means. If you’re working on applications that target end users, I don’t think the same approach would work.

A lot of contracting work comes as a result of having a solid reputation. If people know that you know a codebase well, can be trusted to get work done, and aren’t a total pain to work with, they’re likely to consider hiring you. Open source lets you build this kind of reputation in public.

Permissive licenses also help a lot. We knew of a couple of companies shipping products using GNUstep, but violating the LGPL. The FSF didn’t want to take them to court and, because they were in violation of the license, they didn’t want to hire anyone who worked upstream and might report them. Unless you’re willing to pay lawyers to enforce license conditions, you won’t stop people infringing the license but you will put people off if they’re not 100% sure that they can comply. In contrast, I’ve worked with several companies that decided to do in-house forks of permissively licensed things and then changed their minds later and wanted to either upstream things or have things added upstream that let them ditch their fork. The permissive license lets them build a dependency on your project and then they grow a need for people with your expertise, with no effort on your part. Now you have a market for the exact skills that you have.

A lot of people slap restrictive copyleft licenses on things because they don’t want companies to benefit from their work, but that also means that those same companies have no incentive to contribute (code or money) to the project. If your project is valuable and people don’t like the license, they will either reimplement it or they will violate the license and hope that they don’t get caught. Neither of these benefits you. I think the GPL has led to a fairly noticeable increase in the amount of proprietary software in the world as companies that would happily adopt a BSDL component decide to create an in-house proprietary version rather than adopt a GPL’d component. This benefits nobody, but it especially won’t benefit you if you want someone to pay you to work on the GPL’d component.

    1. 51

      Thank you for this!

      It seems like cheating though because David Chisnall is clearly at least ten people.

      1. 11

        @pushcx would it be possible to add the “interview” tag to this post?

        1. [Comment removed by author]

          1. 3

            It’s linked in OP’s first full sentence.

            1. 1

              Right you are.

        2. 8

          Excellent quick interview. Thanks!

          1. 7

            I appreciate David’s counter-intuitive take on gpl licenses and insights on the economics of upstream and downstream projects.

            1. 4

              David’s 2009 post (https://www.informit.com/articles/article.aspx?p=1390172 “The Failure of the GPL”) has a long discussion on this topic!

              1. 1

                Read the first page on Objective-C and clang. Looks like a great article. Will read it through.

              2. 2

                When you look at the big tech cos in the US that just blanket ban the use, hell even the downloading, of GPL-ish software in the vicinity of their source code repos, I think the GPL has increased the amount of purely proprietary software out there.

              3. 6

                Thanks for the interview! I have met David at various conferences and at Cambridge so I had some idea, but having the extra detail filled in is really nice.

                1. 6

                  The last 2 paragraphs are some of the most concisely worded, non-rude, arguments for permissive licensing (BSD, MIT, etc) I’ve read. Very very cool to see the career progression, almost makes me wish I’d studied comp sci instead of mat sci. Hope to see neat things from SCI! Good work (interviewer and interviewee).

                  1. 6

                    Never knew how you came into FreeBSD and LLVM, thanks for the detailed interview

                    1. 11

                      I started using FreeBSD as an undergrad. One of my housemates (Sitsofe Wheeler) set up OpenBSD for our router and encouraged me to look at non-Linux open source operating systems. The OpenBSD installer promised it would destroy all of my data, which scared me away from installing it on a dual-boot system. The FreeBSD one was a bit less scary.

                      The thing that made me stick with FreeBSD was working sound. Linux was going through a painful transition from OSS (upstream had gone proprietary) to ALSA. There was an OSS compat layer for ALSA but you only got mixing if you used the native ALSA APIs and most things hadn’t been rewritten to use ALSA. And, as I recall, there was no software mixing, so you got one channel per hardware mixing channel (I had two sound cards: one that had a load of hardware channels and no Linux drivers, one that had Linux drivers but no hardware mixing, so this didn’t help). GNOME and KDE worked around this by providing userspace sound daemons. Unfortunately, they both provided one, with incompatible APIs.

                      With FreeBSD 4.x, there was low-latency in-kernel sound mixing. You had to manually create device nodes for each vchan, but then you could direct things to use them and they exposed the OSS APIs so anything that made sound just worked. This was great. I set up one for the KDE sound daemon, one for the GNOME one, one for XMMS and left the default one unused so one program that didn’t let you specify the sound device could work (BZFlag, mostly). I got new-message notifications from Psi (KDE), new mail notifications from Evolution (GNOME) and music played in the background. This was trivial on Windows or OS X but impossible in Linux.

                      FreeBSD 5 came along and added devfs, removing the need to manually create device nodes. For the last 20 years, sound has Just Worked on FreeBSD and I’ve watched Linux go through two additional painful transitions in the same time.

                      I stayed for the respect for the principle of least astonishment (90% of what I learned 20 years ago still works; I only need to learn new things to do things that weren’t possible back then) and for features like Capsicum, jails, kqueue, ZFS, and umtx, which make developing on FreeBSD a far more pleasant experience than on Linux.

                      1. 4

                        I’ve watched Linux go through two additional painful transitions in the same time.

                        Two? The switch to PulseAudio looks like it was painful, but damn if moving to PipeWire hasn’t been smooth as heck IME. PulseAudio apps, JACK apps, and even Bluetooth audio Just Work. I got 1.3 ms of latency in REAPER with zero setup effort.

                        1. 5

                          PipeWire has “fixed” Linux audio for me; it’s the closest thing to Core Audio in the FOSS world. I do agree with David: the FreeBSD kernel audio stack is very elegant and not often talked about, especially compared to ALSA and all the userspace layering Linux has gone through, but ultimately you’d want PipeWire on FreeBSD too, so the low-level stack is only interesting to those working on drivers and new features.

                    2. 5

                      @david_chisnall When you said there are three widely used C++ runtimes, did you forget the Visual C++ one? Or is Visual C++ using one of the permissively licensed runtimes now?

                      1. 8

                        Sorry, I should have specified: for the Itanium ABI. The Visual Studio one uses a different ABI. On Windows, libobjc2 uses the Visual Studio exception ABI for Objective-C++ interop. As of last week, it can also use the MinGW exception ABI, which is a weird hybrid of the two.

                      2. 4

                        Great interview!

                        Delight turned quickly to panic when I read the section about Jabber/XMPP. Delight to see mention of some good old times (I was a part of that community back then) and then panic reading “another publisher had just released a Jabber book and it had not sold well” but I realise that from the timing described, it probably wasn’t my book to which David was referring (at least I hope it wasn’t! :-)). Phew.

                        1. 3

                          It was your book! I actually have a copy on the shelf by my desk! If memory serves, it was part of a programme where O’Reilly gave free books (sorry - I was a poor student) to people who promised to write a review of them, and I really enjoyed it. I also got the Mozilla / XUL book at the same time.

                          I’m not sure how many you sold, but Prentice Hall really wanted at least 2,000, ideally 3,000 sales in the first year for it to be worthwhile for them. Their visibility into your sales (no idea how they got this, presumably feedback from distributors?) suggested that they wouldn’t be able to reach that. I guess they’d have needed to see your book sell at least 5-6K copies to give confidence that the market was big enough for a second book to do well.

                          1. 3

                            Yikes! Well, I got plenty of royalties (not that I did it for that). And they continued for quite a few years after. Ah well, I enjoyed writing it, and it was certainly a new experience for me, and one that I’ll never forget. I can’t remember how many were sold in total, but I would not have thought it would have been 5-6K in the first year, that’s quite a lot for what was arguably still quite a niche tech, even back then. Anyway, you know what’s far better than knowing I had great sales? Connecting with folks like you who have my book on their shelves. That’s bonkers (in a good way) :-)

                            And thank you for telling me you enjoyed it!

                        2. 4

                          When those two things are mature, the things that we wanted to build in Étoilé are feasible and I can get back to building them.

                          When I get my billion dollars, I’ll let you know when you can make your CHERI Étoilé laptop.

                          1. 2

                            @david_chisnall What’s the link between CHERI and Verona? Is it just a case of both things being developed within Microsoft at around the same time?

                            I’ve been keeping an eye on it for a while (after spending some time with Pony) and I would love to see it grow and eventually get something to play with.

                            1. 16

                              What’s the link between CHERI and Verona?

                              Verona has some things in the type system that three of us were approaching from different directions and from different use cases:

                              • Matthew Parkinson had worked on Project Snowflake (adding manual memory management to .NET) and found that it was hard to isolate a memory-management policy. He wanted a source-level abstraction for a set of objects that you could tie different policies to. This lets you do things like have a tracing GC for a set of objects, without requiring any barriers from things that don’t have access to this region.
                              • Sylvan Clebsch came from designing Pony and wanted a language-level abstraction that let you reason about an arbitrary object graph that would let you transfer ownership of it between actors (or other units of concurrency) without needing a GC trace on message send.
                              • I had worked on various sandboxing things, including CHERI JNI and Object Spaces for Smalltalk / Objective-C, and wanted a language-level abstraction for a group of related objects, including a set that may be using a foreign object model or have weaker type-safety guarantees, so that I could both reuse existing libraries in low-level unsafe languages and have a set of nice guarantees for safe user-facing languages.

                              It turned out that we all wanted the same thing. The Verona region model is conceptually quite simple:

                              • Every object is allocated in exactly one region.
                              • Within a region, any object may hold pointers to any other object(s).
                              • Each region may have a different memory-management policy (e.g. tracing, ref counting with cycles leaking, ref counting with cycle detection, bump allocation with bulk free, and so on).
                              • Each region has a sentinel object that dominates all (live) objects in the region.
                              • Pointers to the sentinel object from outside are linear: moving this pointer transfers ownership of the region and any regions reachable from it (regions, therefore, can be arranged in trees).
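
                              A rough C++ analogue of that model (my sketch for illustration; it is not Verona syntax and not how the runtime actually works) might look like this:

                                  // Sketch only: every name here is invented for illustration.
                                  #include <memory>
                                  #include <utility>
                                  #include <vector>

                                  struct Policy {                                  // per-region memory-management policy
                                    virtual ~Policy() = default;
                                  };

                                  struct Region {                                  // the sentinel object for a region
                                    std::unique_ptr<Policy> policy;                // each region picks its own policy
                                    std::vector<std::shared_ptr<void>> objects;    // objects live in exactly one region
                                    std::vector<std::unique_ptr<Region>> children; // regions form a tree

                                    // Allocate an object in this region; pointers between objects inside
                                    // the same region are unrestricted.
                                    template <typename T, typename... Args>
                                    T *alloc(Args &&...args) {
                                      auto obj = std::make_shared<T>(std::forward<Args>(args)...);
                                      objects.push_back(obj);
                                      return obj.get();
                                    }
                                  };

                                  // The only handle held from outside is a linear (move-only) pointer to
                                  // the sentinel: moving it transfers ownership of the region and of every
                                  // region reachable from it.
                                  using RegionHandle = std::unique_ptr<Region>;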

                              This region model is a really nice abstraction for building interoperability layers. Rather than thinking about foreign function interfaces, we think about foreign library interfaces. The idea is that you’ll have each library[1] exposed as a class in a region, where each function exposed by that library is a method on that class, each global is a field, and any types are nested types. If you allocate new objects via that library interface, they will exist in the same region as the library. This means:

                              • You can instantiate multiple copies of libraries[2], so you can isolate different uses of foreign code.
                              • You know, at the source level, which library instance any object belongs to. You get a type error if one library tries to refer to an object that a different library or a different instance of a library belongs to.
                              • You can use arbitrary sandboxing models to isolate different library instances. This means that foreign code doesn’t mean unsafe (the thing I hate about Rust): you still get all of the type-safety and concurrency-safety guarantees of Verona in your Verona code, no matter how much foreign code you use.

                              Nothing here is specific to CHERI. I have a prototype that implements the abstraction using processes; you could also do it via WebAssembly, via MMU isolation with some better OS abstractions than processes (I had an intern add some nice things to Linux for this), or via CHERI.

                              It’s just that CHERI provides a really nice set of primitives for doing this. Sharing code is trivial, safe calls into a library are cheap, and you can do things like deep-immutable sharing if you want to expose a Verona object graph into a foreign sandbox but keep the guarantee that it can’t be mutated (this requires copying with any other approach, unless the objects are on an isolated set of pages and so can be mapped read-only into the sandbox).

                              [1] A library at the Verona level. This may correspond to a group of DLL / .so files.

                              [2] Again, at the abstract machine level. This doesn’t necessarily have to include a copy of its code, though it will include a copy of any mutable globals.