1. 18
  1.  

  2. 2

    Fuzzing will help reduce the amount of bugs, but no amount of fuzzing will actually fix these systems. The trusted computer base is too large.

    There’s going to be bugs in there, ready to cause trouble when least convenient.

    1. 5

      I read the latest whitepaper on seL4 you linked to last time Linux was discussed here a few days ago and perused their website where I was left with the impression - correct me if I’m wrong - that the project has not really left the alpha-stage yet. Furthermore it more or less depends on Linux for ‘real life’ driver support, with ‘minimised’ Linux installations running in user space being used as interface between hardware and protocol drivers. To tie everything together into a usable system a ‘component framework’ is used, the examples in the whitepaper are the verified-but-limited CAmkES and unverified-but-‘more powerful’ Genode. The former seems to be targeted at ‘embedded and cyber-physical systems’, a term which is not explained but what I deem to be more or less the equivalent of the popular ‘internet of things’ (IoT), a generally static environment with limited flexibility. For general purpose computing - say the server cabinet under the stairs here - the unverified ‘Genode’ framework is more applicable (or should that be rephrased as ‘needed’?), thereby throwing out the idea of running a fully verified system. After a single page on performance - ‘we have the fastest IPC of all microkernels, between 2 and 10 times faster than the competition’ the paper ends with a single example (overview really) of an implementation done by a DARPA-funded team at CSIRO et al.

      The seL4 microkernel consists of about 10K lines of code, The set of proofs has by now grown to well over a million lines, most of this manually written and then machine checked (a direct quote from the paper). The initial verification set comprised of about 200K LoC, a factor of 1:20 for SLoC:VLoC. Verifying systems is expensive (in developer time).

      Where does this leave me in my ideas of using something like seL4 where I’d currently use Linux? Well, I could see it being used for embedded applications since these tend to be statically defined and limited in their scope. Given the data I’ve thus far seen I do not see myself running seL4 for more general purpose computing needs - the server-under-the-stairs application being one of them apart from possibly using it (or something related) as a hypervisor in applications where the host is used to run multiple mutually adversarial guests (to which you might - with some justification - say that all guests are potentially adversarial). It could be used as a basis to develop a VMware ESX or Proxmox VE competitor - now here’s an interesting project for someone looking for ideas. Hardware support is limited (https://docs.sel4.systems/Hardware/) and there is only a single ‘fully verfied’ platform, this being based on ARM A9 - i.e. not a powerhouse. Verification status on X64 is incomplete and explicitly excludes those technologies needed to support virtualisation.

      For general purpose computing these systems are just not there yet and the question is whether they will ever be given the current state of development and verification systems. The reason is simple: the development effort spent on verifying the system overshadows the effort spent in designing and building it by a very large margin. While a customer like DARPA might be willing to spend this amount of time and money (the mentioned project ran from 2012 to 2017) this is not true for most customers.

      1. 1

        I was left with the impression - correct me if I’m wrong - that the project has not really left the alpha-stage yet.

        My impression was the opposite of this. seL4 itself is pretty much a mature product, by the most experienced team on earth on this field and a legacy of a very successful OKL4 preceding it. seL4 provides all the functionality needed to build any kind of system on top of it, along with the proofs to support this functionality.

        Of course, verification seems to lag the latest version by a couple years, but then again nobody else provides the sort of mixed criticality support the latest version of seL4 has; Verified seL4 is always behind, but it is still the state of the art in verified systems.

        I would love to hear about and understand your line of reasoning for considering seL4 pre-alpha despite all this.

        Verifying systems is expensive (in developer time).

        Thus the importance of minimizing the TCB can’t be understated. This is why it is important to have a highly competitive formally verified microkernel.

        It more or less depends on Linux for ‘real life’ driver support.

        While there’s a growing collection of drivers for CAmkES and that is the better approach, since the VMMs that actually handle VM exits/exceptions are per-vm user programs which cannot exceed their capabilities, I see it as a really good clutch, particularly with the help of an IOMMU.

        As the VMM isn’t limited to supporting Linux guests, it is easy to tell Linux is being exploited here: It’s used as an example because it is popular, and “being able to reuse investment in Linux” is helpful for obtaining funding, public and private.

        After all, there’s a positive feedback loop to be found there. Every time seL4 gets used, drivers and component functionality are written, experience is accumulated, seL4 becomes more useful and thus appealing to use and thus is used more and thus more functionality is written, and so on and so on.

        CAmkES / a generally static environment with limited flexibility.

        This is indeed the current state of affairs for CAmkES, as static environments are definitely easier than dynamic ones, and also sufficient for many applications, particularly the sort of applications that most need the assurance seL4 provides. Thus the ones important for commercial interests and funding.

        Genode does very well to demonstrate that dynamic environments are possible, and the language used in the whitepaper suggests CAmkES will, too, eventually get there into the dynamic club. They just have a set of priorities, understand the importance of funding, and know that the commercial interest they can capture at this stage does not require dynamic scenarios.

        possibly using it (or something related) as a hypervisor in applications where the host is used to run multiple mutually adversarial guests (to which you might - with some justification - say that all guests are potentially adversarial). It could be used as a basis to develop a VMware ESX or Proxmox VE competitor - now here’s an interesting project for someone looking for ideas.

        This is one scenario I have given a lot of thought to. Again, I expect this to come from Genode first, as they are at the forefront of providing functionality. As it is built around capabilities as first class concept; as the kernel is proven to be able to enforce them, it is far more trustworthy than anything you can build on top of Linux.

        Going by what they have already achieved, and what their roadmaps are, Genode will soon have demo scenarios for these roles:

        • Network infrastructure (switching and routing)
        • Audio workstation (a routing and mixing PoC)
        • Virtualization server

        Due to being in the forefront by providing functionality, I see Genode inevitably as the way that most people outside the high assurance world will be first exposed to seL4 and capabilities.

        For general purpose computing these systems are just not there yet and the question is whether they will ever be given the current state of development and verification systems. The reason is simple: the development effort spent on verifying the system overshadows the effort spent in designing and building it by a very large margin.

        It would definitely be nice if the full system was verified. While that would be ideal, it is of questionable feasibility. The focus of designs such as seL4 is to reduce the TCB i.e. minimize how much code we actually have to trust.

        The approach I do favor is to look at it the other way around: If the kernel can’t be trusted (i.e. any not formally-proven kernel), then its ability to provide separation cannot be trusted, and thus there’s no separation of TCB and non TCB. Everything on the system is TCB and we know for starters the kernel is not trust-worthy, so it’s a losing proposition. A design dead end.

        As an added note, if you haven’t yet, I do recommend going through:

        https://fosdem.org/2020/schedule/track/microkernels_and_component_based_os/

        I particularly enjoyed Gernot Heiser’s (seL4) and Norman Feske’s (Genode) talks.

        1. 2

          If the kernel can’t be trusted (i.e. any not formally-proven kernel),

          There are other ways to provide assurance besides formal proofs. A proof may even be worse than simpler techniques, if it gives you false confidence because you don’t really understand what’s being proven. Security properties are typically not easily composable, so this is a very practical concern, I think.

          then its ability to provide separation cannot be trusted

          … fair enough …

          and thus there’s no separation

          Nope. Does not follow. The converse argument is especially false! The kernel still relies on hardware, and we’ve seen these “guarantees” so thoroughly violated in so many ways in the past few years, I think developing secure hardware is the more important concern at present. Architectures like CHERI are resurrecting the old idea of capability systems in the processor and memory pipeline. Some applications can even use fine-grained hardware capabilities without any assistance from the kernel at all. I would expect to see practical capability-enforcing hardware platforms long before seL4 gets any significant real-world adoption, because they allow us to extend our legacy software with minimal intervention. I don’t like it, but that’s how I see it.

          1. 1

            I would expect to see practical capability-enforcing hardware platforms long before seL4 gets any significant real-world adoption, because they allow us to extend our legacy software with minimal intervention. I don’t like it, but that’s how I see it.

            I am particularly interested on “capability-enforcing” hardware and how it somehow abstracts capabilities from software, while providing the same benefits.

            Specifically, intervention so minimal that can bypass the already minimal effort of the retrofit approach described in chapter 7 of the seL4 whitepaper

            Can you point me somewhere good to start learning about this?

            The converse argument is especially false! The kernel still relies on hardware

            Every operating system does. But pointing at hardware and saying “they are also doing it wrong!” doesn’t excuse the software for doing a poor job.

            1. 3

              I am particularly interested on “capability-enforcing” hardware and how it somehow abstracts capabilities from software, while providing the same benefits.

              My point is just that secure software needs secure hardware, if the system is to be secure. If the hardware can back your separation guarantees directly, the software doesn’t need to be quite as careful or sophisticated. We already knew this in the mid-sixties, but (like so many things) we somehow forgot.

              Can you point me somewhere good to start learning about this?

              Here, I’ll spare you a couple of clicks from the CHERI page I already linked. Plenty of reading material there. There are other ongoing projects with basically the same idea, but I think CHERI is furthest along, and has a cozy relationship with ARM Holdings, so I’d expect to see it go commercial pretty soon. They did CHERI-BSD on the backs of a few grad students, by comparison with the effort spent on seL4. It’s not quite a drop-in replacement for mainline FreeBSD, but pretty close.

              1. 1

                Oh, CHERI I had heard of. I wouldn’t have thought you were talking about it in this context, as retrofitting it into BSD isn’t in any way easier than the retrofit approach presented in the seL4 whitepaper. If what you want is for it to yield the same level of advantages, you’ll find yourself reinventing seL4.

                CHERI-BSD presents progress, particularly as a mitigation for userspace issues, but it is unfortunately not going to cut it for high assurance scenarios, nor hard real time applications. As these are excluded, BSDs (and UNIX in general) will continue to be unsuitable for the construction of a truly general purpose operating system.

                You might be interested in this keynote: https://www.sel4-us.org/summit/presentations/Session_1_Talk_1_Neumann_Keynote_Assured_System.pdf

                Particularly slide 51, but it’s a good presentation throughout.

                Personally, I am not sure adding complexity to the hardware will improve anything. Complexity needs to be justified is my motto. Verifying the hardware will only be harder if it’s more complex. Yes, non-RISC architectures are disqualified to begin with.

      2. 4

        I think the rough consensus is that as long as fuzzing is the best that attackers can do, defenders doing it better (and first) is good enough. It has shown itself to be a pretty cost-effective approach, overall.

        1. 4

          My understanding is that Linux is swamped by fuzzing bugs and defenders aren’t getting ahead of attackers, instead attackers are having easy time exploiting because they can just take fuzzing bugs published and unfixed.

          A good example is Bad Binder bug. Originally found by fuzzing and published in 2017, it was being exploited in the wild in 2019, 2 years later. I think Linux security is pretty hopeless now.

          1. 2

            Yeah. It only works if the bugs actually get fixed. For Linux, I imagine it’s pretty hard to even assign responsibility… and the iron-clad commitment to backwards compatibility sure can’t help. Browsers seems to be doing better.

            1. 2

              Your general point may be true, but I guess Binder is a bad example, since it is an Android-specific component and not part of upstream Linux?

            2. 1

              Fuzzing does absolutely catch bugs. But, like a full search, it’s not an approach an engineer should be proud of relying on. It also doesn’t catch everything. Hardened by fuzzing or not, humans looking at the code will still find bugs. Many of these bugs are going to be security bugs. And these bugs will typically get CVEs and get patched before the details are made public, but some will not see the light of day until they have been exploited in the wild by malicious actors, often sold and bought in dark markets. Infamous government agencies are known to buy a lot of these and stash them.

              It is an ecosystem enabled by the generalized usage of operating systems with fundamentally broken designs, typically derived from or inspired by UNIX, an early seventies operating system that’s largely obsolete by the decades of research into the subject after its release, but is still idolized by many, to the point of blindness.

              1. 5

                Thing is, I’m in complete agreement, right up until the last third of your last sentence. I don’t think UNIX fanboyism (while certainly a real thing) comes close to explaining why the whole world runs on UNIX-derived software. (I mean, except for that one VMS-derived product from Redmond.) I think it’s all about sunk costs, legacy codebases, short horizons, who bears the various maintenance costs, who writes the curricula, and stuff like that. Perverse incentives. Network effects. It’s not a technical problem, and not merely a cultural problem. It’s a sociotechnical and an economic problem, and a wicked one at that.

                1. 1

                  It easily meets some of the “super wicked” requirements, unfortunately.

                2. 3

                  I don’t disagree that security in tech software is basically a rounding error and is otherwise non-existent… We don’t really have any other options though. Our options are pretty much limited to 3 platforms: macOS(UNIX), Linux/BSD(*nix), or Windows.

                  Well in the Mobile space we have Android(linux/*nix) and iOS(UNIX, tho really iOS is probably *nix as I doubt it’s 100% POSIX and certified even if it’s big brother is.)

                  So we are limited to Windows and *nix. those are our choices. Windows security is arguably way better than it was a decade or two ago, but I don’t think anyone would call it’s security inherently better than the *nix variants.

                  The closest we have are probably OpenBSD and HardenedBSD both *nix variants, that even pretend to take security seriously, and both are developer friendly at probably best case.

                  Unless you know of a platform that has anything even remotely resembling standard tools(say a web browser, mail, spreadsheet/office software, database, text editors, programming languages, etc)

                  1. 5

                    Linux has literally hundreds of fuzzing bugs published and unfixed, right now. Go see https://syzkaller.appspot.com/. I believe Windows is way better, because anecdotally this is not the case for Windows.

                    1. 3

                      I agree with you 100% on the hundreds of fuzzing bugs, and likely many are security issues. I was not in any way trying to dispute that.

                      We actually don’t know that’s the case, we can’t look at windows changelogs, we have no idea what MS is doing in those backrooms out in Redmond. We can believe and assume, but that doesn’t prove fact(s). MS has definitely started taking Windows security seriously, but we still regularly and consistently come across RCE’s in Windows. These are just the ones made public, because MS can’t hide them fast enough. I imagine there are many, many more that never get publicly reported.

                      is Windows more secure than Linux? Maybe, but I think you’d be hard pressed to find a consensus among any random group of security researchers.

                    2. 3

                      OpenBSD

                      Is great at what it is.

                      Unless you know of a platform that has anything even remotely resembling standard tools(say a web browser, mail, spreadsheet/office software, database, text editors, programming languages, etc)

                      The sculpt demo at fosdem microkernel devroom was pretty impressive. VMs do work well as a transitional clutch.

                      1. 3

                        Agreed on OpenBSD, I run it on some of my system(s). If OpenBSD has what you need, then it’s a great solution.

                        I’ll have to check out the fosdem demo on sculpt. Thanks!