The general direction of process/component isolation still allows malicious code to be included in the project, and it merely tries to block running malware from achieving its goals. I find sandbox approaches underwhelming and insufficient.
Not all attacks depend on privileged access or out of band communication. Libraries can be malicious through their official safe API: a template engine can inject XSS, data structures/caches/DBs can manipulate the data you store in them. A YAML parser can see when you’re reading a config file and inject insecure settings.
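To make that concrete, here is a toy sketch (hypothetical code, not any real parser) of what “malicious through the official API” can look like: no privileged calls, no network, just subtly altered return values.

```python
# Toy sketch (hypothetical, not any real library): a config "parser" that is
# malicious purely through its ordinary, safe-looking API -- no unsafe code,
# no syscalls, no network. It just hands back subtly altered data.
import json  # stand-in for a YAML/JSON parsing dependency

def load_config(text: str) -> dict:
    config = json.loads(text)              # behaves like a normal parser...
    if "tls" in config:                    # ...until it recognizes a config file
        config["tls"]["verify"] = False    # and silently weakens a setting
    return config

print(load_config('{"tls": {"verify": true}, "port": 443}'))
# -> {'tls': {'verify': False}, 'port': 443}
```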
A per-dependency sandbox makes it difficult to reason about security wherever dependencies interact, and the relationships between deps may be obscured by generics and dynamic callbacks. Privileged dependencies take on an extra responsibility: exposing an API that can’t be abused as a sandbox-escape gadget by an untrusted dependency.
I don’t see any other way than reviewing source code of all the dependencies, and allow-listing only reviewed deps. Sandboxes can still be there as a defense in depth and to raise the bar for attackers, but it’s necessary to check whether the code does what it claims.
Currently almost nobody reviews code of their deps. Instead we’re effectively operating on a Web of Trust, with way too much trust.
My argument is: code review isn’t bad, but it doesn’t scale. As one proof of that, the xz attack showed that a determined actor can inject bad things right under the noses of multiple people.
Unix processes protect us against many possible attacks that we would probably miss in review – let’s do more of that!
I don’t think xz is a good proof here. The reason xz succeeded is that it was a multi-year operation in which the attackers managed to gain the trust of the community. That trust is what allowed them to sneak patches into the source.
We, as in the FOSS community, are not well equipped to deal with those scenarios.
However, there is a research paper that is a proof that code review is perfectly fine to catch malicious code. The somewhat infamous “Hypocrite Commits” paper from UMN, where they tried to sneak 3-4 malicious code changes into the Linux kernel, was widely reported to have succeeded. At least according to the paper.
However, it turns out that all of these patches were stopped during code review and never merged. In one instance the malicious change itself was spotted, but that was not mentioned in the paper. The remaining changes were also rejected for various reasons.
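See Report on University of Minnesota Breach-of-Trust Incident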
However, there is a research paper that is a proof that code review is perfectly fine to catch malicious code.
??? It’s definitely not a proof. This argument has an obvious hole – you don’t have a way to account for the attacks we don’t know about (both in open source, and commercial software)
There can be both completely undetected attacks, and attacks that are known only to their victims and not to the general public. Corporate victims have a very strong incentive to downplay or stay quiet about successful attacks.
Also, the world is not static – attackers get more and more sophisticated, and they’ve done so at an alarming rate in the last 10 years
I’m not sure why you would say “merely”. If the attacker didn’t drain your bank account, or murder you based on your location [1], then that’s what you wanted!
If you actually look at real hacks now, they are mostly chained. Sometimes it’s 5, 10, 15 vulnerabilities chained. So reducing the probability of given exploits actually does prevent entire attacks from happening.
The source code approach and the systems approach also aren’t mutually exclusive – it’s going to be both, inevitably.
But the systems approach gets you more results, faster. The code approach should also be pursued, in parallel
——
I just pulled up this page, and it says as of January 2025, 11% of Firefox is Rust, while nearly 40% is C/C++. So that’s after a decade of Rust 1.0 now. And you would imagine Mozilla is more inclined to change their source code than other projects, since that was a main motivation for Rust
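https://4e6.github.io/firefox-lang-stats/
https://news.ycombinator.com/item?id=30743577
[1] as has apparently happened on iOS with NSO malware - https://www.theguardian.com/world/2021/jul/18/nso-spyware-used-to-target-family-of-jamal-khashoggi-leaked-data-shows-saudis-pegasus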
Supply chain attacks and RCE vulnerabilities are different. The consequences may be the same, but the defenses against them are different. Vulnerabilities come from untrusted inputs. Usually you can know your attack surface, and can secure it by sandboxing the code exposed to the external inputs, and/or replacing the code with a less risky implementation (like no-unsafe Rust or WUFFS). Such a sandbox can be coarse-grained.
OTOH supply chain attacks are insider attacks. There’s no clear trusted/untrusted barrier, because dependencies can be intermingled with everything in your program. If you use a third-party dependency for a faster string type, every use of the string may be a potential attack. Sandboxing everything at the granularity of individual function calls ends up requiring a dedicated interpreted language and a capability-based environment.
And as I’ve mentioned, a supply chain attack doesn’t have to run truly arbitrary code that breaks out of the process isolation, or even its own API isolation. It can compromise the program from the inside by acting subversively via its usual API. A “faster string” dependency could replace strings in your program, and those strings may happen to contain queries, URLs, configs, paths, certificates, usernames and other content that if changed would make your process self-sabotage, based on the confused deputy problem, not memory corruption.
I see what you mean, but sandboxing is still an effective mitigation for attacks that come from both program code (supply chain) and program data inputs
The xz backdoor was a supply chain attack, and there were at least 2 fixes
revert all the changes, obviously
Remove libsystemd as a dynamically linked dependency of OpenSSH, which follows the process isolation philosophy
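Ubuntu 24.04 (and Debian) removed libsystemd from SSH server dependencies - https://news.ycombinator.com/item?id=40018925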
So the actual fix for a supply chain attack involves process separation
It didn’t seem to be clear to some on lobste.rs, but the existence of this dependency (the lack of separation) is something that caused the xz backdoor
That dependency is a “Jia Tan attractor” – i.e. it is what caused the unaware xz maintainer to be attacked. If Debian and Red Hat had instead made sshd depend on libzandy, the author of libzandy would have been attacked instead
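https://lobste.rs/s/uihyvs/backdoor_upstream_xz_liblzma_leading_ssh#c_wgmyzf
I agree that there is no guarantee – it’s a mitigation, not a complete solution.
Code review is also a mitigation, not a complete solution. You need all kinds of defenses!!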
I should add that review alone is insufficient, because whenever you update those tens of millions of lines of deps, you would need to review it again
A major point of process isolation is that it makes review easier!! You now reason about component interactions, not every line of source
That’s exactly what Chrome does, and what Firefox took influence from, mentioned in the linked comments
A multi-process architecture makes code review easier - and you don’t have to trust me, just ask the people who have skin in the game, and actually do the reviews
Google/Chrome actually does review source code of every Rust dependency, and for small updates they review diffs. And they’re using Mozilla’s tool for it.
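https://github.com/google/rust-crate-audits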
The nice thing is that they share the list of dependencies they’ve reviewed, so others (who trust Google) can save time and focus on reviewing their other deps.
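I don’t see how that contradicts what I said
Does the fact that someone reviewed it, Google or otherwise, imply it’s sufficient defense?
Because someone at Google reviewed some version of it, and some diffs, now process separation is not a necessary or useful mitigation?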
Yes. Chrome doesn’t require sandboxing for safe Rust code, and the review is sufficient. For example, when they replaced their C++ QR code generator with a Rust library, they moved it out of a sandbox.
I’m not saying sandboxing is bad, but that it’s an incomplete solution. You can’t just sandbox unreviewed code and call it safe (e.g. a sandboxed QR code generator can still generate QR codes that lead to phishing websites). It’s sometimes impossible to even sandbox certain dependencies, e.g. if you adopt a library like hashbrown to get a faster hashmap, putting it in a sandbox would defeat its purpose, since the sandbox overhead would be orders of magnitude more costly than speed improvements from the library. Sandboxing is coarse-grained, and transitive dependencies are often fine-grained.
Coarse-grained sandboxing is a good solution against exploits coming from untrusted inputs. Supply chain attacks are not limited to untrusted inputs, and dependencies may be used in ways that make sandboxing impossible.
Sandboxed xz can’t run arbitrary code in the ssh server which is great, but it can still lie about the data it decompresses. It can take some compressed bytes and say they’ve decompressed to “useradd jiatan”.
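As a toy sketch (hypothetical, unrelated to the real liblzma code), a decompressor that never leaves its sandbox or steps outside its API can still do exactly this:

```python
# Toy sketch (hypothetical): a decompressor that never escapes its sandbox or
# its API, but lies about what the compressed bytes contained.
import zlib

def evil_decompress(data: bytes) -> bytes:
    out = zlib.decompress(data)        # real decompression
    if out.startswith(b"#!"):          # output looks like a script the caller will run?
        out += b"\nuseradd jiatan\n"   # quietly append a payload
    return out                         # the caller just sees "decompressed data"

script = zlib.compress(b"#!/bin/sh\necho provisioning done\n")
print(evil_decompress(script).decode())
```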
Sure I agree you can’t just sandbox things and call it safe. But I’m saying that code review is also an incomplete solution.
I also don’t see how Rust would be easier to review for the malicious QR code generator or xz compressor, which you could write in any language
Review is an even more incomplete solution for C/C++. I just found out from freddyb in this thread that Firefox depends on libexpat, which is an extremely crufty archaic piece of C that I was looking at for other reasons last week:
<blink>Expat is UNDERSTAFFED and WITHOUT FUNDING.</blink> !!
!! ~~~~~~~~~~~~ !!
!! The following topics need *additional skilled C developers* to progress !!
!! in a timely manner or at all (loosely ordered by descending priority): !!
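https://github.com/libexpat/libexpat/blob/R_2_6_4/expat/Changes
Gross code - https://github.com/libexpat/libexpat/blob/R_2_6_4/expat/lib/xmlparse.c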
So Firefox has presumably reviewed libexpat – the problem is what to do about it. Just because you reviewed old code and found a problem doesn’t mean you have the budget to fix it! Sandboxing is an economical solution, and that’s what they have apparently chosen.
But yes it is incomplete. It is better not to have that dependency. And it is better to review new dependencies and potentially reject them.
If we’re comparing people automatically updating Cargo dependencies to a vendored system like Chrome, then I’d definitely say the vendored system is better.
I don’t really consider that code review … I consider that “noticing if a new author/project has published code inside my binary”. That’s just table stakes, and I’m surprised that anybody goes without it!
Thinking about this a bit more, my mental model for the transitive dependency supply chain attack is highly influenced by the way the xz backdoor worked
And that was “out of band” - the liblzma exploit would modify sshd code at runtime, based on the GNU ifunc mechanism
I agree that the kinds of attacks you mention can exist, i.e. not out of band, but through the intended API.
But I don’t have any examples of such transitive dependency supply chain attacks – so I’d be interested in hearing about them
It would be interesting to know what portion of real attacks fall in each category.
In other words, I want to split software up into mutually distrusting dynamic “cells”, like processes, but with the ability to communicate more easily, frequently, and cheaply. The communications between dynamic components would need to be tightly specified, and if a component fails to communicate in exactly the required way, other components should ignore all interactions. Another way of looking at this is that it is a more rigorous enforcement of the age-old principle of least privilege:
A multi-process architecture with the principle of least privilege is exactly why I’ve been working on Oils/YSH for many years :-)
I want a distro / cloud where every program is sandboxed, and programs are more fine-grained. I guess a little like Qubes, but using plain processes rather than Xen (although I guess in recent years these two mechanisms have converged?)
The policy should simply be a shell wrapper around each program/process, which is easy to read and modify
And instead of dynamic linking, you use IPC, because that gives you a separate process.
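A minimal sketch of the shape of this, in hypothetical Python (standing in for any language): the “library call” becomes a child process, and the only thing that crosses the boundary is the data on the pipe.

```python
# Minimal sketch (hypothetical): do the work in a child process instead of an
# in-process library call. The dependency never shares the parent's address
# space, and a sandbox (seccomp, pledge, an unprivileged user, ...) could be
# applied to the child before it touches untrusted input.
import subprocess
import sys
import zlib

CHILD = ("import sys, zlib; "
         "sys.stdout.buffer.write(zlib.decompress(sys.stdin.buffer.read()))")

def decompress_out_of_process(data: bytes) -> bytes:
    proc = subprocess.run([sys.executable, "-c", CHILD],
                          input=data, capture_output=True, check=True)
    return proc.stdout   # only bytes cross the process boundary

print(decompress_out_of_process(zlib.compress(b"hello")))  # b'hello'
```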
I have noticed that many programmers don’t understand this idea though, it is sort of a “forgotten” part of Unix
https://news.ycombinator.com/item?id=39906332 - Even after the xz backdoor, someone said “I don’t think it’s acceptable to create a subprocess for what’s effectively a library function call because it comes from a dependency”
which is a bizarre, dogmatic idea IMO … my response talks about Chrome and SSH. Multi-process is traditional, and the gold standard
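on Chrome, Firefox, and sshd: https://lobste.rs/s/dmgwip/sshd_8_split_into_multiple_binaries#c_wnaflq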
Multi-process is harder to do, but it’s the gold standard as far as handling untrusted inputs. An OS process is a real thing supported by hardware at runtime (MMU), not an abstraction created by a programming language.
curl dropped a Rust effort, but my response was that even on OpenBSD, there doesn’t seem to be any privilege dropping after the initial setup:
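https://lobste.rs/s/4czo0b/dropping_hyper#c_v8uioc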
I have also railed against “pyramid-shaped hierarchical dependencies”, e.g. in JS and Rust – when you refactor into processes and IPC, you naturally end up with a flatter dependency graph
The big blockers are:
processes and IPC are one of the most non-portable parts of the OS
process creation is totally different on Windows
sandboxing is totally different between Linux and BSDs
so this is why this belongs at the distro/shell layer – kernels have different mechanisms; distros will have different policies. But authors and distros have to cooperate more.
IPC speed is an issue, as mentioned. But I think this is already natural in the cloud as opposed to desktop, where you already have many processes scattered across machines. IPC is “free” in that case.
you tend to lose static type safety when factoring across processes, and many people understandably don’t like that. There is a pretty hard tradeoff here, but I think the economics favors multi-process. E.g. the curl example - there was not enough engineering bandwidth to take on the Rust dependency, but dropping privileges still seems like it’s not done and could be / should be (?)
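That sounds awfully like a microkernel.
I don’t disagree with that, but the term “microkernel” is kind of vague and can apply to any number of designs
(Related paper - https://www.usenix.org/legacy/event/hotos05/final_papers_backup/hand/hand_html/index.html )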
The classic argument from Torvalds against microkernels is that shared memory algorithms can be easier to write than distributed algorithms
If you take the microkernel idea too far, then every algorithm becomes a distributed algorithm, and that’s indeed very hard. Kernels are stateful, and distributed state is hard
But Torvalds also says “only distribute when you need to” – in the cloud you already need to, for resource usage/multiplexing, so I think the microkernel-style design makes a lot of sense there
On the other hand, people have described Kubernetes as insecure by default, and that is weird!! I suppose it’s the classic tradeoff of “easy to get started, and also creates a lot of problems that you pay other people to solve for you”
Android (and I believe iOS as well) may be a good case study here, as they do rely very heavily on IPC (binder), with everything running in a secure sandbox.
I especially like how Android just reused the UNIX security layer, so that it actually fulfills its original purpose (which is absolutely useless in the desktop use case, with its everything-runs-as-the-same-user model).
Yeah definitely, isolated Android / iOS apps are a big change in the last 20 years, and built on Unix. Though I’d say
you need both isolation and communication, not just isolation. I actually think Android / iOS apps are too isolated, and don’t share enough functionality. You get big download sizes and SDKs this way. I think Android has this weird “Google Play Services” thing that is a sign of a deficiency in sharing
IPC is more natural in the cloud, as mentioned
There is a tension between static linking and dynamic linking and IPC, which is not easy to resolve in general
BTW I think the Unix security model does have a purpose on single-user desktops. On my Debian machine there are a dozen or more users that are not me, so you can run network facing services, and they don’t have access to my files
It is a limited/leaky form of sandboxing, but it’s better than nothing. And yeah Android relies on it a lot
Android’s communication definitely has its warts, though I feel it’s more on the API side that is not ergonomic enough, but I don’t think Google play services would be a sign of a deficiency. E.g. GrapheneOS (a security focused android fork) made Google play services a completely normal android service, running in the same strong sandbox as other apps, and most of the stuff just works - it’s just Google cutting corners/exerting their influence on the platform via Play services. The binary sizes are a bit more complex problem, so I wouldn’t conclude that it’s the symptom of a communication-related issue.
As for the UNIX security model, I can’t help but link this XKCD comic: https://xkcd.com/1200/
My gripe is that this security model is not fine-grained enough without a better userspace that is aware of it, like on Android. Running something as your user made sense when we interacted with computers via terminals, and at most a couple of (top-level) processes ran under your user, which you could easily inspect with jobs. This same model became anemic the moment the UX changed to a desktop-oriented one - the user has no idea how many processes run under their account, and there is no one-to-one mapping to anything. They might or might not have one or more windows attached to them, have an icon or not, or be completely invisible, all with access to the same file system and privileges. The kernel itself is more than capable of solving this (e.g. by creating dynamic users on the fly, like Android does); it’s a UX issue.
If information doesn’t need to be contained, but just code, then in-process sandboxing using WASM, like RLBox, seems like a pretty good choice. Firefox uses RLBox for some… icky dependencies
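Wow, I was not aware that Firefox used the Expat XML parsing library:
https://hacks.mozilla.org/2021/12/webassembly-and-back-again-fine-grained-sandboxing-in-firefox-95/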
I was just looking at it last week, wondering why the CPython stdlib still uses this crufty library. It uses a pretty archaic style of hand-written parsing in C, with a history of vulnerabilities, and also very few dev resources
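https://libexpat.github.io/
https://github.com/libexpat/libexpat/blob/R_2_6_4/expat/Changes - sad warning
That is a big shame … RLBox seems like a good middleground, since I imagine it’s easier to use than out-of-process sandboxing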
The best place to start here is that the process, or really the MMU, is the only credible hardware security boundary we have. I was very excited about systems like Eros before Spectre, and now I believe this entire area to be worthless until we’ve eliminated memory side channels — and we don’t seem to be any closer.
Until then, the minimal boundary is a process and we need to become very comfortable with trusting everything in that boundary, or accepting the cost of IPC.
This topic is one of the reasons we created the Ecstasy programming language, built around capabilities, injection, and (sub-process) container isolation. It’s still in development, but some of the ideas are explained here.
Fundamentally, the layer cake architecture that has accreted over the past five+ decades is (to quote Monty Python) “no basis for a system of government”. It is a leaky abstraction that trades off security for convenience. And Ben Franklin would say: “Those who would give up convenience, to purchase a little temporary security, deserve neither.”
Fundamentally, all software that you do not completely trust should be provided with only the (exact, if possible) capabilities necessary to perform its function. That means that software must not be able to touch the machine itself; software (e.g. C code) that can perform arbitrary instructions cannot be secured. By transitive closure, that unfortunately means that you can’t trust any of the languages out there today, with a few exceptions in theory e.g. Javascript in a browser, or languages running in a WASM sandbox.
If the software building my website does something clever with passwords, any one of those 181 dependencies could decide that it will scan my processes’ memory for passwords, and send any it finds over the internet to a bad person.
The key is to use an instruction set that does not allow reads and writes from memory. Or any direct access to memory, for that matter. And no ability to talk to a network card. Or a disk. That gets difficult, though, when you’re trying to build software that does anything of value, hence the injection of capabilities.
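Here is a rough sketch of capability injection, in Python purely for illustration. Python can’t enforce it (any code can still call open() or import socket), which is exactly why you need a language/runtime designed for this:

```python
# Illustration only (hypothetical): a component is handed narrow capabilities
# instead of ambient authority. Nothing here prevents Python code from calling
# open() itself -- enforcement needs a language/runtime built for capabilities.
class ReadOnlyFile:
    """Capability granting read access to exactly one path."""
    def __init__(self, path: str):
        self._path = path

    def read_text(self) -> str:
        with open(self._path) as f:
            return f.read()

def render_title(config: ReadOnlyFile) -> str:
    # This component was injected with one capability: read this one file.
    # It was never handed a write handle, other paths, or the network.
    return "<title>%s</title>" % config.read_text().strip()
```

The component’s authority is exactly the set of objects it was handed, nothing ambient.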
The actor model, which defines how interacting “things” can communicate with each other. This is a fairly large umbrella term, ranging from languages such as Erlang to various libraries and frameworks; few have security as an explicit aim.
Erlang was definitely one of the inspirations for the model we chose, and for us, security was an explicit aim. He also talks about E in other parts of the article, which has some similar type system constructs to what we designed.
The problem we were setting out to solve was how to easily create secure, manageable, evolvable applications for the serverless cloud, with the goal of being able to reduce datacenter footprints and electricity usage quite dramatically. We’re getting closer by the day.