Title is a bit misleading. Behavior of sshd(8) isn’t really changing per se…it’s just that the session handling logic is now in a dedicated binary.
This change takes the existing session handling logic from the main sshd connection handling binary and moves it into its own binary. The idea is that once the fork+exec of the session binary is done, the code resident in the resulting address space is specific to session handling, not all the initial connection listening/etc. It results in a smaller surface area for the process dedicated to serving the user session over the network.

That is exactly what I imagined when I read the title, so I'm not sure why it's misleading.

And here I thought we were getting sshd down to zero binaries and just building it into systemd with everything else.

Too soon.

But systemd uses many separate binaries, so if an ssh implementation was added to systemd, it would certainly be a separate binary.
How much does this help? If an attacker compromises the first binary, presumably they can do the fork-exec to make new listener processes. Ofc, it is likely harder - so the cost of the attack goes up. Or am I misunderstanding?
Multi-process isn’t free - communication back and forth is a chore, possibly w/ performance and consistency implications. Not to mention higher memory usage for having to duplicate common state.
sshd has been multiprocess for over 20 years. The change here is that the functionality is now split into multiple binaries, so it does fork then execve, rather than just fork. This has a few advantages:
After fork, your address space layout is the same as the parent. After another execve, everything can be rerandomised.
After fork, there’s a danger that you leak secrets from the parent. OpenSSH is careful not to do this but execve clears anything that isn’t explicitly passed and so adds defence in depth.
Reasoning about sshd's control flow with tooling is hard because it uses a bunch of function pointers that are updated for the different operational modes. I believe this refactoring should avoid the need for them, which removes them as targets for control-flow hijacking and improves the ability to validate the compartmentalisation of the binaries with automated tooling.
Less code in each process means fewer gadgets for code reuse attacks.
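To make the fork-then-execve point concrete, here is a toy, self-contained demo; it is not sshd code, and the file name and compile line are illustrative. A fork-only child reports the same address for main as the parent, while a re-executed child gets a freshly randomised layout and inherits only the (here, empty) environment it is explicitly handed.

    /* Toy ASLR demo: a plain fork() child keeps the parent's layout, while a
     * fork()+execve() child gets a fresh one (when built as a PIE on a system
     * with ASLR). Build and run, e.g.:
     *   cc -fPIE -pie -o aslr-demo aslr-demo.c && ./aslr-demo */
    #include <stdio.h>
    #include <string.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        printf("pid %d: main is at %p\n", (int)getpid(), (void *)main);

        if (argc > 1 && strcmp(argv[1], "re-executed") == 0)
            return 0;               /* re-executed child: report and exit */

        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            return 1;
        }
        if (pid == 0) {
            /* fork-only child: same layout, same secrets, same everything */
            printf("pid %d: fork-only child, main is still at %p\n",
                   (int)getpid(), (void *)main);
            /* re-exec ourselves: the new image is re-randomised, and the
             * (empty) environment we pass is all the state it inherits */
            char *const child_argv[] = { argv[0], "re-executed", NULL };
            char *const child_envp[] = { NULL };
            execve(argv[0], child_argv, child_envp);
            _exit(127);             /* exec failed */
        }
        waitpid(pid, NULL, 0);
        return 0;
    }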
there’s a danger that you leak secrets from the parent
The reexec doesn’t help with secrets.
The listening process has the host keys, but doesn’t do anything with them; the connection process needs them to authenticate the server to the client. So sshd now has a mini protocol for passing the host keys (both private and public) to the connection process.
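For flavour, a minimal sketch of the general "hand state to the child over an inherited descriptor" pattern, assuming a simple length-prefixed framing of my own invention; it is not OpenSSH's actual wire format, and short reads/writes and error paths are glossed over. In sshd the read side would run in the session binary after execve, from a descriptor number agreed on in advance.

    /* Sketch: parent passes a blob (e.g. host key material) to the child over a
     * socketpair, using a 4-byte length prefix. Not OpenSSH's real protocol. */
    #include <arpa/inet.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    static int send_blob(int fd, const void *buf, uint32_t len)
    {
        uint32_t n = htonl(len);                 /* length prefix, network order */
        if (write(fd, &n, sizeof(n)) != (ssize_t)sizeof(n)) return -1;
        if (write(fd, buf, len) != (ssize_t)len) return -1;
        return 0;
    }

    static int recv_blob(int fd, void *buf, uint32_t maxlen, uint32_t *outlen)
    {
        uint32_t n;
        if (read(fd, &n, sizeof(n)) != (ssize_t)sizeof(n)) return -1;
        n = ntohl(n);
        if (n > maxlen || read(fd, buf, n) != (ssize_t)n) return -1;
        *outlen = n;
        return 0;
    }

    int main(void)
    {
        int sp[2];
        if (socketpair(AF_UNIX, SOCK_STREAM, 0, sp) < 0) { perror("socketpair"); return 1; }

        pid_t pid = fork();
        if (pid == 0) {
            /* "session" side: in the real design this read would happen after
             * execve of the session binary, from a pre-agreed descriptor */
            close(sp[0]);
            char buf[256]; uint32_t len;
            if (recv_blob(sp[1], buf, sizeof(buf) - 1, &len) == 0) {
                buf[len] = '\0';
                printf("child received %u bytes: %s\n", (unsigned)len, buf);
            }
            _exit(0);
        }
        /* "listener" side: hand over exactly what the child needs, nothing more */
        close(sp[1]);
        const char *fake_key = "placeholder host key material";  /* stand-in, not a real key */
        send_blob(sp[0], fake_key, (uint32_t)strlen(fake_key));
        close(sp[0]);
        waitpid(pid, NULL, 0);
        return 0;
    }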
Uninformed question: I wonder why they didn't just make the listener process a signing oracle, to avoid sending the actual key to the child?

You can actually already do this with HostKeyAgent, as I understand it, where the keys are kept in escrow by the agent, which might even be backed by a secure element of some sort like a TPM or a PIV/CAC smart card, etc.
The parent listener process is single threaded so making it do all the host authentication would have a bad impact on sshd’s capacity. Among other resource management concerns…
The re-exec feature in ssh was committed 20 years ago: http://cvsweb.openbsd.org/cgi-bin/cgi-bin/cvsweb/src/usr.bin/ssh/sshd.c.diff?r1=1.293&r2=1.294&f=h
It's just about the different binary now.

Thanks, I missed that. As I recall, OpenBSD does some relinking to randomise layout within a binary. That would mean the execve of a different binary ensures that anything you manage to leak from the parent is useless when searching for gadgets in the child. In contrast, with just fork and execve of the same binary, you just need to find the library base address (which is typically fairly easy).

Thank you - this is quite clear and helpful!
Multi-process is harder to do, but it’s the gold standard as far as handling untrusted inputs. An OS process is a real thing supported by hardware at runtime (MMU), not an abstraction created by a programming language.
Defense in depth, etc.
Chrome had a multi-process architecture upon launch in ~2008 - https://scottmccloud.com/googlechrome/ (Chrome used to be very good; let’s leave aside the fact that it’s nearly malware at this point)
Mozilla completed multi-process support in 2018 apparently - https://wiki.mozilla.org/Electrolysis - I think this was a huge engineering effort that took at least 5 years, but it was deemed worth it.
I believe that for 5-10 years Firefox was demonstrably more vulnerable because it lacked this, but I don't know all the details. Someone here might know more, but browser security was tested empirically, and multi-process is worth it.
Electrolysis functionality hosts, renders, or executes web related content in background child processes which communicate with the “parent” Firefox browser via various ipdl protocols. The two major advantages of this model are security and performance. Security improvements are accomplished through security sandboxing, performance improvements are born out of the fact that multiple processes better leverage available client computing power.
Interestingly they cite performance as an advantage too.
Also see quotes from The Art of Unix Programming and DJB in the middle of this comment - https://lobste.rs/s/uihyvs/backdoor_upstream_xz_liblzma_leading_ssh#c_wgmyzf – multi-process architecture is a classic but apparently somewhat forgotten Unix technique.
An OS process is a real thing supported by hardware at runtime (MMU), not an abstraction created by a programming language.
A faraday cage is a real thing supported by physics, not an abstraction created by an architecture.
The principal problem is simply that popular programming languages do not feature pervasive capability-safety (whereas the unix programming language sort of does). Sshd, firefox, chrome, and qmail are big blobs of crusty c and c++; bully for them, but that doesn’t mean programming language features are useless.
You still want defense in depth - https://en.wikipedia.org/wiki/Defense_in_depth_(computing)
https://csrc.nist.gov/glossary/term/defense_in_depth
The problem is that capability logic is still logic, and logic can have bugs.
For example, HTML injection, SQL injection, and shell injection - https://lobste.rs/s/hru0ib/how_lose_control_your_shell#c_jamktx
As far as I remember those are as common, or more common, in real network services than memory safety bugs.
The problem is that the thing that lets you do what the application needs to do is the same thing that lets you write bugs. There’s no silver bullet.
Another way to put it is that sys admins, SREs, and security engineers get paid big bucks for a reason. Process isolation is a common and useful tool in those domains. They have a different view of software that is just as valuable as people writing code.
They deal with all the stuff that the programmers didn’t think of, or don’t have the tools/language to express – they deal with reality.
I don’t see how process isolation helps with html/sql/shell injection where object capabilities would fail. I’m not sure why you call out memory safety; it is a necessary prerequisite to capability safety, but the two are not the same (for example, rust, java, and javascript are popular languages with memory safety but ambient authority). The sole issue is whether the software isolation is more likely to have bugs than hardware isolation.
Software isolation does indeed tend to have more bugs than hardware isolation. But:
Hardware isolation is riddled with bugs too
Alternate approaches to software isolation can improve correctness while reducing the required engineering effort; this is an area I am researching
Outside of contexts where you are literally running untrusted code (browser/js, liblzma), which are the exception rather than the rule, bugs are very difficult to exploit (I’ve not heard of a major vulnerability caused by an erroneously removed automatic bounds check, for instance)
Most importantly: communication between hardware isolation domains is annoying, error-prone, and expensive. Consequently, there is a natural pressure to consolidate. Communication between software isolation domains is as easy as communication within a software isolation domain—literally a function call. This creates a natural pressure to separate and modularise, for the same reason you modularise your code in languages with ambient authority. This more than pays for any bugs in the isolation mechanism itself, and the code will be simpler and more correct.
It limits the blast radius from attacks, and also makes targets economically more expensive for the attacker, which discourages them.
I recommend looking at some real exploit chains from pwn2own - https://hn.algolia.com/?q=pwn2own
e.g. https://github.com/saelo/pwn2own2018
They are meaningfully inhibited by process isolation.
Process isolation is just common sense good engineering.
When you send a space shuttle to the moon, you don't assume all your calculations from the last 10 years were correct. You will have made mistakes.
There were zillions of them, and you have thousands of parts from different suppliers, made at different times (much like software).
So you have multiple redundant checks, redundant systems, simple backup plans, etc.
Same with building a bridge or anything like that – you need different people with different perspectives to make it work.

You are continuing to blindly hammer the 'defence in depth' meme and haven't engaged with any of my points. I don't think there's anything meaningful I can say.
It might help limit, for example, where xz might be linked.
Multi-process isn’t free - communication back and forth is a chore, possibly w/ performance and consistency implications. Not to mention higher memory usage for having to duplicate common state.
OpenBSD does a lot of simple multiprocess things where they seem to have licked these challenges.
OpenBSD does a lot of simple multiprocess things where they seem to have licked these challenges.
I think this bit is worth emphasising. OpenBSD uses process separation for privilege separation extensively, so they have the design pattern very well worked out. (I once added a feature to relayd, and was immediately hit with the need to deal with this. It wasn’t so hard at all!)
It might help limit, for example, where xz might be linked.
Thinking out loud…
OpenSSH-portable was linked to libsystemd for readiness notification. If this change had been implemented before, then only the listener process would have linked to libsystemd, not the session process. So the session process would not have a transitive dependency on libxz, so libxz would not be able to patch in an authentication backdoor.
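The notification itself is just a small datagram sent to a unix socket named in an environment variable, so it doesn't strictly require libsystemd at all. A hedged sketch of that protocol, with minimal error handling; only the listener process would ever need to do this:

    /* Sketch of systemd-style readiness notification without libsystemd:
     * send the datagram "READY=1" to the AF_UNIX socket named by $NOTIFY_SOCKET. */
    #include <stddef.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <unistd.h>

    static void notify_ready(void)
    {
        const char *path = getenv("NOTIFY_SOCKET");
        if (path == NULL || path[0] == '\0')
            return;                      /* not running under a notify-aware supervisor */

        struct sockaddr_un addr;
        memset(&addr, 0, sizeof(addr));
        addr.sun_family = AF_UNIX;
        if (strlen(path) >= sizeof(addr.sun_path))
            return;
        memcpy(addr.sun_path, path, strlen(path));
        if (addr.sun_path[0] == '@')     /* leading '@' = abstract namespace (Linux) */
            addr.sun_path[0] = '\0';
        socklen_t addrlen =
            (socklen_t)(offsetof(struct sockaddr_un, sun_path) + strlen(path));

        int fd = socket(AF_UNIX, SOCK_DGRAM | SOCK_CLOEXEC, 0);
        if (fd < 0)
            return;
        (void)sendto(fd, "READY=1", strlen("READY=1"), 0,
                     (struct sockaddr *)&addr, addrlen);
        close(fd);
    }

    int main(void)
    {
        notify_ready();                  /* in sshd this would run once the listener is up */
        return 0;
    }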
So the session process would not have a transitive dependency on libxz, so libxz would not be able to patch in an authentication backdoor.
The parent process has complete control of the child process, in many ways. Simplest example: fork(), setenv("LD_PRELOAD", "...", 1), exec("sshd-session").
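A concrete version of that example, with a hypothetical preload path and session binary path standing in for the elided ones:

    /* A compromised parent can trivially subvert any child it spawns, e.g. by
     * injecting a library via the dynamic linker. Paths are hypothetical. */
    #include <stdlib.h>
    #include <sys/types.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t pid = fork();
        if (pid == 0) {
            /* child: poison its own environment before replacing the image */
            setenv("LD_PRELOAD", "/tmp/attacker.so", 1);        /* hypothetical library */
            char *const argv[] = { "sshd-session", NULL };
            execv("/usr/libexec/sshd-session", argv);           /* path is illustrative */
            _exit(127);
        }
        return 0;
    }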
Multi-process isn’t free - communication back and forth is a chore, possibly w/ performance and consistency implications. Not to mention higher memory usage for having to duplicate common state.
See my comment above. This doesn't change the multi-process model whatsoever, and afaik there's no communication between the session handling part of sshd and the listener part, hence the ability to split out the logic into a separately linked binary.
Multi-process isn’t free - communication back and forth is a chore, possibly w/ performance and consistency implications
Presumably in this case the socket will be passed to the session handling child once, and then the child will use it directly to speak TCP. No IPC required after the initial handshake.
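A sketch of that hand-off, under a few assumptions (the loopback port, binary name, and fd choices are all illustrative): the listener accepts, forks, and execs the session handler with the connected socket as its stdin/stdout, after which the listener has nothing further to do with that connection.

    /* Sketch: accept in the listener, then fork+exec a session handler that
     * inherits the connected socket as stdin/stdout. No further IPC with the
     * listener is needed for that connection. Paths/ports are illustrative. */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/types.h>
    #include <unistd.h>

    int main(void)
    {
        signal(SIGCHLD, SIG_IGN);                 /* let the kernel reap children */

        int lfd = socket(AF_INET, SOCK_STREAM, 0);
        int one = 1;
        setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));

        struct sockaddr_in sin;
        memset(&sin, 0, sizeof(sin));
        sin.sin_family = AF_INET;
        sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
        sin.sin_port = htons(2222);               /* illustrative port */
        if (bind(lfd, (struct sockaddr *)&sin, sizeof(sin)) < 0 || listen(lfd, 16) < 0) {
            perror("bind/listen");
            return 1;
        }

        for (;;) {
            int cfd = accept(lfd, NULL, NULL);
            if (cfd < 0)
                continue;
            if (fork() == 0) {
                /* child: the connection becomes stdin/stdout of the new image;
                 * the listening socket itself is not passed along */
                close(lfd);
                dup2(cfd, STDIN_FILENO);
                dup2(cfd, STDOUT_FILENO);
                close(cfd);
                char *const argv[] = { "session-handler", NULL };
                execv("/usr/libexec/session-handler", argv);   /* hypothetical binary */
                _exit(127);
            }
            /* parent: done with this connection entirely */
            close(cfd);
        }
    }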