It’s not hard to see why someone personally invested in C++ would feel under attack by guidance from many prominent organisations not to use C++ because of some of its inherent challenges with respect to security.
That said, to borrow from the author: There are only two kinds of languages: the ones people complain about and the ones nobody uses.
The existence of this article suggests people are in fact at least using the other languages!
Rust thought seriously about safety; then did something about it.
Meanwhile, if we are lucky, the stuff Bjarne is talking about may be available to us sometime in the next decade. Deliberately ignoring and redirecting the concerns like this is in poor taste. C++ did and continues to do its job. But it’s stuck with a heavy backwards compatibility problem that may be impossible to overcome. Waiting for C++ to give me the kinds of memory safety guarantees in my own code that I can already get with Rust seems silly when Rust is right there ready for me to use.
He’s obviously being defensive, but he has a good point about considering other types of safety than just memory. For example, languages without something like RAII don’t have a good way to enforce the cleanup of resources in a timely way — you have to remember to use optional constructs like “defer” to call cleanup code, otherwise the cleanup won’t happen until the GC decides to finalize the owning object, or maybe never. The arbitrary nature of finalizers has been a pain point of Java code for as long as I can remember, when working with any resource that isn’t Pure Java(tm).
Part of the problem though is that:
a) That is a deflection from the entire point of the NSA thing that Stroustrup is ostensibly replying to, which is that almost all serious safety problems are memory safety problems of some kind, which C++ can not seriously mitigate
b) The ‘other forms of safety’ that Stroustrup talks about in the linked letter, and positions as being better without actually explicitly arguing for it (what he calls ‘type-and-resource safety’), are also things that C++ just can fundamentally never do - the linked documents are about as serious an approach to getting the described properties for C++ as the smart pointer work was to getting memory safety for C++.
Like, C++ doesn’t have memory safety (and also some related things like ‘iterator invalidation safety’) and fundamentally cannot get it without massively breaking changes, and (specifically) the lack of temporal memory safety and aliasing safety means that their approaches to ‘type-and-resource safety’ will fundamentally do essentially nothing.
This is part of a long pattern of Stroustrup trying to stop any possibility of progress on safety by (amongst other things) using his name, reputation, and any position he is able to get to push for the diversion of effort and resources into big and difficult projects of work that will look like progress but fundamentally cannot ever achieve anything good.
I would argue that memory safety is not a problem of the C++ language, it’s a problem of implementations. Real Soon Now[1], my team is going to be open sourcing a clean slate RTOS targeting a CHERI RISC-V core. The hardware enforces memory safety, the key RTOS components are privilege separated and the platform has been an absolute joy to develop.
Languages like Rust have a stronger guarantee: they check a lot of properties at compile time, which avoids the bugs, rather than simply preventing them from being exploitable. This comes with the caveat that the only data structure that you can express is a tree without dipping into unsafe (either explicitly or via the standard library) and then you need to reason about all of the ways in which those unsafe behaviours interact, without any help from the type system. The Oakland paper from a while back that found a couple of hundred CVEs in Rust crates by looking for three idioms where people misuse things that hide unsafe behind ‘safe’ interfaces suggests that people are not good at this.
The other problem that we’ve seen with Rust is that the compiler trusts the type system. This is fine if all of the code is within the Rust abstract machine, but is a nightmare for systems that interact with an adversary. For example, we saw some code that read a Rust enumeration from an MMIO register and checked that it was in the expected range. The compiler knew that enumerations were type safe so elided the check, introducing a security hole. The correct fix for this is to move the check into the unsafe block that reads from the MMIO register, but that’s the kind of small detail that’s likely to get overlooked (and, in code review, someone may well say ‘this check isn’t doing anything unsafe, can you move it out of the unsafe block?’ because minimising the amount of unsafe code is normally good practice). We need to check a bunch of things at API boundaries to ensure that the caller isn’t doing anything malicious and, in Rust, all of those things would be things that the compiler would want to assume can never happen.
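To make the failure mode concrete, here is a minimal Rust sketch of roughly what is being described (the enum, its discriminants, and the register pointer are invented for illustration, not the actual code in question):

#[derive(Clone, Copy)]
#[repr(u8)]
enum Mode {
    Idle = 0,
    Run = 1,
    Fault = 2,
}

// Anti-pattern: the register is read *as* a Mode, so by the time the range
// check runs the compiler may assume the value is already a valid Mode and
// is entitled to delete the check.
fn read_mode_unsound(reg: *const u8) -> Option<Mode> {
    let m: Mode = unsafe { core::ptr::read_volatile(reg as *const Mode) };
    if (m as u8) > 2 {
        return None; // likely optimised away: every valid Mode is <= 2
    }
    Some(m)
}

// One way to implement the fix described above: fetch the raw bits and do the
// range check in the same unsafe region, before any value of type Mode exists,
// so the type system has nothing to "help" with.
fn read_mode_checked(reg: *const u8) -> Option<Mode> {
    unsafe {
        let raw: u8 = core::ptr::read_volatile(reg);
        match raw {
            0 => Some(Mode::Idle),
            1 => Some(Mode::Run),
            2 => Some(Mode::Fault),
            _ => None, // reserved or glitched values are rejected, not assumed away
        }
    }
}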
We will probably rewrite a chunk of the code in Rust at some point (once the CHERI support in the Rust compiler is more mature) because there are some nice properties of the language, but we have no illusions that a naive Rust port will be secure.
[1] I think we have all of the approvals sorted now…
The other problem that we’ve seen with Rust is that the compiler trusts the type system. This is fine if all of the code is within the Rust abstract machine, but is a nightmare for systems that interact with an adversary. For example, we saw some code that read a Rust enumeration from an MMIO register and checked that it was in the expected range. The compiler knew that enumerations were type safe so elided the check, introducing a security hole.
Heh, Mickens was right – you can’t just place a LISP book on top of an x86 chip and hope that the hardware learns about lambda calculus (or, in this case, type theory…) by osmosis :-).
This is one of the things I also struggled with back when I thought I knew enough Rust to write a small OS kernel and I was a) definitely wrong and b) somewhat disappointed. I ran into basically the same problem – reading an enum from a memory-mapped config register. As usual, you don’t just read it, because some ranges are valid, some are reserved, some are outright invalid, and of course they’re not all consecutive ranges, so “reading” is really just the happy ending of the range checks you do after a memory fetch.
At the time, I figured the idiomatic way to do it would be via the TryFrom trait, safely mapping config register values to my enum data type. The unsafe code block would read a word and not know/care what it means, then I’d try to build the enum separately from that word, which was slower and more boilerplatey than I’d wished but would prevent the compiler from “helping” me along. That looked cool both on paper and on screen, until I tried to support later revisions of the same hardware. Teaching it to deal with different hardware revisions, where valid and reserved ranges differ, turned out to be really stringy and more bug-prone than I would’ve wanted.
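For what it’s worth, the revision issue is what tends to push people off the trait: TryFrom::try_from has no parameter for extra context, so once the valid and reserved ranges depend on the hardware revision you end up with either one wrapper type per revision or a plain checked-decode function along these lines (names invented):

#[derive(Clone, Copy)]
#[repr(u16)]
enum CfgField {
    Off = 0x0,
    Low = 0x1,
    High = 0x3,
}

#[derive(Clone, Copy, PartialEq)]
enum HwRev { A, B }

// A plain function can take the revision as an argument, which TryFrom cannot;
// the cost is that callers have to remember to use it rather than a blanket conversion.
fn decode_cfg(raw: u16, rev: HwRev) -> Option<CfgField> {
    match raw {
        0x0 => Some(CfgField::Off),
        0x1 => Some(CfgField::Low),
        0x3 if rev == HwRev::B => Some(CfgField::High), // reserved encoding on rev A
        _ => None,
    }
}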
My first instinct had been to read and range-check the values in the unsafe block, then build the enum From that, which was at least slightly faster and more condensed (since it was guaranteed to succeed) – or skip enumerating values separately altogether. However, it seemed that was just safety theater, as the conversion was guaranteed to succeed only insofar as the unsafe check was right, thus reducing the whole affair to C with a very verbose typecast syntax.
Frankly, I’m still not sure what the right answer would be, or rather, I haven’t found a satisfactory one yet :(.
It’s hard to say without looking at a specific example, but a common trap with Rust enums is that often you don’t want an enum, you want an integer with a bunch of constants:
struct Flag(u8);
impl Flag {
    const Foo: Flag = Flag(0);
    const Bar: Flag = Flag(1);
}
I may be misunderstanding some details about how this works, but in the context of interfacing with the underlying hardware I think I generally want both: a way to represent related values (so a struct Flag(u8) with a bunch of constant values) and an enumerated set of valid flag values, so that I can encode range checks in TryFrom/TryInto. Otherwise, if I do this:
let flags = cfg.get_flags(dev.id)?
where
fn get_flags(&self, id: DeviceId) -> Result<Flag>
I will, sooner or later, write get_flags in terms of reading a byte from a corrupted flash device and I’ll wind up trying to write Flag(42) to a config register that only takes Flag::Foo or Flag::Bar.
Having both means that my config read/write chain looks something like this: I get a byte from storage, I build my enum Flag instance based on it. If that worked, I now know I have a valid flag setting that I can pass around, modulo TryFrom<u8> implementation bugs. To write it, I hand it over to a function which tl;dr will turn my flags into an u8 and yell it on the bus. If that function worked, I know it passed a valid flag, modulo TryInto<u8> implementation bugs.
Otherwise I need to hope that my read_config function checked the byte to make sure it’s a valid flag, and that my set_config function checked the flag I got before bus_writeing it, and I do not want to be that optimistic :(.
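Concretely, the read/write chain being described might look roughly like this (names invented; a TryInto<u8> implementation comes for free from the From impl via the standard blanket impls):

use core::convert::TryFrom;

#[derive(Clone, Copy)]
#[repr(u8)]
enum Flag {
    Foo = 0,
    Bar = 1,
}

struct InvalidFlag(u8);

impl TryFrom<u8> for Flag {
    type Error = InvalidFlag;
    fn try_from(raw: u8) -> Result<Self, Self::Error> {
        match raw {
            0 => Ok(Flag::Foo),
            1 => Ok(Flag::Bar),
            other => Err(InvalidFlag(other)), // a corrupted byte from flash stops here
        }
    }
}

impl From<Flag> for u8 {
    fn from(f: Flag) -> u8 {
        f as u8 // writing out can only ever produce a valid encoding
    }
}

// Hypothetical read/write chain: validation lives at the conversion boundary,
// so read_config and set_config don't each have to remember to re-check.
fn read_config(raw_from_flash: u8) -> Result<Flag, InvalidFlag> {
    Flag::try_from(raw_from_flash)
}

fn set_config(flag: Flag) -> u8 {
    u8::from(flag) // this is the value that gets handed to the bus write
}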
I would argue that memory safety is not a problem of the C++ language, it’s a problem of implementations. Real Soon Now[1], my team is going to be open sourcing a clean slate RTOS targeting a CHERI RISC-V core. The hardware enforces memory safety, the key RTOS components are privilege separated and the platform has been an absolute joy to develop.
That’s cool. I’m quite excited for CHERI. My question is this - when you do run into a memory safety issue with CHERI what is the dev experience? In Rust you get a nice compiler error, which feels much “cheaper” to handle. With CHERI it feels like it would be a lot more expensive to have these bugs show up so late - although wayyyyyyy better than having them show up and be exploitable.
The Oakland paper from a while back that found a couple of hundred CVEs in Rust crates by looking for three idioms where people misuse things that hide unsafe behind ‘safe’ interfaces suggests that people are not good at this.
For sure. Rudra is awesome. Unsafe is hard. Thankfully, the tooling around unsafe for Rust is getting pretty insane - miri, rudra, fuzzing, etc. I guess it’s probably worth noting that the paper is actually very positive about Rust’s safety.
My opinion, and what I have observed, is that while there will be unsafety in rust it’s quite hard to exploit it. The bug density tends to be very low, low enough that chaining them together can be tough.
This is fine if all of the code is within the Rust abstract machine, but is a nightmare for systems that interact with an adversary. For example, we saw some code that read a Rust enumeration from an MMIO register and checked that it was in the expected range. The compiler knew that enumerations were type safe so elided the check, introducing a security hole.
I don’t understand this. What are you referring to with regards to “an adversary”. Did an attacker already have full code execution and then leveraged a lack of check elsewhere? Otherwise if the compiler eliminated the check it shouldn’t be possible to reach that without unsafe elsewhere. Or did you do something like cast the enum from a value without checking? I don’t really understand.
We need to check a bunch of things at API boundaries to ensure that the caller isn’t doing anything malicious and, in Rust, all of those things would be things that the compiler would want to assume can never happen.
That’s cool. I’m quite excited for CHERI. My question is this - when you do run into a memory safety issue with CHERI what is the dev experience? In Rust you get a nice compiler error, which feels much “cheaper” to handle. With CHERI it feels like it would be a lot more expensive to have these bugs show up so late - although wayyyyyyy better than having them show up and be exploitable.
It’s all run-time trapping. This is, I agree, much worse than catching things at compile time. On the other hand, running existing code is a better developer experience than asking people to rewrite it. If you are writing new code, please use a memory-safe (and, ideally, type-safe) language.
I don’t understand this. What are you referring to with regards to “an adversary”.
One of the problems with Rust is that all non-Rust code is intrinsically unsafe. For example, in our model, we can pull in things like the FreeRTOS network stack, mbedTLS, and the Microvium JavaScript VM without having to rewrite them. In Rust, any call to these is unsafe. If an attacker compromises them, then it’s game over for Rust (this is no different from C/C++, so Rust at least gives you attack-surface reduction).
If a Rust component is providing a service to untrusted components then it can’t trust any of its arguments. You (the programmer) still need to explicitly check everything.
Did an attacker already have full code execution and then leveraged a lack of check elsewhere?
This case didn’t have an active adversary in software. It had an attacker who could cause power glitches that caused a memory-mapped device to return an invalid value from a memory-mapped register. This is a fairly common threat model for embedded devices. If the out-of-range value is then used to index something else, you can leverage it to gain memory corruption and possibly hijack control flow and then you can use other paths to get arbitrary code execution.
I’m just not understanding who this attacker is.
Everyone else who provides any code that ends up in your program, including authors of libraries that you use. Supply chain vulnerabilities are increasingly important.
On the other hand, running existing code is a better developer experience than asking people to rewrite it.
For sure. Mitigations like CHERI are critical for that reason - we can’t just say “well you should have used Rust”, we need practical ways to make all code safer. 100%.
If an attacker compromises them,
So basically the attacker has full code execution over the process. Yeah, unless you have a virtual machine (or hardware support) I don’t think that’s a problem you can solve in Rust or any other language. At that point the full address space is open to the attacker.
It had an attacker who could cause power glitches that caused a memory-mapped device to return an invalid value from a memory-mapped register.
This sounds like rowhammer, which I can’t imagine any language ever being resistant to. That has to happen at a hardware level - I think that’s your point? Because even if the compiler had inserted the check, if the attacker here can flip arbitrary bits I don’t think it matters.
Supply chain vulnerabilities are increasingly important.
For sure, and I think perhaps we’re on the same page here - any language without a virtual machine / hardware integration is going to suffer from these problems.
So basically the attacker has full code execution over the process
That’s the attacker’s goal. Initially, the attacker has the ability to corrupt some data. They may have the ability to execute arbitrary code in some sandboxed environment. They are trying to get arbitrary code outside of the sandbox.
This sounds like rowhammer, which I can’t imagine any language ever being resistant to. That has to happen at a hardware level - I think that’s your point? Because even if the compiler had inserted the check, if the attacker here can flip arbitrary bits I don’t think it matters.
You get equivalent issues from converting an integer from C code into an enumeration where an attacker is able to do something like a one-byte overwrite and corrupt the value.
Typically, attacks start with something small, which can be a single byte corruption. They then chain together exploits until they have full arbitrary code execution. The problem is when the Rust compiler elides some of the checks that someone has explicitly inserted defensively to protect against this kind of thing. Note that this isn’t unique to Rust. C/C++ also has this problem to a degree (for example, eliding NULL checks if you accidentally dereference the pointer on both paths) but it’s worse in Rust because there’s more in type-safe Rust that the language abstract machine guarantees than in C.
You get equivalent issues from converting an integer from C code into an enumeration where an attacker is able to do something like a one-byte overwrite and corrupt the value.
I’m confused, you mean copying the int into a rust enum too narrow for it?
The problem is when the Rust compiler elides some of the checks that someone has explicitly inserted defensively to protect against this kind of thing.
Are you referring to checks at the boundary, or checks far behind it?
I’m confused, you mean copying the int into a rust enum too narrow for it?
No, the flow is a C function returning an enumeration that you coerce into a Rust enumeration that holds the same values. An attacker is able to trigger a one-byte overwrite in the C code that means that the value returned is not actually a valid value in that enumeration range. The Rust programmer doesn’t trust the C code and so inserts an explicit check that the enumeration is a valid value. The Rust compiler knows that enumerations are type safe and so elides the check. Now you have a way for an attacker with a one-byte overwrite in C code to start a control-flow hijacking attack on the Rust code.
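Spelled out with an invented C function, the difference is whether the value crosses the boundary as a Rust enum (at which point the defensive check is dead code as far as the compiler is concerned) or as a plain integer that gets validated before the enum is ever constructed:

extern "C" {
    fn c_get_state() -> u32; // nominally returns 0..=2, but we don't trust it
}

#[derive(Clone, Copy)]
#[repr(u32)]
enum State {
    Stopped = 0,
    Starting = 1,
    Running = 2,
}

fn get_state() -> Option<State> {
    // Keep the value as a plain u32 across the trust boundary...
    let raw = unsafe { c_get_state() };
    // ...and only construct a State once it is known to be in range; because
    // raw is just an integer, the compiler has no licence to remove this check.
    match raw {
        0 => Some(State::Stopped),
        1 => Some(State::Starting),
        2 => Some(State::Running),
        _ => None,
    }
}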
Are you referring to checks at the boundary, or checks far behind it?
Checks in the trusted (Rust) code, outside of unsafe blocks.
The correct fix for this is to move the check into the unsafe block that reads from the MMIO register, but that’s the kind of small detail that’s likely to get overlooked (and, in code review, someone may well say ‘this check isn’t doing anything unsafe, can you move it out of the unsafe block?’ because minimising the amount of unsafe code is normally good practice).
Nit: moving code out of an unsafe block will never affect its semantics - the only thing it might do is stop the code from compiling.
Unsafe is a magic keyword that’s required when calling certain functions, dereferencing raw pointers, and accessing mutable statics (there might be a few other rare ones I’m forgetting). Beyond allowing those three operations to compile, it doesn’t affect semantics; if a statement/expression compiles without an unsafe block (i.e. it doesn’t use any of those three operations), wrapping it in an unsafe block will not change your program.
The correct fix here is to check the value is within range before casting it to the enum (incidentally, an operation that requires an unsafe block).
All that being said, your broader point is true: Rust’s stricter rules mean that it may well be easier to write undefined behavior in unsafe Rust than C.
The Oakland paper from a while back that found a couple of hundred CVEs in Rust crates by looking for three idioms where people misuse things that hide unsafe behind ‘safe’ interfaces suggests that people are not good at this.
For example, we saw some code that read a Rust enumeration from an MMIO register and checked that it was in the expected range. The compiler knew that enumerations were type safe so elided the check, introducing a security hole.
Does the compiler at least emit a warning like “this comparison is always true” that could signal that one’s doing this incorrectly?
Languages like Rust have a stronger guarantee: they check a lot of properties at compile time, which avoids the bugs, rather than simply preventing them from being exploitable. This comes with the caveat that the only data structure that you can express is a tree without dipping into unsafe
(Tracing) gc has no trouble with actual graphs, and still prevents all those nasty bugs by construction.
The other problem that we’ve seen with Rust is that the compiler trusts the type system
Yes—I am still waiting for capability-safety to be table stakes. Basically no one should get the ‘unsafe’ god-capability.
(Tracing) gc has no trouble with actual graphs, and still prevents all those nasty bugs by construction.
But it does have problems with tail latency and worst-case memory overhead, which makes it unfeasible in the kind of scenarios where you should consider C++. If neither of those are constraints for your problem domain, C++ is absolutely the wrong tool for the job.
Yes—I am still waiting for capability-safety to be table stakes. Basically no one should get the ‘unsafe’ god-capability.
Unfortunately, in Rust, core standard-library things like Rc depend on unsafe and so everything would need to hold the capability to perform unsafe to be able to pass it down to those crates, unless you have a compile-time capability model at the module level.
Unsafe can be switched off at the module level and the module is indeed also the boundary of unsafe in Rust.
A mistake with unsafe may be triggered from the outside, but a correct unsafe implementation is well-encapsulated. That very effectively reduces the scope of review.
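(For anyone who hasn’t seen the switch being referred to: it’s the unsafe_code lint. Crate-wide it’s #![forbid(unsafe_code)] at the top of lib.rs; scoped to a module it looks like the sketch below, with an invented module name.)

// Any use of `unsafe` inside this module is a hard compile error, so the
// unavoidable unsafe stays confined to whichever module is allowed to have it.
#[forbid(unsafe_code)]
mod request_parsing {
    pub fn first_byte(input: &[u8]) -> Option<u8> {
        // unsafe { *input.as_ptr() }   // <- would not compile under forbid(unsafe_code)
        input.first().copied()
    }
}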
I basically agree with you! I haven’t been aware of these tendencies of his, but I’m not surprised.
But I think the types of safety provided by RAII are valuable too. My day-job these days is mostly coding in Go and I miss RAII a lot. Just yesterday I had to debug a deadlock produced by a high-level resource issue (failure to return an object to a pool) that wouldn’t have occurred in C++ because I would have used some RAII mechanism to return it.
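A rough Rust sketch of the kind of RAII guard that makes the pool return automatic (the Pool/PoolGuard types here are invented; the same shape works with a C++ destructor):

use std::sync::{Arc, Mutex};

struct Pool<T> {
    items: Arc<Mutex<Vec<T>>>,
}

struct PoolGuard<T> {
    item: Option<T>,
    home: Arc<Mutex<Vec<T>>>,
}

impl<T> Pool<T> {
    fn take(&self) -> Option<PoolGuard<T>> {
        let item = self.items.lock().unwrap().pop()?;
        Some(PoolGuard { item: Some(item), home: Arc::clone(&self.items) })
    }
}

impl<T> Drop for PoolGuard<T> {
    fn drop(&mut self) {
        // Runs when the guard goes out of scope, on every exit path, so the
        // object cannot be kept out of the pool by a forgotten manual return.
        if let Some(item) = self.item.take() {
            self.home.lock().unwrap().push(item);
        }
    }
}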
a) That is a deflection from the entire point of the NSA thing that Stroustrup is ostensibly replying to, which is that almost all serious safety problems are memory safety problems of some kind, which C++ can not seriously mitigate
Thank you. I’m soooo sick of seeing “but Rust doesn’t solve all forms of safety so is it even safe?”. “Rust is safe” means “Rust is memory safe”. That’s a big deal, memory safety vulnerabilities are highly prevalent and absolutely worst-case.
That is a deflection from the entire point of the NSA thing that Stroustrup is ostensibly replying to, which is that almost all serious safety problems are memory safety problems of some kind,
That would have to be heavily qualified to a domain – otherwise I’d say it’s just plain untrue.
String injection like HTML / SQL / Shell are arguably worse problems in the wide spectrum of the computing ecosystem, in addition to plain mistakes like logic errors and misconfiguration.
As far as I can tell, none of these relate to memory safety:
https://www.checkpoint.com/cyber-hub/cloud-security/what-is-application-security-appsec/owasp-top-10-vulnerabilities/
It seems pretty clear that certain groups of programmers are never going to move to “X-safe” languages (for various values of X) unless they are forced to by their employers or customers.
In a vast amount of programming situations the users who are harmed by lack of safety are not the people paying the programmers, and/or have no more-secure options available to them and cannot vote with their wallets. It’s “accept this buggy software that might wreck your life one day without recourse” vs “don’t see your nephew’s photos”.
Additionally, most employers are not going to do this unless they are forced to by regulation, or lawsuits from big customers.
It seems to me we realistically have two options to drag the world kicking and screaming into more secure software: regulation from big brother, or allow big customers to sue big software providers when things go bad, and force the industry to find and move to better development systems and processes.
Regulation will be a nightmare. We can’t have the government telling us who is allowed to program and when, and what tools they are allowed to use. Those decisions as to what is allowable will all be wrong, and hijacked by corruption, grift, and ass-covering, and I don’t think it’s really hyperbolic to say it would have a good chance of destroying the local industry and probably the entire economy of the first state to seriously attempt it.
I can’t see any way for this to reach a solution other than making companies liable for their products again, even if those products are bits.
I don’t want to dunk too much on Bjarne here (he’s obviously a very smart guy), but in 2 pages he (or the person typing the document) manages to mangle the ISO date in the header (“2022-12-6”), misuse possessive its, and not consistently format what I assume are C++ research notes (“P2739R0”, “P2410r0”). It’s not the best advertisement for the quality of C++…
Heh, Mickens was right – you can’t just place a LISP book on top of an x86 chip and hope that the hardware learns about lambda calculus (or, in this case, type theory…) by osmosis :-).
Haha, great quote … the way I phrase it is: “When models and reality collide, reality wins”
Type systems are models, not reality … I see a lot of solipsistic views of software that mistake the map for the territory
Previous comment on “the world”: https://lobste.rs/s/9rrxbh/on_types#c_qanywm
Not coincidentally, it also links to an article about interfacing Rust with hardware
Typically, attacks start with something small, which can be a single byte corruption. They then chain together exploits until they have full arbitrary code execution. The problem is when the Rust compiler elides some of the checks that someone has explicitly inserted defensively to protect against this kind of thing. Note that this isn’t unique to Rust. C/C++ also has this problem to a degree (for example, eliding NULL checks if you accidentally dereference the pointer on both paths) but it’s worse in Rust because there’s more in type-safe Rust that the language abstract machine guarantees than in C.
I don’t really agree with this premise but that’s fine.
Can you share a link to that paper?
I misremembered, it was at SOSP, not Oakland (it was covered here). The title was:
Rudra: Finding Memory Safety Bugs in Rust at the Ecosystem Scale
Thanks! That one I do remember 😄
Thank you. I’m soooo sick of seeing “but Rust doesn’t solve all forms of safety so is it even safe?”. “Rust is safe” means “Rust is memory safe”. That’s a big deal, memory safety vulnerabilities are highly prevalent and absolutely worst-case.
The whole post by him is really ignorant.
A related doc: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2759r0.pdf