We see no realistic path for an evolution of C++ into a language with rigorous memory safety guarantees that include temporal safety.
I’m really glad they’re calling this out. There is a lot of confusing messaging from WG21 members about the idea of adding different profiles to the language with different safety goals. But none of that has even tried to address the elephant in the room: temporal safety is a hard problem, and we don’t know how to solve it for C++.
The ideas we have are roughly:

You could add explicit lifetimes, like Rust has, but that would be a severe change to the language.

You could add GC, which would cost a lot in terms of performance and interoperability.

You could adopt CHERI (along with one of the extensions for temporal safety, e.g. Cornucopia), which would mean losing compatibility with all your production hardware.
In any case, there is something really important you’re leaving behind, which is likely to encounter a lot of resistance from actual C++ users.
Temporal safety is what I’m honestly most excited about with Rust. GC is efficient enough for many many use cases, but I’m not sure that shared-nothing concurrency is.
You could adopt CHERI […] which would mean losing compatibility with all your production hardware.

I don’t think you would lose compatibility. A codebase that works with CHERI should compile for existing targets perfectly fine (except maybe for some low-level routines). This seems like a pretty decent migration path, as you will slowly gain safety as you upgrade hardware.
Of course there are some downsides:

Arguably, if you have one non-CHERI machine, the attacker can just keep retrying until their exploit works on the right machine (although that will be a lot louder).

Runtime checking may catch fewer bugs than compile-time verification.
But honestly, Rust also does a lot of run-time checking, and not rewriting all of your code seems like a great upgrade path. To me this seems like the best path for C++ code. Adding better lifetime checking to the language is also great, but it seems like any comprehensive system there will have a huge adoption cost. If you use CHERI to convert vulnerabilities into bugs, and then focus the language-level checking on reducing the number of bugs, that seems like a very practical solution.
The existing hardware does not support CHERI, so you do lose compatibility with that hardware.
Replacing hardware is always more expensive than replacing software. So if we argue that it would be prohibitively expensive to replace all C++ software with other software, then we surely cannot replace all the hardware running that software either.
Sure, long term CHERI might become available in real hardware, and all languages should support it at that point. I guess C++ will need to do its homework for that, too: accessing data out of bounds is UB in C++, so as soon as the compiler detects anything that accesses data out of bounds, it is free to change your entire program based on the assumption that such an access never actually happens. That can seriously mess up your logic even when CHERI would catch the access at runtime. So even with CHERI, C++ is unsafe at this time :-(
Rust’s runtime checking is overrated, by the way: the programmer writes code pretty similar to C++ ranges. Just like in high-level C++ code, the intention is very clear. Bjarne claims such code runs faster than naive loops due to all the optimizations the compiler can do with all the context information it has. I am sure that if clang can figure this out, then rustc (using the same backend) will be able to do the same, and it has even more optimization potential in Rust, since mutable references cannot alias there.
Replacing hardware is always more expensive than replacing software.

Why do you think that?
Note that these are not the same kinds of replacement: for software we are talking about rewriting from scratch, whereas for hardware we are talking about adding some new functionality. Hardware gets replaced every few years, but Windows and Mac OS are 30 and 35 years old, and both have outlived multiple hardware architectures.
Because you need to adapt the software to new hardware. Ideally that is just compiling it and running the test suite, but more often than not it involves more than that. Changing something as fundamental as pointers will require at least some programs to be adapted in subtle ways. We saw that when 64-bit Intel CPUs became a thing: in theory recompiling the code was enough; in practice it took a long time to iron out all the bugs that were suddenly discovered.
In addition you have to replace the hardware, too.
Yes… and you are saying that is more expensive than rewriting the software from scratch?
It is a lot of work either way. I would expect the effort to be similar to porting well written code from one language into another language with similar concepts.
That is much more work than “just compile for the new hardware” and much less than “rewrite from scratch”.
You also need to be able to get the hardware for all the form factors you need: PCs are a tiny part of the systems a government will want to secure. They need to think about military equipment, the power grid, communications networks, air traffic control systems, server systems and basically anything in between.
I would expect the effort to be similar to porting well written code from one language into another language with similar concepts.

Practical experience from the work on CHERI over the last decade or so shows that it’s a lot easier than that; e.g. @david_chisnall cited a report saying porting to CHERI requires changing about 0.03% of lines of code.
In 10 years we went from “Rust will never replace C and C++” to “New C/C++ should not be written anymore, and you should use Rust”. Good job.
As much as I want to believe this is a genuine statement, it does feel like hype train riding to me.
The math just does not add up. Google invested a crazy amount of resources into making C++ safer to code in: code sanitisers, static analysis, lots of people working on LLVM.
Now, all of a sudden, a few weeks after the White House releases a statement about memory safety, Google decides to release a blog post about how they do not see a path for C++ to be safe. A move which I think has a positive impact on their stocks.
Maybe I am wrong. Google did work on slowly integrating Rust into Android, and maybe there are more internal discussions going on. But after this slow and careful approach, this post seems rushed to me.
Google is not a monolith. The security people at Google want more Rust. Various Google teams have been doing Rust stuff for ages. Many are still writing new C++ code. And of course, writing new code in Rust doesn’t negate the need they have to make their existing C++ code as safe as possible.
Rust uses LLVM (as do other potential C++ alternatives like Google’s Carbon), so much of the investment can probably be salvaged. It is not as if C++ is going to vanish overnight either, so they need to further secure that ecosystem anyway.
A move which I think has a positive impact on their stocks

If announcing “we will stop using C++” raised the stock price of a company, then all production C++ code would be replaced next year.
The math just does not add up. Google invested a crazy amount of resources into making C++ safer to code in: code sanitisers, static analysis, lots of people working on LLVM.

I’ve had some background chats during my time on the Rust project. Early on, that was Google’s argument against Rust: “Fine for others, but we have our C++ down; we built a safety harness around it.”
They changed their opinion in the subsequent years.
Google being fed up with the C++ standards committees also isn’t a secret, a good summary is in the Carbon codebase: https://github.com/carbon-language/carbon-lang/blob/trunk/docs/project/difficulties_improving_cpp.md
It’s the other way around. Google has been involved in those discussions very early on. That’s not even a secret, it goes back to the very early workshops at the white house in 2022: https://cpb-us-w2.wpmucdn.com/sites.gatech.edu/dist/a/2878/files/2022/10/OSSI-Final-Report.pdf
So it’s more likely that they were aware of the White House press release and put out their own. This was coordinated.
Now, all of a sudden, a few weeks after the White House releases a statement about memory safety, Google decides to release a blog post about how they do not see a path for C++ to be safe. A move which I think has a positive impact on their stocks.

What math? The statistical math seems to check out:

two thirds of 0-day exploits detected in the wild used memory corruption vulnerabilities.
C++ is also very unlikely to ever become memory safe. Not without becoming a radically different language and jettisoning a large part of the current language. The committees designing it seem very much not committed to that. They want to have their cake and eat it too. Create a memory safe subset and call themselves memory safe. But that would require every codebase to shift to that and not use libraries that aren’t written in that subset. I am as close as you can get to certain that such a path will not succeed.
That’s not true anymore: https://www.horizon3.ai/attack-research/attack-blogs/analysis-of-2023s-known-exploited-vulnerabilities/
In fact, memory corruption doesn’t even represent a quarter of vulns according to CISA. CISA themselves were actually just owned; one of the (several) issues at play was a Python Flask app popen’ing a call to a Perl script…
Those aren’t even comparing the same things. The statistic was about known 0-day vulnerabilities; your link is about known exploited vulnerabilities.
I don’t believe that that article actually contradicts the quoted statement.
Hell, they have been publishing about how they have been ramping up Rust and its impact for years. And the Secure By Design CISA white paper is from early last year. This is not just a week after.
Google invested a crazy amount of resources into making C++ safer to code in: code sanitisers, static analysis, lots of people working on LLVM.

Given that they have spent tens or hundreds of millions of dollars on these projects, wouldn’t it make sense for them to look for cheaper solutions? Who could benefit more than Google, really?
in one case, Chrome was able to move its QR code generator out of a sandbox by adopting a new memory-safe library written in Rust, leading to both better security and better performance.

I’m all for seeing Rust code replace unsafe languages, but is the QR code generator really a good fit for Rust’s strengths? This is such a simple task that you could do it in any memory-safe language. And is there ever a scenario where you’d be exposing a QR code generator to untrusted input in the first place?
And is there ever a scenario where you’d be exposing a QR code generator to untrusted input in the first place?

If QR codes are generated as a way to share arbitrary URLs, I would consider the URLs untrusted input. If one needs to visit a page to generate a QR code for it, then the page might be more likely to attack Chrome in other ways than through its URL, but it’s still a possibility.
This is such a simple task that you could do it in any memory-safe language.

Who says it would be easier in those other memory-safe languages? The core task might be easy to write, but integrating another language’s GC/runtime into a C++ project can pose significant problems (lifetimes at the FFI boundary, runtime startup, the build system, etc.). Writing a QR generator in Rust is just as straightforward, might be a bit more performant, and you get a simpler architecture in the end. It also serves as a proof of concept for using Rust in trickier parts of Chrome.
Then what about C?
I was just discussing this with some folks on WG21 the other night, and our shared conclusion was that it’s likely to be easier to retrofit memory safety into C than into C++. There is far less conflicting machinery in the language.
For example, here’s an experiment: http://thradams.com/cake/ownership.html#toc_1
I haven’t started work on it yet, but I’m interested in trying to get a working prototype into the 9front compilers, and seeing where it can be used throughout the system – and then using that experience to get it into the standard.
AFAIK Google does not use C, so it is entirely irrelevant for them.
There are 55 @google email addresses in the Linux kernel MAINTAINERS list:
https://www.kernel.org/doc/linux/MAINTAINERS
Contributing C to something external written in C that you use internally is something different from “using C”, as an organisation.
Linux kernel policies prohibit other languages, so that’s the current status Google deals with.
This is a very strange statement to make.
What do you think Google Cloud runs on?
What do you think Android runs on?
It’s not at all strange. Does Google use C in any Google projects? And, as per the point of whether it’s relevant to Google, does Google dictate Linux development policy?
If you pay, as I pointed out, 55+ people (and there are definitely more) to work in C on projects such as the cloud you run or the mobile operating system you work on, then yes, I would say that counts.
Not sure why this is even remotely controversial.
Let’s imagine I’m writing Python at my company. We hire Python developers. We write Python primarily. I would consider ourselves a “python shop”. One day we find a bug in CPython so we provide a patch. Are we also a C shop? Is CPython a project that we consider core to us as a company? Maybe, maybe not.
I think the other poster is simply claiming that just because Google invests in a technology written in C does not mean that Google has a “C codebase”. It’s also notable that 55 is an incredibly tiny number, given the number of engineers at Google. Of course, this is all predicated on a given person’s intuitive understanding of what it means for a company to “have” a codebase.
I think this is just a matter of differing opinions on what it means to “have” a codebase, or for a company to “use” a language.
https://github.com/GoogleCloudPlatform/compute-virtual-ethernet-linux <– This is a codebase written in C specifically for Google Cloud.
The mental gymnastics in this thread to avoid the fact that there are many C programmers at Google is quite astonishing.
Thank you for providing the first evidence shown here that Google owns a C codebase. I would upvote that, but I would also downvote the assumption that others are actively in denial (“mental gymnastics”) rather than merely uninformed or working from different definitions.
What mental gymnastics? I don’t care or have opinions on the matter, I was explaining their view point.
I am sure you will find some people at Google using pretty much any programming language in existence. They are still limited to a handful of languages they can use for official Google in-house code. That’s what that paper is about; it obviously cannot regulate unrelated upstream projects not managed by Google. They will need to use whatever language upstream picked if they want to send code upstream.
In the case of the Linux kernel: at least one person pushing Rust support forward is doing so on Google’s payroll.
Even more dead.
C has even fewer safety features in the language (while arguably it’s simpler, it also has more sharp edges around memory management and a weaker type system).
C has a bleaker future, because while C++ at least promises to do something, C is done, frozen in time, and resists any major new features.
C has much better interoperability with Rust, and there is even a semantically accurate c2rust converter, so it’s easier to migrate away from C.