1. 43
  1.  

  2. 14

    I love the combo of todo list for lay explanation of specific vulnerabilities, field data showing how prevalent they are, and following up with solutions on multiple paths. Lots of people in charge of technology aren’t programmers. They’ll understand the todo list. The field data would be pretty convincing for them. The gradual approach to adoption is realistic. This article might get some results out there.

    Only issue I noticed is that you don’t have a link to Alloy on the bottom of the page or About. People can’t lazily satisfy their curiosity about what interesting work you might be doing. Probably useful to have them on Firefox (for promotion) and U.S.D.S (for awareness), too.

    1. 3

      Lots of people in charge of technology aren’t programmers.

      We should fix that. Somehow.

      1. 2

        It’s your responsibility as an engineer to choose the appropriate tools. Assert this responsibility whenever someone else tries to choose for you.

        I really don’t see this enough, just people saying “my CTO chose Java, this sucks”

        1. 1

          What about admins?

      2. 7

        And despite all this the kernel of Google’s upcoming Fuchsia is still good old C++! Well, they do use unique_ptr in there, so it’s a step in the right direction I guess. But realistically, what would be the alternative? Ada is too obscure, Go too high level, and Rust is maintained by Mozilla.

        1. 10

          And still, Fuchsia contains more Rust than C++.

          -------------------------------------------------------------------------------
           Language            Files        Lines         Code     Comments       Blanks
          -------------------------------------------------------------------------------
           C++                  6177      1429097      1069090       134155       225852
           Rust                 5262      1627636      1217153       261567       148916
          

          Maybe not in the kernel, where I can totally see why some years ago, you wouldn’t have picked Rust, but nevertheless. Rust has only been released for 4 years.

          Also, to be fair, quite a number of those are the crates in third_party (~800kloc), but as they vendor basically everything, it’s probably the same for C++.

          Quite a number of contributors work at Google.

          1. 6

            I had no idea they had so much Rust in there! I stand corrected.

            1. 0

              823067 of that is probably rustc + cargo.

              1. 1

                  The Fuchsia source tree does not contain rustc (or it is at a place where I and tokei cannot find it?). rustc itself would be about 900 kloc. As written, it’s third party crates.

            2. 4

                The Rust project is independently governed and I honestly don’t see any problem if it were maintained by Mozilla :>

              1. 3

                If past progress is a sign, they’ve already done better than C and C++ on this. So, let them keep at it. To be fair, the C++ standards have been making impressive progress, too.

                1. 1

                  We are one people, C++ programmers and Rust. I think of them as two sides of the same team.

                  1. 2

                      Y’all really aren’t. C++ was built on C, which is memory unsafe at the core. It values expressiveness over safety in many constructs. Rust, core on up, is willing to sacrifice expressiveness for safety. Most people are choosing it for that reason. Different teams or approaches. Some people pull for both, though.

            3. 6

              It is true that memory unsafety is a real security threat for performance-sensitive software written in C and C++. It’s partly why you see browsers like Firefox and Chrome switching to garbage collection for things that were previously managed manually, such as the DOM.

              I don’t think that languages like Rust and Swift as they stand are the answer. While it’s true that Firefox have switched to Rust for some of their internal components, if I remember correctly they have forms of tracing garbage collection tacked on where they enforce correct usage with a set of additional programming rules, which are checked with linting plugins.

              If you want complex graph-like data structures in standard Rust, you will soon find the single ownership model unsuitable. At this point, your choices are: i) try very hard to get things right with unsafe blocks, at which point you’re back in memory unsafety land or ii) make heavy use of reference counting containers, which have a significant performance overhead. In Swift the story is similar - reference counting is not cheap. In addition, to prevent leaks, care must be taken to ensure cycles are broken with weak reference placement, which in complex graphs is also non-trivial.
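
              A minimal sketch of the reference-counting route, assuming a hypothetical `Node` type for a tree with parent back-links (the names are made up for illustration and do not come from Firefox or any real codebase). Children are held through `Rc`, while the back-link is a `Weak`, so the parent/child cycle does not keep either node alive:

              ```rust
              use std::cell::RefCell;
              use std::rc::{Rc, Weak};

              /// Hypothetical tree node, for illustration only.
              pub struct Node {
                  pub name: String,
                  /// Weak back-link: does not keep the parent alive.
                  pub parent: RefCell<Weak<Node>>,
                  /// Strong links: a parent owns its children.
                  pub children: RefCell<Vec<Rc<Node>>>,
              }

              /// Creates a node with no parent and no children.
              pub fn new_node(name: &str) -> Rc<Node> {
                  Rc::new(Node {
                      name: name.to_string(),
                      parent: RefCell::new(Weak::new()),
                      children: RefCell::new(Vec::new()),
                  })
              }

              /// Links `child` under `parent` without creating a strong cycle.
              pub fn add_child(parent: &Rc<Node>, child: Rc<Node>) {
                  *child.parent.borrow_mut() = Rc::downgrade(parent);
                  parent.children.borrow_mut().push(child);
              }
              ```

              Every `Rc::clone` and `upgrade` touches a counter at runtime, which is the overhead in question, and deciding which edge is the weak one still has to be done by hand.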

              It feels to me like an acceptable trade-off here would be some form of fast, opt-in tracing GC for managing complex data structures, where reasoning about liveness is hard and UAFs are more likely; and C++ style RAII for everything else. There is no silver bullet when it comes to memory, and as much as I’m sure lots of these large C / C++ projects would be relieved to reduce memory bugs, convincing them to switch to Rust or Swift as they currently stand may be impractical from a performance perspective.

              1. 16

                try very hard to get things right with unsafe blocks, at which point you’re back in memory unsafety land

                The point of Rust isn’t just about not using unsafe. The point of Rust is that if you need to use unsafe, then you should be able to button it up behind a safe API that can never be used in a way that results in memory unsafety. That is a striking improvement over languages that are “unsafe by default.”
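
                A small sketch of what "buttoned up behind a safe API" can look like. `split_pair` is a hypothetical function written for this example (the standard library already offers `split_at_mut` for the same job); the point is only that the bounds check lives in the safe wrapper, so callers cannot reach the `unsafe` block with bad arguments:

                ```rust
                /// Splits one mutable slice into two non-overlapping mutable halves.
                /// Returns `None` instead of misbehaving when `mid` is out of range.
                pub fn split_pair<T>(items: &mut [T], mid: usize) -> Option<(&mut [T], &mut [T])> {
                    if mid > items.len() {
                        return None;
                    }
                    let len = items.len();
                    let ptr = items.as_mut_ptr();
                    // SAFETY: `mid <= len` was checked above, so both ranges are in
                    // bounds, and [0, mid) and [mid, len) never overlap.
                    unsafe {
                        Some((
                            std::slice::from_raw_parts_mut(ptr, mid),
                            std::slice::from_raw_parts_mut(ptr.add(mid), len - mid),
                        ))
                    }
                }
                ```

                Calling `split_pair(&mut v, 1)` on a four-element array hands back two slices that can be mutated independently, while `split_pair(&mut v, 5)` returns `None`. Any audit for memory unsafety only has to look at the few lines inside the `unsafe` block.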

                It feels to me like an acceptable trade-off here would be some form of fast, opt-in tracing GC for managing complex data structures, where reasoning about liveness is hard and UAFs are more likely; and C++ style RAII for everything else.

                This would be an interesting project, yes. Rust used to have garbage collected pointers (via the @ sigil), but dropped them in the move to 1.0. I wouldn’t be surprised if something like this came back in library form. I know some folks have worked on this in the past, but it’s hard.

                There is no silver bullet when it comes to memory, and as much as I’m sure lots of these large C / C++ projects would be relieved to reduce memory bugs, convincing them to switch to Rust or Swift as they currently stand may be impractical from a performance perspective.

                It’s a game of inches. Getting folks to recognize that memory safety is a problem in unsafe-by-default languages is step 1, and I don’t see any problem with that. There are still tons of folks that don’t think it is a problem.

                1. 3

                  While it’s true that Firefox have switched to Rust for some of their internal components, if I remember correctly they have forms of tracing garbage collection tacked on where they enforce correct usage with a set of additional programming rules, which are checked with linting plugins.

                  The only reason why that’s there is because, by specification fiat, they have to share the DOM with a JavaScript program.

                  1. 3

                    The only reason why that’s there is because, by specification fiat, they have to share the DOM with a JavaScript program.

                    To be fair, some Rust libraries have collection strategies. They are not traditional GC, but e.g. epoch based reclamation is sometimes used.

                    But that argument isn’t very interesting: not having a garbage collector in the language means such techniques can be opted into or even kept local.

                2. 3

                  I am missing a discussion of the actual tradeoff: what about timing critical programs? What about dependencies / library coverage? Binary compatibility (e.g. kernel modules)? embedded platforms? This reads as a single-dimensional sales pitch and hardly even mentions the reasons why a memory-safe language might be unfeasible. The single axis here is that safety is good and unsafety is bad. IMHO too naive / not that interesting.

                  1. 1

                    There seems to be a belief amongst memory safety advocates that it is not one out of many ways in which software can fail, but the most critical one in existence today, and that, if programmers can’t be convinced to switch languages, maybe management can be made to force them.

                    I didn’t see this kind of zeal when (for example) PHP software fell prey to SQL injections left and right, but I’m trying to understand it. The quoted statistics about found vulnerabilities seem unconvincing, and are just as likely to indicate that static analysis tools have made these kinds of programming errors easy to find in existing codebases.

                    1. 19

                      Not all vulnerabilities are equal. I prioritize those that give attackers full control over my computer. They’re the worst. They can lead to every other problem. Plus, their rootkits or damage might not let you have it back. You can lose the physical property, too. Alex’s field evidence shows memory unsafety causes around 70-80% of this. So, worrying about hackers hitting native code, it’s rational to spend 70-80% of one’s effort eliminating memory unsafety.

                      More damning is that languages such as Go and D make it easy to write high-performance, maintainable code that’s also memory safe. Go is easier to learn with a huge ecosystem behind it, too. Ancient Java being 10-15x slower than C++ made for a good reason not to use it. Now, most apps are bloated/slow, the market uses them anyway, some safe languages are really lean/fast, using them brings those advantages, and so there’s little reason left for memory-unsafe languages. Even in intended use cases, one can often use a mix of memory-safe and -unsafe languages with unsafe used on performance-sensitive or lowest-level parts of the system. Moreover, safer languages such as Ada and Rust give you guarantees by default on much of that code allowing you to selectively turn them off only where necessary.

                      If you’re using unsafe languages and have money, there are also tools that automatically eliminate most of the memory unsafety bugs. That companies pulling in 8-9 digits still have piles of them shows total negligence. Same with those in open-source development who aren’t doing much better. So, on that side of things, whatever tool you encourage should lead to memory safety even with apathetic, incompetent, or rushed developers working on code with complex interactions. Doubly true if it’s multi-threaded and/or distributed. A safe, orderly-by-default setup will prevent loads of inevitable problems.

                      1. 13

                        The quoted statistics about found vulnerabilities seem unconvincing

                        If studies by security teams at Microsoft and Google, and analysis of Apple’s software is not enough for you, then I don’t know what else could convince you.

                        These companies have huge incentives to prevent exploitable vulnerabilities in their software. They get the best developers they can, they are pouring many millions of dollars into preventing these kinds of bugs, and still regularly ship software with vulnerabilities caused by memory unsafety.

                        A “Why bother with one class of bugs, if another class of bugs exists too” position is not conducive to writing secure software.

                        1. 3

                          A “Why bother with one class of bugs, if another class of bugs exists too” position is not conducive to writing secure software.

                          No - but neither is pretending that you can eliminate a whole class of bugs for free. Memory safe languages are free of bugs caused by memory unsafety - but at what cost?

                          What other classes of bugs do they make more likely? What is the development cost? Or the runtime performance cost?

                          I don’t claim to have the answers but a study that did is the sort of thing that would convince me. Do you know of any published research like this?

                          1. 9

                            No - but neither is pretending that you can eliminate a whole class of bugs for free. Memory safe languages are free of bugs caused by memory unsafety - but at what cost?

                            What other classes of bugs do they make more likely? What is the development cost? Or the runtime performance cost?

                            The principal cost of memory safety in Rust, IMO, is that the set of valid programs is more heavily constrained. You often hear this manifest as “fighting with the borrow checker.” This is definitely an impediment. I think a large portion of folks get past this stage, in the sense that “fighting the borrow checker” is, for the most part, a temporary hurdle. But there are undoubtedly certain classes of programs that Rust will make harder to write, even for Rust experts.

                            Like all trade offs, the hope is that the juice is worth the squeeze. That’s why there has been a lot of effort in making Rust easier to use, and a lot of effort put into returning good error messages.

                            I don’t claim to have the answers but a study that did is the sort of thing that would convince me. Do you know of any published research like this?

                            I’ve seen people ask this before, and my response is always, “what hypothetical study would actually convince you?” If you think about it, it is startlingly difficult to do such a study. There are many variables to control for, and I don’t see how to control for all of them.

                            IMO, the most effective way to show this is probably to reason about vulnerabilities due to memory safety in aggregate. But to do that, you need a large corpus of software written in Rust that is also widely used. But even this methodology is not without its flaws.

                            1. 2

                              If you think about it, it is startlingly difficult to do such a study. There are many variables to control for, and I don’t see how to control for all of them.

                              That’s true - but my comment was in response to one claiming that the bug surveys published by Microsoft et al should be convincing.

                              I could imagine something similar being done with large Rust code bases in a few years, perhaps.

                              I don’t have enough Rust experience to have a good intuition on this so the following is just an example. I have lots of C++ experience with large code bases that have been maintained over many years by large teams. I believe that C++ makes it harder to write correct software: not (just) because of memory safety issues, undefined behavior etc. but also because the language is so large, complex and surprising. It is possible to write good C++ but it is hard to maintain it over time. For that reason, I have usually promoted C rather than C++ where there has been a choice.

                              That was a bit long-winded but the point I was trying to make is that languages can encourage or discourage different classes of bugs. C and C++ have the same memory safety and undefined behavior issues but one is more likely than the other to engender other bugs.

                              It is possible that Rust is like C++, i.e. that its complexity encourages other bugs even as its borrow checker prevents memory safety bugs. (I am not now saying that is true, just raising the possibility.)

                              This sort of consideration does not seem to come up very often when people claim that Rust is obviously better than C for operating systems, for example. I would love to read an article that takes this sort of thing into account - written by someone with more relevant experience than me!

                              1. 7

                                I’ve been writing Rust for over 4 years (after more than a decade of C), and in my experience:

                                • For me Rust has completely eliminated memory unsafety bugs. I don’t even use debuggers or Valgrind any more, unless I’m integrating Rust with C.
                                • I used to have, at least during development, all kinds of bugs that spray the heap, corrupt some data somewhere, use uninitialized memory, use-after-free. Now I get compile-time errors or panics (which are safe, technically like C++ exceptions).
                                • I get fewer bugs overall. Lack of NULL and mandatory error handling are amazing for reliability.
                                • Built-in unit test framework, richer standard library and easy access to 3rd party dependencies help too (e.g. instead of hand-rolling another own buggy hash table, I use a well-tested well-optimized one).
                                • My Rust programs are much faster. Single-threaded Rust is 95% as fast as single-threaded C, but I can easily parallelize way more than I’d ever dare in C.

                                The costs:

                                • Rust’s compile times are not nice.
                                • It took me a while to become productive in Rust. “Getting” ownership requires unlearning C and a lot of practice. However, I’m not fighting the borrow checker any more, and I’m more productive in Rust thanks to higher-level abstractions (e.g. I can write map/reduce iterator that collects something into a btree — in 1 line).
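
                                As a rough illustration of two of the points above (the "no NULL" reliability win and the iterator that collects into a btree in one expression), here is a sketch using only the standard library. `word_counts` and `first_long_word` are hypothetical helpers invented for this example:

                                ```rust
                                use std::collections::BTreeMap;

                                /// Counts word occurrences; the map/reduce-into-a-btree step is one expression.
                                pub fn word_counts(text: &str) -> BTreeMap<&str, usize> {
                                    text.split_whitespace()
                                        .fold(BTreeMap::<&str, usize>::new(), |mut m, w| { *m.entry(w).or_insert(0) += 1; m })
                                }

                                /// No NULL: absence is an `Option` that the caller has to handle explicitly.
                                pub fn first_long_word(text: &str, min_len: usize) -> Option<&str> {
                                    text.split_whitespace().find(|w| w.len() >= min_len)
                                }
                                ```

                                A caller cannot use the result of `first_long_word` as a string without first deciding what to do in the `None` case, which is where the C version would typically have returned a null pointer.
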
                          2. 0

                            Of course older software, mostly written in memory-unsafe languages, sometimes written in a time when not every device was connected to a network, contains more known memory vulnerabilities. Especially when it’s maintained and audited by companies with excellent security teams.

                            These statistics don’t say much at all about the overall state of our software landscape. They don’t say anything about the relative quality of memory-unsafe codebases versus memory-safe codebases. They also don’t say anything about the relative sizes of memory-safe and memory-unsafe codebases on the internet.

                            1. 10

                              iOS and Android aren’t “older software”. They’ve been born to be networked, and supposedly secure, from the start.

                              Memory-safe codebases have 0% memory-unsafety vulnerabilities, so that is easily comparable. For example, check out the CVE database. Even within one project — Android — you can easily see whether the C or the Java layers are responsible for the vulnerabilities (spoiler: it’s C, by far). There’s a ton of data on all of this.

                              1. 2

                                Android is largely cobbled together from older software, as is iOS. I think Android still needs a Fortran compiler to build some dependencies.

                                1. 9

                                  That starts to look like a No True Scotsman. When real-world C codebases have vulnerabilities, they’re somehow not proper C codebases. Even when they’re part of flagship products of top software companies.

                                  1. 2

                                    I’m actually not arguing that good programmers are able to write memory-safe code in unsafe languages. I’m arguing that vulnerabilities happen at all levels in programming, and that, while memory safety bugs are terrible, there are common classes of bugs in more widely used (and more importantly, more widely deployed) languages that make it just one class of bugs out of many.

                                    When XSS attacks became common, we didn’t implore VPs to abandon Javascript.

                                    We’d have reached some sort of conclusion earlier if you’d argued with the point I was making rather than with the point you wanted me to make.

                                    1. 4

                                      When XSS attacks became common, we didn’t implore VPs to abandon Javascript.

                                      We actually did. Sites/companies that solved XSS did so by banning generation of markup “by hand”, and instead mandating use of safe-by-default template engines (e.g. JSX). Same with SQL injection: years of saying “be careful, remember to escape” didn’t work, and “always use prepared statements” worked.
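
                                      A minimal sketch of the safe-by-default idea, not of how JSX or any real template engine is implemented. `escape_html` and `render_comment` are hypothetical names; the point is that the only way to get user input into markup is through a function that always escapes it:

                                      ```rust
                                      /// Replaces the characters that are significant in HTML with entities.
                                      pub fn escape_html(input: &str) -> String {
                                          let mut out = String::with_capacity(input.len());
                                          for c in input.chars() {
                                              match c {
                                                  '&' => out.push_str("&amp;"),
                                                  '<' => out.push_str("&lt;"),
                                                  '>' => out.push_str("&gt;"),
                                                  '"' => out.push_str("&quot;"),
                                                  '\'' => out.push_str("&#x27;"),
                                                  _ => out.push(c),
                                              }
                                          }
                                          out
                                      }

                                      /// Renders a comment; both fields are escaped unconditionally.
                                      pub fn render_comment(author: &str, body: &str) -> String {
                                          format!("<p><b>{}</b>: {}</p>", escape_html(author), escape_html(body))
                                      }
                                      ```

                                      With this shape, forgetting to escape is no longer something an individual developer can do by accident, which is the same property prepared statements give for SQL.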

                                      These classes of bugs are prevalent only where developers think they’re not a problem (e.g. they’ve been always writing pure PHP, and will continue to write pure PHP forever, because there’s nothing wrong with it, apart from the XSS and SQLi, which are a force of nature and can’t be avoided).

                                      1. 1

                                        This kind of makes me think of someone hearing others talk about trying to lower the murder rate and then hysterically going into a rant about how murder is only one class of crime

                                        1. -1

                                          I think a better analogy is campaigning aggressively to ban automatic rifles when the vast majority of murders are committed using handguns.

                                          Yes, automatic rifles are terrible. But pointing them out as the main culprit behind the high murder rate is also incorrect.

                                          1. 4

                                            That analogy is really terrible and absolutely not fitting the context here. It’s also very skewed, the murder rate is not the reason for calls for bans.

                                      2. 2

                                        Although I mostly agree, I’ll note Android was originally built by a small business acquired by Google that continued to work on it probably with extra resources from Google. That makes me picture a move fast and break things kind of operation that was probably throwing pre-existing stuff together with their own as quickly as possible to get the job done (aka working phones, market share).

                                    2. 0

                                      Yes, if you zoom in on code bases written in memory-unsafe languages, you unsurprisingly get a large number of memory-unsafety vulnerabilities.

                                      1. 12

                                        And that’s exactly what illustrates “eliminates a class of bugs”. We’re not saying that we’ll end up in utopia. We just don’t need that class of bugs anymore.

                                        1. 1

                                          Correct, but the author is arguing that this is an exceptionally grievous class of security bugs, and (in another article) that developers’ judgement should not be trusted on this matter.

                                          Today, the vast majority of new code is written for a platform where execution of untrusted memory-safe code is a core feature, and the safety of that platform relies on a stack of sandboxes written mostly in C++ (browser) and Objective C/C++/C (system libraries and kernel)

                                          Replacing that stack completely is going to be a multi-decade effort, and the biggest players in the industry are just starting to dip their toes in memory-safe languages.

                                          What purpose does it serve to talk about this problem as if it were an urgent crisis?

                                          1. 11

                                            Replacing that stack completely is going to be a multi-decade effort, and the biggest players in the industry are just starting to dip their toes in memory-safe languages.

                                            Hm, so. Apple has developed Swift, which is generally considered a systems programming language, to replace Objective-C, which was their main programming language and already had safety features like baked in ARC. Google has implemented Go. Mozilla Rust. Google uses tons of Rust in Fuchsia and has recently imported the Rust compiler into the Android source tree.

                                            Microsoft has recently been blogging about Rust quite a lot, is often seen hanging around, and blogs about how severe memory problems are to their safety story. Before that, Microsoft put tons of engineering effort into Haskell as a research base and C#/.NET as a replacement for their C/C++ APIs.

                                            Amazon has implemented Firecracker in Rust and bragged about it in their AWS keynote.

                                            Come again about “dipping toes”? Yes, there’s huge amounts of stack around, but there’s also huge amounts to be written!

                                            What purpose does it serve to talk about this problem as if it were an urgent crisis?

                                            Because it’s always been a crisis and now we have the tech to fix it.

                                            P.S.: In case this felt a bit like bragging about Rust over the others: it’s just where I’m most aware of things happening. Go and Swift are doing fine, I just don’t follow them as much.

                                            1. 2

                                              The same argument was made for Java, which on top of its memory safety, was presented as a pry bar against the nearly complete market dominance of the Wintel platform at the time. Java evangelism managed to convert new programmers - and universities - to Java, but not the entire world.

                                              Oracle’s deadly embrace of Java didn’t move it to rewrite its main cash cow in Java.

                                              Rust evangelists should ask themselves why.

                                              I think that of all the memory-safe languages, Microsoft’s C++/CLI effort comes closest to understanding what needs to be done to entice coders to move their software into a memory-safe environment.

                                              At my day job, I actually try to spend my discretionary time trying to move our existing codebase to a memory-safe language. It’s mostly about moving the pieces into place so that green-field software can seamlessly communicate with our existing infrastructure. Then seeing what parts of our networking code can be replaced, slowly reinforcing the outer layers while the inner core remains memory unsafe.

                                              Delicate stuff, not something you want the VP of Engineering to issue edicts about. In the meantime, I’m still a C++ programmer, and I really don’t appreciate this kind of article painting a big target on my back.

                                              1. 4

                                                Java and Rust are vastly different ball parks for what you describe. And yet, Java is used successfully in the database world, so it is definitely to be considered. The whole search engine database world is full of Java stacks.

                                                Oracle didn’t rewrite its cashcow, because - yes, they are risk-averse and that’s reasonable. That’s no statement on the tech they write it in. But they did write tons of Java stacks around Oracle DB.

                                                It’s an argument on the level of “Why isn’t everything at Google Go now?” or “Why isn’t Apple using Swift for everything?”.

                                                1. 2

                                                  Looking at https://news.ycombinator.com/item?id=18442941 it seems that it was too late for a rewrite when Java matured.

                                              2. 8

                                                What purpose does it serve to talk about this problem as if it were an urgent crisis?

                                                To start the multi-decade effort now, and not spend more decades just saying that buffer overflows are fine, or that, despite 40 years of evidence to the contrary, programmers can just avoid causing them.

                                    3. 9

                                      I didn’t see this kind of zeal when (for example) PHP software fell prey to SQL injections left and right

                                      You didn’t? SQL injections are still #1 in the OWASP top 10. PHP had to retrain an entire generation of engineers to use mysql_real_escape_string over vulnerable alternatives. I could go on…

                                      I think we have internalized the arguments about SQL injection but have still not accepted the memory safety arguments.

                                      1. 3

                                        I remember arguments being presented to other programmers. This article (and another one I remembered, which, as it turns out, is written by the same author: https://www.vice.com/en_us/article/a3mgxb/the-internet-has-a-huge-cc-problem-and-developers-dont-want-to-deal-with-it ) explicitly target the layperson.

                                        The articles use the language of whistleblowers. They suggest that counter-arguments are made in bad faith, that developers are trying to hide this ‘dirty secret’. Consider that C/C++ programmers skew older, have less rosy employment prospects, and that this article feeds nicely into the ageist prejudices already present in our industry.

                                        Arguments aimed at programmers, like this one, at least acknowledge the counter-arguments, and frame the discussion as one of industry maturity, which I think is correct.

                                        1. 2

                                          I do not see it as bad faith. There are a non-zero number of people who say they can write memory safe C++ despite there being a massive amount of evidence that even the best programmers get tripped up by UB and threads.

                                          1. 1

                                            Consider that C/C++ programmers skew older, have less rosy employment prospects, and that this article feeds nicely into the ageist prejudices already present in our industry.

                                            There’s an argument to be made that the resurging interest in systems programming languages through Rust, Swift and Go futureproofs experience in those areas.

                                        2. 5

                                          Memory safety advocate here. It is the most pressing issue because it invokes undefined behavior. At that point, your program is entirely meaningless and might do anything. Security issues can still be introduced without memory unsafety of course, but you can at least reason about them, determine the scope of impact, etc.
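
                                          For contrast, a small sketch of an out-of-range access in safe Rust, where the outcome is defined: `get` yields `None`, and plain indexing with a bad index would panic rather than read adjacent memory. `read_slot` is a hypothetical helper written for this example:

                                          ```rust
                                          /// Returns the byte at `index`, or `None` if the index is out of range.
                                          pub fn read_slot(buf: &[u8], index: usize) -> Option<u8> {
                                              // `get` performs the bounds check; there is no way to read past `buf`.
                                              buf.get(index).copied()
                                          }
                                          ```

                                          Because the failure mode is a value (or a panic), the scope of a bug like this can still be reasoned about, which is the distinction being drawn with undefined behavior.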

                                        3. -1

                                          Memory safety is a good (as in safety) property. Duh!

                                          Let’s get rid of memory bugs!

                                          Let’s not forget another point that is at least as important: making software behave in the predicted way on all acceptable input, and reject all unacceptable input.

                                          Let’s not replace careful thinking with memory safety.

                                          Replacing all cars with self-driving bumper cars that can go 130 km/h (80 mph) merely shifts the problem from all the drivers to the car makers.

                                          Replacing memory unsafe languages with Rust and Go shifts the responsibility from the community of programmers to Mozilla and Google.

                                          And who does not love to take away the responsibility of killing people because of a memory bug… Let’s give that all to big companies, they sure will get it right.

                                          I do not think low-level tinkering for high level programs is beneficial to software engineering (look at the ex-vi source).

                                          But picking a fast high-level language should not sacrifice inner simplicity (different than easiness) of the tools we use (such as programming languages) so that the knowledge is not only held by the big companies (the web browser companies) that develop these languages.

                                          I am not talking about Go and Rust in particular and not criticizing the community either. I share my concerns. ;)

                                          1. 2

                                            Replacing memory unsafe languages with Rust and Go shifts the responsibility from the community of programmers to Mozilla and Google.

                                            You’re right. Whenever I ask for a value to be borrowed from one scope to another via a raw pointer, the type checker must then send off an email to Chris Beard, CEO of Mozilla

                                            1. 1

                                              My point is: the counterpart to “looking for memory bugs in unsafe languages” is not “use an axe and chop heads off!”, but “looking for memory bugs in the safe languages’ implementations”.

                                              Sending email to CEO of Mozilla because some programmer chose to go “raw pointer style” using a programming language developed by Mozilla sounds a bit rough…

                                              1. 1

                                                Obviously, there might not be a new memory bug in a programming language’s implementation every morning. Good news!

                                            2. 1

                                              For instance, Go uses a Go-specific intermediate assembler language, that is to be translated into the other well-known assembly languages.

                                              That helps with understanding the language by splitting compilation into two steps.

                                              Rust favors doing all memory management “offline” / statically, which I guess simplifies memory management down to the compilation part.

                                              On top of documentation, programming requires either wild guesses or understanding the programming language’s internals at least a bit. So simplicity of the programming language helps with getting what is going on while programming.