One of my responsibilities at a previous job was running Coverity static analysis on a huge C codebase and following up on issues. It wouldn’t be uncommon to check a new library (not curl) of 100k lines of code and find 1000 memory issues. The vast majority would be pretty harmless – 1-byte buffer overflows and such – but there were always some doozies that could easily lead to RCEs, and those we’d have to fix. At the point where the vast majority of people working in a language are messing up on a regular basis, it’s the language’s fault, not the people’s.
For anyone wondering about setting up static analysis for your own codebase, some things to know!
Static analysis, unlike dynamic analysis, is performed on source code without executing the program, and encompasses numerous individual analysis techniques. Some of them are control-flow based; some look for the presence of concerning patterns or the use of unsafe functions. As an example, most engines doing checks for null-pointer dereferences base their analysis on a control-flow graph of the program, looking for places where a potential null assignment could flow to a dereference.
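To make that concrete, here’s a minimal sketch (hypothetical code, not from any particular project) of the kind of path-sensitive pattern such a checker flags:

```c
#include <stdio.h>
#include <stdlib.h>

/* A control-flow-based checker tracks the possibly-NULL result of
 * malloc() along each path: the early return checks the wrong pointer,
 * so a potential NULL still flows into the snprintf() call below. */
char *make_greeting(const char *name)
{
    char *buf = malloc(64);               /* may return NULL */
    if (name == NULL)
        return NULL;                      /* checks name, never buf */
    snprintf(buf, 64, "hello, %s", name); /* NULL dereference if malloc failed */
    return buf;
}
```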
Static analysis is a conservative analysis, meaning it may have false positives. The false positive rate can also be impacted by coding style. That said, most static analyzers are configurable, and you can and should (especially early on) filter and prioritize findings which your current coding practices may flag disproportionately.
Many static analysis tools will give you severity ratings for the findings. These are not the same as CVSS scores (Common Vulnerability Scoring System). They are based on a general expectation for the category of weakness a finding falls into (for example, they might say “CWE-476 (null pointer dereference) is often critical, so we’ll call anything that matches this CWE a ‘critical’ finding”). These ratings say nothing about whether something is critical in your application, and you can have completely inconsequential findings rated critical, while actually critical vulnerabilities sit in the “low” category.
Additionally, understand that these categories, which are often given on a four-level scale in the popular tools, are defined along two axes: expected likelihood of exploitation and expected severity of exploitation. High likelihood + high severity = critical, low likelihood + high severity = high, high likelihood + low severity = medium, low likelihood + low severity = low.
The nature of the false positive rates means you’re much better off having a regular or continuous practice of static analysis during which you can tune configuration and burn down findings, to increase signal and value over time.
If you have too many findings to handle, you may be tempted to sample. If you do, use standard statistical techniques to make sure your sample size is large enough and your sample is representative of the population. You may also consider sampling from just the subset of the population’s CWEs that you consider high priority (for example, the regularly-updated CWE Top 25 list, or something like the OWASP Top 10 mapped to CWEs).
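For the curious, a sketch of the standard calculation (Cochran’s formula with a finite-population correction; the numbers are illustrative):

```c
#include <math.h>
#include <stdio.h>

/* Cochran's sample-size formula for estimating a proportion, with the
 * finite-population correction: n0 = z^2 * p(1-p) / e^2, then
 * n = n0 / (1 + (n0 - 1) / N). z = 1.96 gives 95% confidence, and
 * p = 0.5 is the conservative (worst-case) choice. */
double sample_size(double population, double z, double p, double e)
{
    double n0 = z * z * p * (1.0 - p) / (e * e);
    return n0 / (1.0 + (n0 - 1.0) / population);
}

int main(void)
{
    /* e.g. 1000 findings, 95% confidence, 5% margin of error: ~278 to triage */
    printf("%.0f\n", ceil(sample_size(1000.0, 1.96, 0.5, 0.05)));
    return 0;
}
```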
Hope this helps someone who is interested in setting up a software assurance practice using static analysis!
Conversations arguing over the counterfactual of whether using or not using C would have helped are less interesting than acknowledging that, whatever language you’re using, there are software assurance techniques you can start doing today to increase confidence in your code!
The thing you do to increase confidence doesn’t have to be changing languages. Changing languages can help (and obviously for anyone who’s read my other stuff, I am a big fan of Rust), but it’s also often a big step (I recommend doing it incrementally, or in new components rather than rewrites). So do other stuff! Do static analysis! Strengthen your linting! Do dynamic analysis! More testing! Formal methods for the things that need it! Really, just start.
Excellent point. Coverity is a really, really good tool; I do wish there were an open-source equivalent so more people could learn about static analysis.
Here’s what I find interesting: 42 of those “C mistakes” (which of course are a minority of the real bugs) are actually range errors. The other 9 are miscellaneous. That’s about 82%.
You don’t have to go too far from C to catch those. Like D (yes, I gotta get on the language evangelism bandwagon too!) just bundles the length and pointer into the syntax sugar of type[] and then the compiler inserts automatic bounds checks at runtime. Most C functions take a pointer and a length anyway, so this approach is easy and makes a pretty big improvement.
I think other C descendants do something similar. I guess putting that in C itself as an extension would be tricky though since while there’s usually a length around, it isn’t necessarily easy to see which pointer it ties to.
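For illustration, here’s roughly what that approximation looks like as a hand-rolled “fat pointer” in plain C (hypothetical code):

```c
#include <stdio.h>
#include <stdlib.h>

/* A hand-rolled "fat pointer" approximating D's T[]: the pointer and
 * its length travel together, and every access goes through a checked
 * accessor. The difference is that D's compiler inserts this check for
 * you; in C you have to remember to use the accessor. */
typedef struct {
    int   *ptr;
    size_t len;
} int_slice;

int slice_get(int_slice s, size_t i)
{
    if (i >= s.len) {                 /* the bounds check D emits automatically */
        fprintf(stderr, "index %zu out of bounds (len %zu)\n", i, s.len);
        abort();
    }
    return s.ptr[i];
}
```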
This mindset of replacing a language to remove a class of errors is naive at best.
I hate null with a passion and I do think Rust memory safety is a valuable feature. But let’s take the biggest problem of that class as an example: the Heartbleed bug. If you look at the vulnerable code, it is a very basic mistake. If you took an introductory course in C, you would learn how not to do that.
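For anyone who hasn’t seen it, the shape of the mistake was roughly this (a simplified illustration, not the actual OpenSSL code):

```c
#include <stdlib.h>
#include <string.h>

/* Simplified sketch of the Heartbleed-style flaw (illustrative only):
 * the reply length is read out of the attacker's own message and never
 * validated against how many bytes actually arrived. */
unsigned char *heartbeat_response(const unsigned char *msg, size_t msg_len)
{
    size_t claimed = (msg[1] << 8) | msg[2];  /* attacker-controlled length */
    unsigned char *resp = malloc(3 + claimed);
    if (resp == NULL)
        return NULL;
    memcpy(resp, msg, 3);                     /* echo the header */
    /* Missing check: if (3 + claimed > msg_len) return NULL;
     * Without it, the copy below reads past msg into adjacent heap
     * memory (keys, passwords, whatever happens to be there). */
    memcpy(resp + 3, msg + 3, claimed);
    return resp;
}
```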
To argue that using a language that doesn’t allow for that kind of error is the solution is to defend an impossible solution. Without doubting the good intentions of whoever wrote that piece of code, let us call a spade a spade: it was objectively poor code with basic flaws.
You don’t solve bad engineering by throwing a hack at it such as changing the language. It will manifest itself in the form of other classes of bugs and there is no evidence whatsoever that the outcome isn’t actually worse than the problem one is trying to fix.
Java doesn’t allow one to reference data by its memory address, precisely to avoid this whole class of problems; why isn’t everyone raving about how that magically solved all problems? The answer is: because it obviously didn’t.
I love curl and use it intensively, but this post goes down that whole mindset path. Running scripts to find bugs and so on.
I’m not convinced by this argument. Large C and C++ projects seem to always have loads of memory vulns. Either they’re not caused by bad programming or bad programming is inevitable.
I think the core question of whether memory unsafe languages result in more vulnerable code can probably be answered with data. The only review I’m aware of is this fairly short one by a Rust contributor, but there are probably others: https://alexgaynor.net/2020/may/27/science-on-memory-unsafety-and-security/
Until you have the evidence, don’t bother with hypothetical notions that someone can write 10 million lines of C without ubiquitous memory-unsafety vulnerabilities – it’s just Flat Earth Theory for software engineers.
There should be a corollary: until you have the evidence, don’t bother with hypothetical notions that rewriting 10 million lines of C in another language would fix more bugs than it introduces.
Agreed. But nuance is deserved on “both sides” of the argument.
It’s fair to say that rewriting 10 million lines of C in a memory safe language will, in fact, fix more memory bugs than it introduces (because it fixes them all and won’t introduce any).
It’s also fair to acknowledge that memory bugs are not the only security bugs and that security bugs aren’t the only important bugs.
It’s not fair to say that it’s literally impossible for a C program to ever be totally secure.
My tentative conclusion is this: If your C program is not nominally related to security, itself, then it very likely will become more secure by rewriting in Rust/Zig/Go/whatever. In other words, if there are no crypto or security algorithms implemented in your project, then the only real source of security issues is from C, itself (or your dependencies, of course).
If your C program is related to security in purpose, as in sudo, a crypto library, a password manager, etc., then the answer is a lot less clear. Many venerable C projects have the advantage of time: they’ve been around forever and have lots of battle testing. It’s likely that if they stay stable and don’t have a lot of code churn, they won’t introduce many new security bugs over time.
if there are no crypto or security algorithms implemented in your project, then the only real source of security issues is from C, itself
I don’t think this is true. All sorts of programs accept untrusted input, not just crypto or security projects, and almost any code that handles untrusted input will have all sorts of opportunities to be unsafe, regardless of implementation language.
Theoretically yes. But, in practice, if you’re not just passing a user-provided query string into your database, it’s much, MUCH harder for bad input to pose a security threat. What’s the worst they can do: type such a long string that you OOM? I can be pretty confident that no matter what they type, it’s not going to start writing to arbitrary parts of my process’s memory.
It’s not just databases, tho, it’s any templating or code generation that uses untrusted input.
Do you generate printf format strings, filesystem paths, URLs, HTML, db queries, shell commands, markdown, yaml, config files, etc? If so, you can have escaping issues.
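To pick just one from that list, the classic printf version looks something like this (illustrative C):

```c
#include <stdio.h>

/* One hazard from the list above: untrusted input used where a printf
 * format string belongs. Conversions like %s or %n in the input then
 * read (or, with %n, write) through arguments that were never passed. */
void log_bad(const char *user_input)
{
    printf(user_input);         /* format-string injection */
}

void log_good(const char *user_input)
{
    printf("%s", user_input);   /* input treated as data, not as a format */
}
```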
And then there are problems specific to memory unsafety: buffer overruns let you write arbitrary instructions into process memory, etc.
Did you forget that my original comment was specifically claiming that you should not use C because of buffer overruns? So that’s not a counter-point to my comment at all; it’s an argument for it.
My overall assertion was that if you’re writing a program in C, it will almost definitely become more secure if you rewrote it in a memory-safe language, with the exception of programs that are about security things: those programs might already have hard-won wisdom that you’d be giving up in a rewrite, so the trade-off is less clear.
I made a remark that if your C program doesn’t, itself, do “security stuff”, the only security issues will be from the choice of C. That’s not really correct, as you pointed out: you can surely do something very stupid like passing a user-provided query right to your database, or connecting to a user-provided URL, or whatever.
But if that’s the bar we’re setting, then that program definitely has no business being written in C (really at all, but still). There’s certainly no way it’s going to become less secure with a rewrite in a memory-safe language.
Your argument is essentially a form of “victim shaming” where we slap the programmers and tell them to be better and more careful engineers next time.
It is an escapism that stifles progress by conveniently opting to blame the person making the mistake, rather than the surrounding tools and environment that either enabled the error or failed to prevent it.
It can be applied to all sorts of other contexts including things such as car safety. You could stop making cars safer and just blame the drivers for not paying more attention, going too fast, drink driving, etc…
If we can improve our tools of the trade to reduce or - better yet - eliminate the possibility of mistakes and errors we should do it. If it takes another whole language to do it then so be it.
That’s similar to a car manufacturer using a different engine or chassis because somehow it reduces the accidents because of the properties that it has.
The way we can make that progress is exactly by blaming our “tools” as the “mistake enablers”. Not the person using the tools. Usually they’ve done their best in good faith to avoid a mistake. If they have still made one, that’s an opportunity for improvement of our tools.
Your argument is essentially “you can’t prevent bad engineering or silly programmer errors with technical means; this is a human problem that should be fixed at the human level”. I think this is the wrong way to look at it.
I think it’s all about the programmer’s mental bandwidth; humans are wonderful, intricate, and beautiful biological machines. But in spite of this we’re also pretty flawed and error-prone. Ask someone to do the exact same non-trivial thing every Wednesday afternoon for a year and chances are a large number of them will fail at least once to follow the instructions exactly. Usually this is okay because most things in life have fairly comfortable error margins and the consequences of failure are non-existent or very small, but for some things it’s a bit different.
This is why checklists are used extensively in aviation; it’s not because the pilots are dumb or inexperienced, it’s because it’s just so damn easy to forget something when dealing with these complex systems, and the margin for error is fairly low if you’re 2km up in the sky and the consequences can be very severe.
C imposes fairly high mental bandwidth: there are a lot of things you need to do “the right way” or you run into problems. I don’t think anyone is immune to forgetting something on occasion; who knows what happened with the Heartbleed thing; perhaps the programmer got distracted for a few seconds because the cat jumped on the desk, or maybe their spouse asked what they wanted for dinner tonight, or maybe they were in a bad mood that day, or maybe they … just forgot.
Very few people are in top form all day, every day. And if you write code every day then sooner or later you will make a mistake. Maybe it’s only once every five years, but if you’re working on something like OpenSSL the “make a silly mistake once every five years” won’t really cut it, just as it won’t for pilots.
The code is now finished and moves on to the reviewer(s); and the more they need to keep in mind when checking the code, the more chance there is they may miss something. Reviewing code is something I already find quite hard even with “easy” languages: how can I be sure that it’s “correct”? Doing a proper review takes almost as much time as writing the code itself (or longer!). The more you need to review/check for every line of code, the bigger the chance is that you’ll miss a mistake like this.
I don’t think that memory safety is some sort of panacea, or that it’s a fix for sloppy programming. But it frees up mental bandwidth: mistakes will be harder to make and their consequences less severe. It’s just one thing you don’t have to think about, and now you have more space to think about other aspects of the program (including security problems not related to memory safety).
@x64k mentioned PHP in another reply, and this suffers from the same problem; I’ve seen critical security fixes which consist of changing in_array($item, $list) to in_array($item, $list, true). That last parameter enables strict type checking (so that "1" is no longer considered equal to 1). The root cause of these issues is the same as in C: it takes too much bandwidth to get it right, every time, all the time.
NULLs have the same issue: you need to think “can this be NULL?” every time. It’s not a hard question, but sooner or later you’ll get it wrong, and asking it all the time takes up a lot of bandwidth probably best spent elsewhere.
That was a long time ago so lots of people don’t remember what OP is talking about anymore. The claim wasn’t that Java would magically solve all memory problems. That was back when the whole “scripting vs. systems” language dichotomy was all the rage and everyone thought everything would be written in TCL, Scheme or whatever in ten years or so. There was a more or less general expectation (read: lots of marketing material, since Java was commercially-backed, but certainly no shortage of independent tech evangelists) that, without pointers, all problems would go away – no more security issues, no more crashes and so on.
Unsurprisingly, neither of those happened, and Java software turned out to be crash-prone in its own unpleasant ways (there was a joke about how you can close a Java program if you can’t find the quit button: wiggle the mouse around, it’ll eventually throw an unhandled exception) in addition to good ol’ language-agnostic programmer error.
If you took an introductory course in C, you would learn how not to do that.
Yet somehow the cURL person/people made the mistake. Things slip by.
Java doesn’t allow one to reference data by its memory address, precisely to avoid this whole class of problems; why isn’t everyone raving about how that magically solved all problems? The answer is: because it obviously didn’t.
That, actually, was one of the biggest selling points of Java to C++ devs. It’s probably the biggest reason that Java is still such a dominant language today.
I also take issue with your whole message. You say that you can’t fix bad engineering by throwing a new language at it. But that’s an overgeneralization of the arguments being made. You literally can fix bad memory engineering by using a language that doesn’t allow it, whether that’s Java or Rust. In the meantime, you offer no solution other than “don’t do this thing that history has shown is effectively unavoidable in any sufficiently large and long-lived C program”. So what do you suggest instead? Or are we just going to wait for Heartbleed 2.0 and act surprised that it happened yet again in a C program?
Further, you throw out a complaint that we can’t prove that rewriting in Rust (or whatever) won’t make things worse than they currently are. We live in the real world- you can’t prove lots of things, but is there any reason to actually suspect that this is realistically possible?
This mindset of replacing a language to remove a class of errors is naive at best.
I’d rather say that your post is, charitably, naive at best (and it continuing to dominate the conversation is an unfortunate consequence of the removal of the ability to flag down egregiously incorrect posts, sadly).
I hate null with a passion and I do think Rust memory safety is a valuable feature. But let’s take the biggest problem of that class as an example: the Heartbleed bug. If you look at the vulnerable code, it is a very basic mistake. If you took an introductory course in C, you would learn how not to do that.
Do you really believe that the OpenSSL programmers (whatever else you can say about that project) lack an introductory knowledge of C? Do you feel the linux kernel devs, who have made identical mistakes, similarly lack an introductory knowledge of C? Nginx devs? Apache? Etc, etc.
This is an extraordinary claim.
You don’t solve bad engineering by throwing a hack at it such as changing the language.
Probably one of the most successful fields in history at profoundly reducing, and keeping low, error rates has been the Aviation industry, and the lesson of its success is that you don’t solve human errors by insisting that the people who made them would have known better if they’d just retaken an introductory course they’d long since covered, or in general would just be more perfect.
The Aviation industry realized that humans, no matter how well tutored and disciplined and focused, inevitably will still make mistakes, and that the only thing that reduces errors is looking at the mistakes that are made and then changing the system to account for those mistakes and reduce or eliminate their ability to recur.
When your dogma leads to you making extraordinary (indeed, ludicrous) claims, and the historical evidence points the polar opposite of the attitude you’re preaching being the successful approach, it’s past time to start reconsidering your premise.
The Aviation industry realized that humans, no matter how well tutored and disciplined and focused, inevitably will still make mistakes, and that the only thing that reduces errors is looking at the mistakes that are made and then changing the system to account for those mistakes and reduce or eliminate their ability to recur.
I’d like to stress that this is only one part of Aviation’s approach, at least as driven by the FAA and the NTSB in the US. The FAA also strives to create a culture of safety, by mandating investigations into incidents, requiring regular medical checkups depending on your pilot rating, releasing accident findings often, incentivizing record-keeping on both aircraft (maintenance books) and pilots (logbooks), encouraging pilots to share anonymous information on incidents that occurred with minor aircraft or no passenger impact, and many more. This isn’t as simple as tweaking the system. It’s about prioritizing safety at every step of the conversation.
Dropping the discussion of “the problem is human nature” in this comment. I’m explicitly not rhetorically commenting on it or implying such.
These “other parts”, and culture of safety - how would we translate that across into programming? Actually, come to think of it that’s probably not the first question. The first question is, is it possible to translate that across into programming?
I think it’s fair to say that in e.g. webdev people flat-out just value developer velocity over aerospace levels of safety because (I presume) faster development is simply more valuable in webdev than it is in aerospace - if the thing crashes every Tuesday you’ll lose money, but you won’t lose that much money. So, maybe it’s impractical to construct such a culture. Maybe. I don’t know.
But, supposing it is practical, what are we talking about? Record-keeping sounds like encouraging people to blog about minor accidents, I guess? But people posting blogs is useless if you don’t have some social structure for discussing the stuff, and I’m not sure what the analogous social structure would be here.
“Prioritizing safety at every step of the conversation” sounds like being able to say no to your boss without worry.
“This isn’t as simple as tweaking the system” sounds like you’re saying “treat this seriously and stop chronically underserving it both financially and politically”, which sounds to me like “aim for the high-hanging fruit of potential problems”, which I don’t think anyone with the word “monetize” in their job description will ever remotely consider.
What are the low-hanging fruit options in this “stop excessively focusing on low-hanging fruit options” mindset you speak of?
Actually, it sounds like that sort of thing would need some sort of government intervention in IT security or massive consumer backlash. Or more likely both, with the latter causing the former.
The first question is, is it possible to translate that across into programming?
It most certainly is. The “easiest” place to see evidence of this is to look into fields of high-reliability computing. Computing for power plants, aviation, medical devices, or space are all good examples. A step down would be cloud providers that do their best to provide high availability guarantees. These providers also spend a lot of engineering effort + processes in emphasizing reliability.
But, supposing it is practical, what are we talking about? Record-keeping sounds like encouraging people to blog about minor accidents, I guess? But people posting blogs is useless if you don’t have some social structure for discussing the stuff, and I’m not sure what the analogous social structure would be here.
Think of the postmortem processes posted on the blogs of the big cloud providers. This is a lot like the accident reports that the NTSB releases after an accident investigation. I think outside of the context of a single entity coding for a unified goal (whether that’s an affiliation of friends, a co-op, or a company), it’s tough to create a “culture” of any sort, because in different contexts of computing, different tradeoffs are desired. After all, I doubt you need a high reliability process to write a simple script.
“This isn’t as simple as tweaking the system” sounds like you’re saying “treat this seriously and stop chronically underserving it both financially and politically”, which sounds to me like “aim for the high-hanging fruit of potential problems”, which I don’t think anyone with the word “monetize” in their job description will ever remotely consider.
You’d be surprised how many organizations, both monetizing and not, have this issue. Processes become ossified; change is hard. Aiming for high-hanging fruit is expensive. But a mix of long-term thinking and short-term thinking is always the key to making good decisions, and in computing it’s no different. You have to push for change if you’re pushing against a current trend of unsafety.
What are the low-hanging fruit options in this “stop excessively focusing on low-hanging fruit options” mindset you speak of?
There needs to be a feedback mechanism between failure of the system and engineers creating the system. Once that feedback is in place, safety can be prioritized over time. Or at least, this is one way I’ve seen this done. There are probably many paths out there.
I think it’s fair to say that in e.g. webdev people flat-out just value developer velocity over aerospace levels of safety because (I presume) faster development is simply more valuable in webdev than it is in aerospace - if the thing crashes every Tuesday you’ll lose money, but you won’t lose that much money. So, maybe it’s impractical to construct such a culture. Maybe. I don’t know.
This here is the core problem. Honestly, there’s no reason to hold most software to a very high standard. If you’re writing code to scrape the weather from time to time from some online API and push it to a billboard, meh. What software needs to do is get a lot better about prioritizing safety in the applications that require it (and yes, that will require some debate in the community to come up with applications that require this safety, and yes there will probably be different schools of thought as there always are). I feel that security is a minimum, but beyond that, it’s all application specific. Perhaps the thing software needs the most now is just pedagogy on operating and coding with safety in mind.
A step down would be cloud providers that do their best to provide high availability guarantees. These providers also spend a lot of engineering effort + processes in emphasizing reliability.
Google’s SRE program and the SRE book being published for free are poster examples of promoting a culture of software reliability.
You don’t solve bad engineering by throwing a hack at it such as changing the language.
Yes, you absolutely do. One thing you can rely on is that humans will make mistakes. Even if they are the best, even if you pay them the most, even if you ride their ass 24 hours a day. Languages that make certain kinds of common mistakes uncommon or impossible save us from ourselves. All other things being equal, you’d be a fool not to choose a safer language.
I wrote a crypto library in C. It’s small, only 2K lines of code. I’ve been very diligent every step of the way (save one, for which I paid dearly). I reviewed the code several times over. There was even an external audit. And very recently, I’ve basically fixed dead code: copying a whopping 1KB, allocating and wiping a whole buffer, wasting lines of code, for no benefit whatsoever. Objectively a poor piece of code with a basic flaw.
I’m very careful with my library, and overall I’m very proud of its overall quality; but sometimes I’m just tired.
(As for why it wasn’t noticed: as bad as it was, the old code was correct, so it didn’t trigger any error.)
I’m half joking here but, indeed, if language-level memory safety were all it takes for secure software to happen, we could have been saved ages ago. We didn’t have to wait for Go, or Rust, or Zig to pop up. A memory-safe language with no null pointers, where buffer overflows, double-frees and use-after-free bugs are impossible, has been available for more than 20 years now, and the security track record of applications written in that language is a very useful lesson. That language is PHP.
I’m not arguing that (re)writing curl in Go or Rust wouldn’t eventually lead to a program with fewer vulnerabilities in this particular class, I’m just arguing that “this program is written in C and therefore not trustworthy because C is an unsafe language” is, at best, silly. PHP 4 was safer than Rust and boy do I not want to go back to dealing with PHP 4 applications.
Now of course one may argue that, just like half of curl’s vulnerabilities are C mistakes, half of those vulnerabilities were PHP 4 mistakes. But in that case, it seems a little unwise to wager that, ten years from now, we won’t have any “half of X vulnerabilities are Rust mistakes” blog posts…
Language-level anything isn’t all it takes, but from my experience they do help and they help much more than “a little”, and… I’ll split this in two.
The thing I’ve done that found the largest number of bugs ever was when I once wrote a script to look for methods (in a >100kloc code base) that had three properties: a) the method accepted at least one pointer parameter, b) it contained null in the code, and c) it did not mention null in the documentation for that method. Did that find all null-related errors? Far from it, and there were several false positives for each bug, and many of the bugs weren’t serious, but I used the output to fix many bugs in just a couple of days.
Did this fix all bugs related to null pointers? No, not even nearly. Could I have found and fixed them in other ways? Yes, I could. The other ways would have been slower, though. The script (or let’s call it a query) augmented my capability, in much the same way as many modern techniques augment programmers.
And this brings me to the second part.
We have many techniques that do nothing capable programmers can’t do. (I’ve written assembly language without any written specification, other documentation, unit tests or dedicated testers, and the code ran in production and worked. It can be done.)
That doesn’t mean that these techniques are superfluous. Capable programmers are short of time and attention; techniques that use CPU cycles, RAM and files, and that save brain time are generally a net gain.
That includes safe languages, but also things like linting, code queries, unit tests, writing documentation and fuzzing (or other white-noise tests). I’d say it also includes code review, which can be described as using other team members’ attention to reduce the total attention needed to deliver features/fix bugs.
Saying “this program is safe because it has been fuzzed” or “because it uses unit tests” doesn’t make sense. But “this program is unsafe because it does not use anything more than programmer brains” makes sense and is at least a reasonable starting assumption.
(The example I used above was a code query. A customer reported a bug, I hacked together a code query to find similar possible trouble spots, and found many. select functions from code where …)
PHP – like Go, Rust and many others out there – also doesn’t use anything more than programmer brains to avoid off-by-one errors, for example, which is one of the most common causes of bugs with or without security implications. Yet nobody rushes to claim that programs written in one of these languages are inherently unsafe because they rely on nothing but programmer brains to find such bugs.
As I mentioned above: I’m not saying these things don’t matter, of course they do. But conflating memory safety with software security or reliability is a bad idea. There’s tons of memory-safe code out there that has so many CVEs it’s not even funny.
But conflating memory safety with software security or reliability is a bad idea. There’s tons of memory-safe code out there that has so many CVEs it’s not even funny.
Who is doing this? The title of the OP is explicitly not conflating memory safety with software security. Like, can you find anyone with any kind of credibility conflating these things? Are there actually credible people saying, “curl would not have any CVEs if it were written in a memory safe language”?
It is absolutely amazing to me how often this straw man comes up.
EDIT: I use the word “credible” because you can probably find a person somewhere on the Internet making comments that support almost any kind of position, no matter how ridiculous. So “credible” in this context might mean, “an author or maintainer of software the other people actually use.” Or similarish. But I do mean “credible” in a broad sense. It doesn’t have to be some kind of authority. Basically, someone with some kind of stake in the game.
Just a few days ago there was a story on the lobste.rs front page whose author’s chief complaint about Linux was that its security was “not ideal”, the first reason for that being that “Linux is written in C, [which] makes security bugs rather common and, more importantly, means that a bug in one part of the code can impact any other part of the code. Nothing is secure unless everything is secure.” (Edit: which, to be clear, was in specific contrast to some Wayland compositor being written in Rust).
Yeah, I’m tired of it, too. I like and use Rust but I really dislike the “evangelism taskforce” aspect of its community.
I suppose “nothing is secure unless everything is secure” is probably conflating things. But saying that C makes security bugs more common doesn’t necessarily. In any case, is this person credible? Are they writing software that other people use?
I guess I just don’t understand why people spend so much time attacking a straw man. (Do you even agree that it is a straw man?) If someone made this conflation in a Rust space, for example, folks would be very quick to correct them that Rust doesn’t solve all security problems. Rust’s thesis is that it reduces them. Sometimes people get confused either because they don’t understand or because none are so enthusiastic as the newly converted. But I can’t remember anyone with credibility making this conflation.
Like, sure, if you see someone conflating memory safety with all types of security vulnerabilities, then absolutely point it out. But I don’t think it makes sense to talk about that conflation as a general phenomenon that is driving any sort of action. Instead, what’s driving that action is the thesis that many security vulnerabilities are indeed related to memory safety problems, and that using tools which reduce those problems in turn can eventually lead to more secure software. While some people disagree with that, it takes a much more nuanced argument and it sounds a lot less ridiculous than the straw man you’re tearing down.
Yeah, I’m tired of it, too. I like and use Rust but I really dislike the “evangelism taskforce” aspect of its community.
I’m more tired of people complaining about the “evangelism taskforce.” I see a lot more of that than I do the RESF.
Sorry, I think I should have made the context more obvious. I mean, let me start with this one, because I’d also like to clarify that a) I think Rust is good and b) that, as far as this particular debate is concerned, I think writing new things in Rust rather than C or especially C++ is a good idea in almost every case:
(Do you even agree that it is a straw man?)
What, that experienced software developers who know and understand Rust are effectively claiming that Rust is magic security/reliability dust? Oh yeah, I absolutely agree that it’s bollocks, I’ve seen very few people who know Rust and have more than a few years of real-life development experience in a commercial setting making that claim with a straight face. There are exceptions but that’s true of every technology.
But when it comes to the strike force part, here’s the thing:
If someone made this conflation in a Rust space, for example, folks would be very quick to correct them that Rust doesn’t solve all security problems.
…on the other hand, for a few years now it feels like outside Rust spaces, you can barely mention an OS kernel or a linker or a window manager or (just from a few days ago!) a sound daemon without someone showing up saying ugh, C, yeah, this is completely insecure, I wouldn’t touch it with a ten-foot pole. Most of the time it’s at least plausible, but sometimes it’s outright ridiculous – you see the “not written in Rust” complaint stuck on software that has to run on platforms Rust doesn’t even support, or that was started ten years ago and so on.
Most of them aren’t credible by your own standards or mine, of course, but they’re part of the Rust community whether they’re representative of the “authoritative” claims made by the Rust developers or not.
…on the other hand, for a few years now it feels like outside Rust spaces, you can barely mention an OS kernel or a linker or a window manager or (just from a few days ago!) a sound daemon without someone showing up saying ugh, C, yeah, this is completely insecure, I wouldn’t touch it with a ten-foot pole. Most of the time it’s at least plausible, but sometimes it’s outright ridiculous – you see the “not written in Rust” complaint stuck on software that has to run on platforms Rust doesn’t even support, or that was started ten years ago and so on.
As I mentioned in my below comment on this article, this is a good thing. I want people who decide to write a novel sound daemon in C to see those sorts of comments, and (ideally) rethink the decision to write a novel C program to begin with. Again, this doesn’t necessarily imply that Rust is the right choice of language for any given project, but it’s a strong contender right now.
Even now though, there still is significant tension between “don’t use C” and “make it portable”. Especially if you’re targeting embedded, or unknown platforms. C is still king of the hill as far as portability goes.
What we really want is to dethrone C at its own game: make something that eventually becomes even more portable. That’s possible: we could target C as a backend, and we could formally specify the language so it’s clear what’s a compiler bug (not to mention the possibility of writing formally verified compilers). Rust isn’t there yet.
One of the (many) reasons I don’t use C and use Rust instead is because it’s easier to write portable programs. I believe every Rust program I’ve written also works on Windows, and that has nearly come for free. Certainly a lot cheaper than if I had written it in C. I suppose people use “portable” to mean different things, but without qualification, your dichotomy doesn’t actually seem like a dichotomy. I suppose the dichotomy is more, “don’t use C” and “make it portable to niche platforms”?
I think we (as in both me and the parent poster) were talking about different kinds of portability. One of the many reasons why most of the software I work on is (still) in C rather than Rust is that, while every Rust program I’ve written works on Windows, lots of the ones I need have to work on architectures that are, at best, Tier 2. Proposing that we ship something compiled with a toolchain that’s only “guaranteed to build” would at best get me laughed at.
Yes. The point I’m making is that using the word “portable” unqualified is missing the fact that Rust lets you target one of the most popular platforms in the world at a considerably lower cost in lots of common cases. It makes the trade off being made more slanted than what folks probably intend by holding up “portability” as a ubiquitously good thing. Well, if we’re going to do that, we should acknowledge that there is a very large world beyond POSIX and embedded, and that world is primarily Windows.
For the record, if I were writing desktop GUI applications or games, of course the only relevant platforms are Windows, Linux, and MacOSX. Or Android and iOS, if the application is meant for palmtops. From there “portability” just means I chose middleware that have backends for all the platforms I care about. Rust, with its expanded standard library, does have an edge.
If however I’m writing a widely applicable library (like a crypto library), then Rust suddenly doesn’t look so good any more. Because I know for a fact that many people still work on platforms that Rust doesn’t support yet. Not to mention the build dependency on Rust itself. So either I still use C, and I have more reach, or I use Rust, and I have more safety (not by much if I test my C code correctly).
Well, if we’re going to do that, we should acknowledge that there is a very large world beyond POSIX and embedded, and that world is primarily Windows.
Of course, my C library is also going to target Windows. Not doing so would defeat the point.
I don’t think I strongly disagree with anything here. It’s just when folks say things like this
C is still king of the hill as far as portability goes.
I would say, “welllll I’m not so sure about that, because I can reach a lot more people with less effort using Rust than I can with C.” Because if I use C, I now need to write my own compatibility layer between my application and the OS in order to support a particularly popular platform: Windows.
And of course, this depends on your target audience, the problem you’re solving and oodles of other things, as you point out. But there’s a bit of nuance here because of how general the word “portable” is.
Yeah, I was really talking about I/O free libraries. I believe programs should be organised in 3 layers:
At the bottom, you have I/O free libraries, that depend on nothing but the compiler and maybe the standard library. That lack of dependency can make them extremely portable, and easy to integrate to existing projects. The lack of I/O makes them easy to test, so they have the potential to be very reliable, even if they’re written in an unsafe language.
In the middle, you have the I/O compatibility layer. SDL, Qt, Libuv, Rust’s stdlib, even hand written, take your pick. That one cannot possibly be portable, because it has to depend on the quirks of the underlying platform. But it can have several backends, which make the users of this compatibility layer quite portable.
At the top, you have the application, which depends on the I/O free library and the compatibility layer. It cannot target platforms the compatibility layer doesn’t target, but at least it should be fairly small (maybe 10 times smaller than the I/O free libraries?), so if a rewrite is needed it shouldn’t be that daunting.
I believe C is still a strong contender for the bottom layer. There specifically, it is still the king of portability. For the middleware and the top layer however, the portability of the language means almost nothing, so it’s much harder to defend using C there.
Also note that the I/O free libraries can easily be application specific, and not intended for wider distribution. In that case, C also loses its edge, as (i) portability matters much less, and (ii) it’s still easier to use a safe language than write a properly paranoid test suite.
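To make the bottom layer concrete, here’s a sketch of what such an I/O-free routine might look like (hypothetical example, not from any particular library):

```c
#include <stddef.h>

/* A hypothetical bottom-layer routine in this style: pure computation
 * over caller-supplied buffers. No allocation, no I/O, no globals, so
 * it is trivially portable and easy to test exhaustively. */
int hex_encode(char *dst, size_t dst_len,
               const unsigned char *src, size_t src_len)
{
    static const char digits[] = "0123456789abcdef";
    if (dst_len < src_len * 2 + 1)
        return -1;                      /* caller's buffer too small */
    for (size_t i = 0; i < src_len; i++) {
        dst[2 * i]     = digits[src[i] >> 4];
        dst[2 * i + 1] = digits[src[i] & 0x0f];
    }
    dst[2 * src_len] = '\0';
    return 0;
}
```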
I was talking about “runs on a 16-bit micro controller as well as a 64-bit monster”. The kind where you might not have any I/O, or even a heap allocator. The kind where you avoid undefined behaviour and unspecified behaviour and implementation defined behaviour.
Hard, but possible for some programs. Crypto libraries (without the RNG) for instance are pure computation, and can conform to that highly restricted setting. I’ll even go a bit further: I think over 95% of programs can be pure computation, and be fully separated from the rest (I/O, system calls, networking and all that).
If you want to print stuff on a terminal, portability drops. If you want to talk to the network or draw pixels on the screen, portability in C is flat out impossible, because the required capabilities aren’t in the standard library. I hear Rust fares far better in that department.
I suppose the dichotomy is more, “don’t use C” and “make it portable to niche platforms”?
OpenBSD has tier 3 support for Rust, which basically means no support. You assess how “niche” OpenBSD really is, especially in a security context.
Yes, I would absolutely say OpenBSD is a niche platform. I generally don’t care if the stuff I write works on OpenBSD. I just don’t. I care a lot more that it runs on Windows though. If more people used OpenBSD, then I’d care more. That’s the only reason I care about Windows. It’s where the people are.
Niche doesn’t mean unimportant.
To pop up a level, I was making a very narrow point on a particular choice of wording. Namely, that “Rust isn’t as portable as C” is glossing over some really significant nuance depending on what you’re trying to do. If all you’re trying to do is distribute a CLI application, then Rust might not let you target as many platforms as easily as C, but it might let you reach more people with a lot less effort.
I want people who decide to write a novel sound daemon in C to see those sorts of comments, and (ideally) rethink the decision to write a novel C program to begin with.
What in the world makes you think they haven’t considered that question and concluded that C, for all its shortcomings, was nonetheless their best option?
stuck on software that has to run on platforms Rust doesn’t even support
Porting Rust to a platform sounds more achievable than writing correct software in C, so the only thing ridiculous is that people think “I haven’t ported it” is a valid excuse.
Lots of people. Search for “heartbleed C” or “heartbleed memory safety” or “heartbleed rust”.
Are there actually credible people saying, “curl would not have any CVEs if it were written in a memory safe language”?
They are not credible to me if they make such absurd claims, but they exist in very large numbers. They won’t claim that all problems would go away, but they all point out that Heartbleed wouldn’t have happened if OpenSSL was written in Rust (for example). Yes, there are hundreds of such claims on the web. Thousands, probably. As if a basic error like the one that led to Heartbleed could only take the form of a memory safety problem.
As for credibility. I don’t find your definition very useful. There is a lot of software used by millions, much of it genuinely useful that is still badly engineered. I don’t think popularity is a good indicator for credibility.
Can you actually show me someone who is claiming that all security vulnerabilities will be fixed by using Rust or some other memory safe language that would meet your standard of credibility if it weren’t for that statement itself?
I tried your search queries and I found nobody saying or implying something like, “using a memory safe language will prevent all CVEs.”
but they all point out that Heartbleed wouldn’t have happened if OpenSSL was written in Rust (for example)
This is a very specific claim though. For the sake of argument, if someone were wrong about that specific case, that doesn’t mean they are conflating memory safety for all security vulnerabilities. That’s what I’m responding to.
As for credibility. I don’t find your definition very useful. There is a lot of software used by millions, much of it genuinely useful that is still badly engineered. I don’t think popularity is a good indicator for credibility.
So propose a new one? Sheesh. Dispense with the pointless nitpicking and move the discussion forward. My definition doesn’t require something to be popular. I think I was pretty clear in my comment what I was trying to achieve by using the word “credible.” Especially after my edit. I even explicitly said that I was trying to use it in a very broad sense. So if you want to disagree, fine. Then propose something better. Unless you have no standard of credibility. In which case, I suppose we’re at an impasse.
I tried your search queries and I found nobody saying or implying something like, “using a memory safe language will prevent all CVEs.”
I never made such a claim. You are insisting on the whole “prevent all CVEs”. That is an extreme point that I never made, nor did any other people in this thread. If you take it to that extreme, then sure, you are right. I never claimed that people say that Rust will magically do their laundry either. Please let’s keep the discussion at a level of reasonability so it stays fruitful.
FWIW, for “heartbleed rust”, google returns this in the first page:
Would the Cloudbleed have been prevented if Rust was used
How to Prevent the next Heartbleed
Would Rust have prevented Heartbleed? Another look
All these are absurd. It is not my point to shame or blame. I have no idea who the author of the offending Heartbleed code is. And in all honesty, we have all made mistakes. But let’s not take relativism to the absurd. Let’s be clear: it was objectively very poorly written code with a trivial error. A seasoned engineer should look at that and immediately see the problem. If you think that level of quality is less problematic when you use a ‘safer’ language, you are in for very bad surprises. It was objectively bad engineering, nothing less. The language had nothing to do with that. A problem of the same severity would have the same probability of occurring in Rust; it would just take another form. The claims in the titles I quoted from my Google search are silly.
If you jump from a plane without a parachute you also prevent the whole class of accidents that happen when the chute opens up. I am sure people understand that that is a silly claim.
This is a very specific claim though. For the sake of argument, if someone were wrong about that specific case, that doesn’t mean they are conflating memory safety for all security vulnerabilities. That’s what I’m responding to.
Again, no one is claiming that the Rust community is conflating memory safety with “all safety vulns”. I have no clue where you got that from.
But to the point, it is as specific as it is pointless, as is the parachute example.
I didn’t say you did. But that’s the claim I’m responding to.
You are insisting on the whole “prevent all CVEs”.
I didn’t, no. I was responding to it by pointing out that it’s a straw man. It sounds like you agree. Which was my central point.
nor did any other people in this thread
No, they did:
But conflating memory safety with software security or reliability is a bad idea. There’s tons of memory-safe code out there that has so many CVEs it’s not even funny.
The rest of your comment is just so far removed from any reality that I know that I don’t even know how to engage with it. It sounds like you’re in the “humans just need to be better” camp. I’m sure you’ve heard the various arguments about why that’s not a particularly productive position to take. I don’t have any new arguments to present.
Specifically, I (sort of) claimed that and expanded upon it here. And I’m just going to re-emphasise what I said in that comment.
I see from your profile that you’re a member of the Rust library team – I imagine most of the interactions you have within the Rust community are with people who are actively involved in building the Rust environment. That is, people who have the expertise (both with Rust and other aspects software development), the skill, and the free time to make substantial contributions to a game-changing technology, and who are therefore extremely unlikely to claim anything of that sort.
So I understand why this looks like a straw man argument to you – but this is not the Rust-related interaction that many people have. I was gonna say “most” but who knows, maybe I just got dealt a bad hand.
Most of “us” (who don’t know/use Rust or who, like me, don’t use it that much) know it via the armchair engineering crowd that sits on the sides, sneers at software like the Linux kernel and casually dismisses it as insecure just for being written in C, with the obvious undertone that writing it in Rust would make it secure. Like this or like this or like this.
They’re not dismissing it as memory unsafe and the undertone isn’t that (re)writing it in Rust would plug the memory safety holes. When they propose that some 25 year-old piece of software be rewritten in Rust today, the idea, really, is that if you start today, you’ll have something that’s more secure, whatever the means, in like one or two years.
That’s why there are people who want RIIR flags. Not to avoid useful discussion with knowledgeable members of the Rust community like you, but to avoid dismissive comments from the well ackshually crowd who thinks about something for all of thirty seconds and then knows exactly where and why someone is wrong about a project they’ve been working on for five years.
I imagine most of the interactions you have within the Rust community are with people who are actively involved in building the Rust environment.
Not necessarily. It depends on the day. I am also on the moderation team. So I tend to get a narrow view on library matters and a very broad view on everything else. But I also frequent r/rust (not an “official” Rust space), in addition to HN and Lobsters.
That is, people who have the expertise (both with Rust and other aspects software development), the skill, and the free time to make substantial contributions to a game-changing technology, and who are therefore extremely unlikely to claim anything of that sort.
Certainly. I am under no illusion about that. I don’t think you or anyone was saying “core Rust engineers have made ridiculous claim Foo.” That’s why I was asking for more data. I wanted to hear about any credible person who was making those claims.
FWIW, you didn’t just do this with Rust. You kinda did it with Java too in another comment:
There was a more or less general expectation (read: lots of marketing material, since Java was commercially-backed, but certainly no shortage of independent tech evangelists) that, without pointers, all problems would go away – no more security issues, no more crashes and so on.
I mean, like, really? All problems? I might give you that there were maybe some marketing materials that, by virtue of omission, gave that impression that Java solved “all problems.” But, a “general expectation”? I was around back then too, and I don’t remember anything resembling that.
But, like, Java did solve some problems. It came with some of its own, not all of which were as deeply explored as they are today.
See, the thing is, when you say hyperbolic things like this, it makes your side of the argument a lot easier to make. Because making this sort of argument paints the opposing side as patently ridiculous, and this in turn removes the need to address the nuance in these arguments.
So I understand why this looks like a straw man argument to you – but this is not the Rust-related interaction that many people have. I was gonna say “most” but who knows, maybe I just got dealt a bad hand.
Again. If you see someone with a misconception like this—they no doubt exist—then kindly point it out. But talking about it as a sort of general phenomenon just seems so misguided to me. Unless it really is a general phenomenon, in which case, I’d expect to be able to observe at least someone building software that others use on the premise that switching to Rust will fix all of their security problems. Instead, what we see are folks like the curl author making a very careful analysis of the trade offs involved here. With data.
RE https://news.ycombinator.com/threads?id=xvilka: I can’t tell if they’re a troll, but they definitely post low effort comments. I’d downvote most of them if I saw them. I downvote almost any comment that is entirely, “Why didn’t you write it in X language?” Regrettably, it can be a legitimate question for beginners to ask, since a beginner’s view of the world is just so narrow, nearly by definition.
I note that the first and third links you gave were downvoted quite a bit. So that seems like the system is working. And that there aren’t hordes of people secretly in favor of comments like that and upvoting them.
FWIW, I don’t recognize any of these people as Rust community members. Or rather, I don’t recognize their handles. And as a moderator, I am at least passively aware of pretty much anyone that frequents Rust spaces. Because I have to skim a lot of content.
I’m not sure why we have descended into an RESF debate. For every RESF comment you show, I could show you another anti-RESF Lobsters’ comment.
It’s just amazing to me that folks cannot distinguish between the zeal of the newly converted and the actual substance of the idea itself. Like, look at this comment in this very thread. Calling this post “RIIR spam,” even though it’s clearly not.
They’re not dismissing it as memory unsafe and the undertone isn’t that (re)writing it in Rust would plug the memory safety holes. When they propose that some 25 year-old piece of software be rewritten in Rust today, the idea, really, is that if you start today, you’ll have something that’s more secure, whatever the means, in like one or two years.
But that’s very different than what you said before. It doesn’t necessarily conflate memory safety with security. That’s a more nuanced representation of the argument and it is much harder to easily knock down (if at all). It’s at least true enough that multiple groups of people with a financial stake have made a bet on that being true. A reasonable interpretation of “more secure” is “using Rust will fix most or nearly all of the security vulnerabilities that we have as a result of memory unsafety.” Can using Rust also introduce new security vulnerabilities unrelated to memory safety by virtue of the rewrite? Absolutely. But whether this is true or not, and to what extent, really depends on a number of nuanced factors.
That’s why there are people who want RIIR flags. Not to avoid useful discussion with knowledgeable members of the Rust community like you, but to avoid dismissive comments from the well ackshually crowd who thinks about something for all of thirty seconds and then knows exactly where and why someone is wrong about a project they’ve been working on for five years.
The “well ackshually” crowd exists pretty much everywhere. Rust perhaps has a higher concentration of them right now because it’s still new. But they’re always going to be around. I’ve been downvoting and arguing with the “well ackshually” crowd for years before I even knew what Rust was.
If you see a dismissive comment that isn’t contributing to the discussion, regardless of whether it’s about RIIR or not, flag it. I have absolutely no problem with folks pointing out low effort RESF comments that express blind enthusiasm for a technology. Trade offs should always be accounted for. My problem is that the anti-RESF crowd is not doing just that. They are also using it as a bludgeon against almost anything that involves switching to Rust. This very post, by Curl’s author, is not some low effort RESF bullshit. (I should say that not all RESF bullshit is trolling. I’ve come across a few folks that are just new to the world of programming. So they just don’t know how to see the nuance in things yet, even if it’s explicitly stated. There’s only so much novel signal a brain can take in at any point. Unfortunately, it’s difficult to differentiate between sincere but misguided beginners and trolls. Maybe there are people other than trolls and beginners posting RESF bullshit, but I don’t actually know who they are.)
Either way, if we get RIIR flags, then we should get anti-RIIR flags. See where that leads? Nowhere good. Because people can’t seem to differentiate between flippant comments and substance.
Sorry I got a bit ranty, but this whole thread is just bush league IMO.
Every once in a while someone suggests to me that curl and libcurl would do better if rewritten in a “safe language”. Rust is one such alternative language commonly suggested. This happens especially often when we publish new security vulnerabilities.
I try to keep away from these bush league threads myself but, uh, sometimes you just go with it, and this was one of those cases precisely because of that context.
I’ve been slowly trying to nudge people into using Rust on embedded systems ever since I gave up on Ada, so I’m not trying to dismiss it, I have quite an active interest in it. Yet I’ve been at the receiving end of “you should rewrite that in a safe language” many times, too, like most people writing firmware. And I don’t mean on lobster.rs (which has a level-headed audience, mostly :P), I mean IRL, too. Nine times out of ten these discussions are bullshit.
That’s because nine times out of ten they’re not carried out with people who are really knowledgeable about Rust and firmware development. E.g. I get long lectures about how it’ll vastly improve firmware reliability by eliminating double frees and dangling pointers. When I try to point out that, while this is true in general, and Rust’s memory model is genuinely helpful in embedded systems (the borrow checker is great!), this particular problem is a non-issue for us, because on our embedded system all allocations are static and we never get a double free because we don’t even malloc – I get long lectures about how two years from now everything will be AArch64 anyway and memory space won’t be an issue.
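For anyone who hasn’t done firmware work, that pattern looks something like this (a hypothetical sketch, not any particular codebase):

    #include <stdint.h>

    struct packet { uint8_t data[64]; uint8_t len; };

    /* Every buffer is reserved at compile time; there is no malloc or
       free to misuse, so double frees are impossible by construction. */
    static uint8_t rx_buffer[512];     /* fixed receive buffer */
    static struct packet pool[8];      /* fixed object pool; "allocating"
                                          means picking a free slot */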
(Edit: to be clear – I definitely don’t support “RIIR” flags or anything of the sort, and indeed “the system” works, as in, when one of the RIIR trolls pops up, they get downvoted into oblivion, whether they’re deliberately trolling or just don’t know better. I’m just trying to explain where some of the negativity comes from, and why, in my personal experience, it’s often hard to hold back on it even when you actually like Rust and want to use it more!)
I mean, like, really? All problems? I might give you that there were maybe some marketing materials that, by virtue of omission, gave that impression that Java solved “all problems.” But, a “general expectation”? I was around back then too, and I don’t remember anything resembling that.
Oh, yeah, that was my first exposure to hype, and it gave me a healthy dose of skepticism towards tech publications. I got to witness that as a part of the (budding, in my part of the world) tech journalism scene (and then, to some degree, through my first programming gigs). The cycle went basically as follows:
There were lots of talks and articles and books on Java between ’95 and ’97-‘98 (that was somewhat before my time but that’s the material I learned Java from later) that always opened with two things: it’s super portable (JVM!) and there are no pointers, so Java programs are less likely to crash due to bad memory accesses and are less likely to have security problems.
These were completely level-headed and obviously correct. Experienced programmers got it and even those who didn’t use Java nonetheless emulated some of the good ideas in their own environments.
Then 2-4 years later we got hit by all the interns and lovely USENET flamers who’d grown up on stories they didn’t really understand about Java and didn’t really qualify these statements.
So then I spent about two years politely fending off suggestions about articles on how net appliances and thin clients are using Java because it’s more secure and stable, on why everyone is moving to Java and C++ will only be used for legacy applications and so on – largely because I really didn’t understand these things well enough, but man am I glad I let modesty get the better of me. Most of my colleagues didn’t budge, either, but there was a period during which I read a “Java programs don’t crash” article every month because at least one – otherwise quite respectable – magazine would publish one.
Aye. Thanks for sharing. I totally get your perspective. Talking about your projects with folks only to have to get into an exhausting debate that you’ve had umpteen times already is frustrating. Happens to me all the time too, for things outside of Rust. So I know the feeling. It happens in the form of, “why didn’t you do X instead of Y?” Depending on how it’s phrased, it can feel like a low-brow dismissal. The problem is that that line of questioning is also a really great way of getting a better understanding of the thing you’re looking at in a way that fits into your own mental model of the world. Like, for example, at work I use Elasticsearch. If I see a new project use SOLR, I’m going to be naturally curious as to why they chose it over Elasticsearch. I don’t give two poops about either one personally, but maybe they have some insight into the two that I don’t have, and updating my mental model would be nice. The problem is that asking the obvious question comes across as a dismissal. It’s unfortunate. (Of course, sometimes it is a dismissal. It’s not always asked in good faith. Sometimes it’s coupled with a healthy dose of snobbery, and those folks can just fuck right off.)
It’s harder to do IRL, but the technique I’ve adopted is that when someone asks me questions like that, I put about as much effort into the response as they did the question. If they’re earnest and just trying to understand, then my hope is that they might ask more follow up questions, and then it might become a nice teaching moment. But most of the time, it’s not.
I guess I would just re-iterate that my main issue with the anti-RIIR crowd is that it’s overbroad. If it were just some groaning about trolls, then fine. But it’s brought up pretty much any time Rust is brought up, even if bringing Rust up is appropriate.
But I suppose that’s the state of the Internet these days. Tribes are everywhere and culture wars can’t be stopped.
And in all honesty, we have all made mistakes. But let’s not take relativism to the absurd. Let’s be clear: it was objectively very poorly written code with a trivial error. A seasoned engineer should look at that and immediately see the problem. If you think that level of quality is less problematic when you use a ‘safer’ language, you are in for very bad surprises. It was objectively bad engineering, nothing less. The language had nothing to do with that. A problem of the same severity would have the same probability of occurring in Rust; it would just take another form.
What form would it take? Would it be in the “all private keys in use on the internet can be leaked”-form? Probably not, I think?
Anyway, let’s forget about security for a minute; why wouldn’t you want the computer to automate memory management for you? Our entire job as programmers is automate things and the more things are automated, the better as it’s less work for us. This is why we write programs and scripts in the first place.
Traditionally automated memory management has come with some trade-offs (i.e. runtime performance hits due to GCs) and Rust attempts to find a solution which automates things without these drawbacks. This seems like a good idea to me, because it’s just more convenient: I want the computer to do as much work for me as possible; that’s its job.
Back to security: if I have a door that can be securely locked by just pressing a button vs. a door that can be securely locked only by some complicated procedure, then the first door is more secure, as it’s easier to use. Sooner or later people will invariably make some mistake in the second door’s procedure. Does that mean the first door guarantees security? No, of course not. You might forget to close the window, or you might even forget to press that button. But it sure reduces the number of things you need to do to lock up securely, and the chances you get it right are higher.
Kinda sorta. PHP is/was largely written in C itself and IIRC had its share of memory related security bugs from just feeding naughty data to PHP standard library functions.
I don’t know what PHP is like today from that point of view.
So, I take issue when you say:
I’m just arguing that “this program is written in C and therefore not trustworthy because C is an unsafe language” is, at best, silly. PHP 4 was safer than Rust and boy do I not want to go back to dealing with PHP 4 applications.
I still think it’s completely justifiable to be skeptical of a program written in C. Just because another language may also be bad/insecure/whatever does not invalidate the statement or sentiment that C is a dangerous language that honestly brings almost nothing to the table in the modern era.
Interestingly, there’s a different language from a very similar time as C that has more safety features, and its name is Pascal. There was a time when they were kinda competing, as far as I understood – maybe especially around the era of Turbo C and Turbo Pascal, and then Delphi. Somehow C won, with the main argument being, I believe, “because performance” – at least that’s how I remember it. My impression is that quite often, when faced with a performance vs. security choice, “the market” chooses performance over security. I don’t have hard data as to whether code written in Pascal was more secure than code written in C; I’d be curious to see such a comparison. I do have a purely anecdotal memory that when some software struck me as remarkably stable, it often turned out to be written in Pascal. Obviously it was still totally possible to write buggy programs in Pascal; I think Delphi code had a characteristic kind of error message that I saw often enough to learn to recognize. Notably, Pascal also still required manual memory management – but I believe it guarded better against buffer overruns etc. than C.
I thought the reasons for C’s wide adoption were Unix and the university system, i.e., the universities were turning out grads who knew Unix and C. I’ve only heard good things about the performance of Turbo Pascal.
Pascal is safer, but it was certainly possible to write buggy Pascal. Back in the early 90s I hung out on bulletin boards and played a lot of Trade Wars 2002. That was written in Turbo Pascal, and it had a few notable and widely exploited bugs over the years. One such was a signed overflow of a 16-bit integer. I won a couple Trade Wars games by exploiting those kinds of bugs.
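In C terms, that bug class looks something like this (made-up numbers, not the actual Trade Wars code):

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        int16_t credits = 32000;   /* 16-bit signed counter */
        credits += 1000;           /* exceeds INT16_MAX (32767); wraps to
                                      -32536 on two's-complement targets */
        printf("%d\n", credits);
        return 0;
    }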
This cartoon deserves an entire blog post! But I’ll just list out the gymnastic routines of statically-typed language proponents:
The type system doesn’t line up with our imagined type theory, so we must work in a Platonic ideal theory which doesn’t actually reflect computational behavior, and philosophically justify why this is acceptable
Oh! I don’t do this to look smart. I do this because I see in you the same tribalism that I once had, and I only grew past it because of evidence like the links I shared with you. I’m not trying to say that static typing is wrong; I’m trying to expand and enrich your knowledge about type theory. Once you’ve reached a certain altitude and vantage, then you’ll see that static and dynamic typing are not tribes which live in opposition, but ways of looking at the universal behaviors of computation.
Please, read and reflect. Otherwise this entire thread was off-topic: Your first post is not a reply to its parent, but a tangent that allowed you to display your tribal affiliation. I don’t mind being off-topic as long as it provides a chance to improve discourse.
The (popular) mistake you are making is that you pretend things are equal when they are not, just like shitting your pants (untyped) and trying to not shit your pants (typed) are not positions with similar merit.
No, those are all serious refutations of the claim that statically-typed languages are a panacea, and they’re a lot more insightful than a silly comic someone made to find affirmation among their Twitter followers.
Here are the only economically viable solutions I see to the “too much core infrastructure has recurring, exploitable memory unsafety bugs” problem (mainly because of C):
Gradually update the code with annotations – something like Checked C. Or even some palatable subset/enhancement of C++.
Distros evolve into sandboxing based on the principle of least privilege (DJB style, reusing Chrome/Mozilla sandboxes, etc.) – I think this is one of the most economically viable solutions. You still have memory unsafety but the resulting exploits are prevented (there has been research quantifying this)
The product side of the industry somehow makes a drastic shift and people don’t use kernels and browsers anymore (very unlikely, as even the “mobile revolution” reused a ton of infrastructure from 10, 20, 30, 40 years ago, both on the iOS and Android side)
(leaving out hardware-based solutions here since I think hardware changes slower than software)
(I have looked at some of the C to Rust translators, and based on my own experience with translating code and manually rewriting it, I’m not optimistic about that approach. The target language has to be designed with translation in mind, or you get a big mess.)
Manually rewriting code in Rust or any other language is NOT on that list. It would be nice but I think it’s a fantasy. There’s simply too much code, and too few people to rewrite it.
Moreover with code like bash, to a first approximation there’s 1 person who understands it well enough to rewrite it (and even that person doesn’t really understand his own code from years ago, and that’s about what we should expect, given the situation).
Also, the most infamous bash vulnerability (ShellShock) was not related to memory unsafety at all. Memory unsafety is really a subset of the problem with core infrastructure.
Sometime around 2020 I made a claim that in 2030 the majority of your kernel and your browser will still be in C or C++ (not to mention most of your phone’s low level stack, etc.).
Knowing what the incentives are and the level of resources devoted to the problem, I think we’re still on track for that.
I’m honestly interested if anyone would take the opposite side: we can migrate more than 50% of our critical common infrastructure by 2030.
This says nothing about new projects written in Rust of course. For core infrastructure, the memory safety + lack of GC could make it a great choice. But to a large degree we’ll still be using old code. Software and especially low level infrastructure has really severe network effects.
I agree with you. It’s just not going to happen that we actually replace most of the C code that is out there. My hope, however, is that C (and C++) becomes the next COBOL in the sense that it still exists, there are still people paid to work on systems written in it, but nobody is starting new projects in it.
Along the same lines as your first bullet point, I think a big step forward – and the best “bang for our buck” – will be people doing fuzz testing on all of these old C projects. There was just recently a story making the rounds here and on HN about some bug in… sudo? maybe? that led to a revelation that the commit which caused the regression was a bugfix for a bug that had no test, before or after the change. So, not only did the change introduce a bug that wasn’t caught by a test that didn’t exist – we can’t even be sure that it really fixed the issue it claimed to, or that we really understood the issue, or whatever.
My point is that these projects probably “should” be rewritten in Rust or Zig or whatever. But there’s much lower hanging fruit. Just throw these code bases through some sanitizers, fuzzers, whatever.
Yeah, that’s basically what OSS Fuzz has been doing since 2016: throwing a little money at projects to integrate continuous fuzzing. I haven’t heard many updates on it, but in principle it seems like the right thing.
The project page, though, doesn’t look particularly active, and doesn’t seem to show any results (?).
Rather than talk about curl and “RIIR” (which isn’t going to happen soon even if the maintainer wants it to), it would be better to talk about if curl is doing everything it can along the other lines.
Someone mentioned that curl is the epitome of bad 90’s C code, and I’ve seen a lot of that myself. There is a lot of diversity in the quality of C code out there, and often the sloppiest C projects have a lot of users.
Prominent examples are bash, PHP, Apache, etc. They code fast and sloppy and are responsive to their users.
There’s a fundamental economic problem that a lot of these discussions are missing. Possible/impossible or feasible/infeasible is one thing; whether it will actually happen is a different story.
Bottom line is that I think there should be more talk about projects along these lines, more talk about sandboxing and principle of least privilege, and less talk about “RIIR”.
Manually rewriting code in Rust or any other language is NOT on that list.
Nah. This kinda assumes that the rewrite/replacement/whatever would happen due to technical reasons.
It certainly wouldn’t. If a kernel/library/application gets replaced by a safer implementation, it’s for business reasons, where it just happens that the replacement is written in e.g. Rust.
So yes, I fully expect a certain number of rewrites to happen, just not for the reasons you think.
I mean, there is no silver bullet, right? We can all agree with that? So, therefore, “just apply more sagacious thinking” isn’t going to fix anything just as “switch to <Rust|D|C#|&C>” won’t? The focus on tools seems to miss the truth that this is a human factors problem, and a technological solution isn’t going to actually work.
One of curl’s main advantages is its ubiquity; I can run it on an OpenWRT router, a 32-bit ARMv7 OpenBSD machine, a POWER9 machine, and even Illumos/OpenIndiana. It’s a universal toolkit. It also runs in extremely constrained and underpowered environments.
Do you know of a memory-safe language that fits the bill (portability and a tiny footprint)? Rust fails on the former and Java fails on the latter. Go might work (gccgo and cgo combined have a lot of targets and TinyGo can work in constrained environments), but nowhere as well as C.
When somebody pretends that C is fine for security-critical software because “anybody as smart as me doesn’t write ${x} bugs”, there’s no point arguing with them. Their logic is correct, the problem is that their assumptions are wrong; nobody is as good at programming as they think they are.
I don’t disagree that curl would be better off in rust, but curl is really a shining example of 90s super bloaty/legitimately terrible C code. Any rewrite would eliminate half its vulns (probably more), including a rewrite in C.
Bloat:
> wc -l src/**.c src/**.h lib/**.c lib/**.h
159299 total
More code = more bugs, and 160k LOC to do not much more than connect(); use_tls_library(); write(); read(); close(); is insane. You can implement a 90% use-case HTTP client in < 100 lines without golfing. TLS/location header/keep-alive/websockets make that 99%+ and are also fairly straightforward.
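For a sense of scale, here’s roughly what that bare-bones client could look like – a plain-HTTP sketch with hypothetical names (no TLS, no redirects, minimal error handling), not a drop-in curl replacement:

    #include <netdb.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    /* Fetch http://host/path and dump the raw response to stdout. */
    int http_get(const char *host, const char *path) {
        struct addrinfo hints = {0}, *res;
        hints.ai_socktype = SOCK_STREAM;
        if (getaddrinfo(host, "80", &hints, &res) != 0)
            return -1;

        int fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
        if (fd < 0) {
            freeaddrinfo(res);
            return -1;
        }
        if (connect(fd, res->ai_addr, res->ai_addrlen) < 0) {
            close(fd);
            freeaddrinfo(res);
            return -1;
        }
        freeaddrinfo(res);

        char req[512];
        snprintf(req, sizeof(req),
                 "GET %s HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n",
                 path, host);
        write(fd, req, strlen(req));

        char buf[4096];
        ssize_t n;
        while ((n = read(fd, buf, sizeof(buf))) > 0)
            fwrite(buf, 1, (size_t)n, stdout);

        close(fd);
        return 0;
    }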
Then let’s pick a file at random and take a look: connect.c
entire file is full of nested ifdefs which themselves are nested inside regular flow control
a bunch of ad-hoc string parsing
entire file is littered with platform specific details (maybe this gets a pass because it’s sockets related, but it’s still a lot worse than it could be)
For whatever reason they folded like 20 functions into a single varargs function, so nothing is typechecked, including pointers that the function writes to. So you use int instead of long to store the HTTP response code by accident and curl trashes 4 bytes of memory, and by definition the compiler can’t catch it.
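Concretely – this is libcurl’s real variadic curl_easy_getinfo() interface, and the int-vs-long mistake described above compiles without a peep:

    #include <curl/curl.h>

    int main(void) {
        CURL *curl = curl_easy_init();
        if (!curl)
            return 1;
        curl_easy_setopt(curl, CURLOPT_URL, "https://example.com/");
        curl_easy_perform(curl);

        int code;   /* BUG: CURLINFO_RESPONSE_CODE writes through a long* */
        curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &code);
        /* No warning: the varargs prototype hides the type mismatch.
           On an LP64 platform, 8 bytes land in a 4-byte int, trashing
           the adjacent 4 bytes of stack. */

        curl_easy_cleanup(curl);
        return 0;
    }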
I put curl in the same box as openssl a long time ago. Extremely widely used networking infrastructure, mostly written by one guy in their free time 20 years ago, and kneecapped by its APIs. Kinda surprised it didn’t get any attention during the heartbleed frenzy.
160k loc to do not much more than connect(); use_tls_library(); write(); read(); close(); is insane.
This is really unfair; running curl --help will show you what else curl can do other than just making an HTTPS request. Regardless of whether that makes sense, you can use curl to send and receive emails!!! In an older project, I remember we tried many approaches to sending and receiving email reliably, talking to a variety of email servers with their particular bugs and quirks, and shelling out to curl turned out to be a very robust method…
I interpreted the parent comment’s point about 160k LOC as targeting the fact that most uses of curl hit that narrow code path – and therefore most of that 160k LOC is around lesser-used features.
Because curl is semi-ubiquitous, or at least has a stable enough interface and is easy to download and install without major dependency issues, it ends up being used all over the place and relied upon in ways that will never be fully understood.
It’s a great tool, and has made my life so much easier over the years for testing and exploring, but perhaps it’s time for a cut-down tool that does only the bare minimum required for the most common curl use case(s), giving us a way to get the functionality with less risk.
This does seem to confirm something I’ve been thinking about lately while making my own PL, which is that array/slice handling should probably be a feature of all languages, since it’s so often what we’re doing when dealing with I/O, and it can be error-prone. And not just the concept, but an appropriate selection of utility functions for all of the use cases one could have with arrays/slices in terms of copying/moving/whatnot. I wouldn’t say that Rust did a stellar job of the last one, since their pace of adding library functions is glacial at best. (Professional Rust programmer btw)
It’s really nice when projects do this sort of historical analysis. For reference-sake, I’ve applied CWE (Common Weakness Enumeration) IDs to the categories Daniel identified. The breakdown, with CWE IDs, is:
This is a post by the author of cURL. I think it has a bit more credence than any random “cURL should be rewritten in Rust” post. And from the post:
This post is not meant as a discussion around how we can rewrite C code into other languages to avoid these problems. This is an introspection of the C related vulnerabilities in curl. curl will not be rewritten but will continue to support backends written in other languages.
Regardless of your opinion on “RIIR spam”, this isn’t it.
Having witnessed several “this language will make EVERYTHING secure and reliable” mass hysterias over the years (Java, several C++ standards, Go) I really think it would be a shame to filter things by the [rust] tag. It’s a legitimately neat language and it’s worth keeping up with it even if you don’t use it every day. All technologies have fanboys.
We really need a “thing that triggers interminable C/Rust rants” tag (/snark)
Here are 249 issues with the word null in them. Randomly clicking through many of them makes it seem like they’re issues that would be resolved by any other language with optionals/nicer null handling.
But that’s over 5 years, so I guess we get this kind of error about once a week? Though lots of these seem to be really basic things, so why isn’t the tooling the article mentions catching them? Is there not a way for someone to, I dunno, layer the TypeScript type system over C? Cuz at least TypeScript’s typechecker actually works for the most part.
That stat about bugs being present for like 7 years though… yikes. Who knows what’s out there in various tools.
We really need a “thing that triggers interminable C/Rust rants” tag (/snark)
This but unironically. I want it to be very normal for people to write about specific ways that C causes security vulnerabilities in programs, as this blog post’s author does. I want everyone’s immediate association of “C” in the context of programming languages to be the risk of memory-safety-related security bugs. Rust is a very reasonable C alternative for many use cases, and that’s a major reason Rust is an important programming language. But it’s more important for programmers to avoid using C, than for them to switch to Rust specifically. What I don’t want is for the entire enterprise of talking about C-related security vulnerabilities to be seen as suspect because it invites rants. Rants, in this context, are good, and they will continue to be good as long as substantial portions of the software in widespread use around us is written in C.
not many people are writing new programs in C if they can help it
how much value is there in a cp rewrite, really?
The past couple of weeks involved trying to deal with low level stuff under various contexts, and I ended up realizing that C does have a legit advantage over Rust (in particular) in that it kinda “just works”. You stuff some C files into a location and run compilation and linking (without needing to define a project and a bunch of noise, or needing to pull in 100 dependencies).
It has a very Python feel to me, in that you really can go wild with macros, and not worry too much about details when trying to get a thing working. It starts falling apart at the seams later, but … I dunno. Bit of a rant. Zig feels the closest to having a simple model for all of this, but I do think it’s important to have a “low expectations” systems language.
not many people are writing new programs in C if they can help it
I’m currently in the middle of porting a fairly niche open-source program written in 2018-2019 from C to Rust, in part because I want to make some changes to it, and I don’t want to deal with C. I don’t have the data to tell you how many people are writing novel C programs today, as a proportion of all programmers writing any kind of program. But people are definitely doing it.
I would not claim that C compilation “just works”. Any C project complex enough to have a Makefile is complex enough to potentially have compilation and linker errors when I download it and try to compile it in my local environment. C also definitely has dependency management, in the sense that if you don’t get your system set up correctly and your gcc flags in the right order, your project will fail to compile with a confusing error message. I’ve debugged plenty of such build processes, and I would much rather deal with cargo in the Rust ecosystem (although I don’t want to claim that there are never issues running cargo build in any given local environment either – dependency management, linking, etc. are complicated processes that can go wrong in any number of ways in any language).
To be 100% clear, I was mainly referring to the fixed cost of stuff like cargo. Like, yeah, you’re right that C doesn’t actually just work. Neither does Python. But I install requests globally in my Python env and it’s now available everywhere. I kinda like C cuz for really small things you pay very little in fixed cost. You’re basically right that for anything beyond a single file, people should really just pay the cost (unless they have no choice in the matter due to architecture issues or the like).
I think a lot about how C extensions can be built in Python. No fuss at all (you need the right headers installed, but honestly global lib installs are pretty easy). I imagine it might be possible to get similar effects in Rust with rustc? But I haven’t seen it.
mypyc also works by generating a bunch of C files then building them all out, and I think a big part of it being able to work is because you are working on a file-by-file basis.
I guess this is a call to action to see if we can make Rust or variants also work this simply. And looking at the rustc docs, I… kinda feel like it’s possible! Will try this out in the future.
But I install requests globally in my Python env and it’s now available everywhere.
Yeah, but Python projects frequently use venvs because they depend on a version of Python or of some package that might be different from the system version, or different from what other packages on the same system use. So having a system-wide installation of requests doesn’t do me much good, since I can’t guarantee that any given Python project I want to run can make use of it (or will try to).
The even tougher one is “how much effort does it take to write a portable cp from scratch?”
Even OpenBSD’s cp, which is a lot lighter than the GNU Coreutils implementation, isn’t trivial at all, and it’s seen plenty of fixes.
Looking at something like Heartbleed, it feels like if we would all just take two years of our lives and rewrite all that stuff in a language that isn’t C, things would be much better. Practical experience shows that it would take at least twice as much to even get things working well enough to consider using them in production – let alone weed out the security bugs.
The thing you do to increase confidence doesn’t have to be changing languages. Changing languages can help (and obviously for anyone who’s read my other stuff, I am a big fan of Rust), but it’s also often a big step (I recommend doing it incrementally, or in new components rather than rewrites). So do other stuff! Do static analysis! Strengthen your linting! Do dynamic analysis! More testing! Formal methods for the things that need it! Really, just start.
Excellent point. Coverity is a really, really good tool; I do wish there were an open-source equivalent so more people could learn about static analysis.
Be careful with that. Overflowing a buffer even by a single byte can be exploitable.
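A sketch of the classic pattern (hypothetical code, not from curl):

    #include <string.h>

    void copy_name(const char *src) {
        char buf[16];
        if (strlen(src) <= sizeof(buf)) {   /* BUG: should be '<' */
            strcpy(buf, src);               /* a 16-char name writes its
                                               terminating NUL to buf[16],
                                               one byte past the end */
        }
    }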
Here’s what I find interesting: 42 of those “C mistakes” (which of course are a minority of the real bugs) are actually range errors; the other 9 are miscellaneous. That’s like 82%.
You don’t have to go too far from C to catch those. Like D (yes, I gotta get on the language evangelism bandwagon too!) just bundled the length and pointer into the syntax sugar of type[], and then the compiler inserts automatic bounds checks at runtime. Most C functions take a pointer and length anyway, so this approach is easy and makes a pretty big improvement. I think other C descendants do something similar. I guess putting that in C itself as an extension would be tricky, though, since while there’s usually a length around, it isn’t necessarily easy to see which pointer it ties to.
Excellent point, and C written by brilliant paranoid people does exactly that. It’s not common practice though.
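Something along these lines – a hypothetical sketch of the pointer+length idiom, which is essentially what D’s type[] gives you automatically:

    #include <assert.h>
    #include <stddef.h>

    typedef struct {
        unsigned char *ptr;
        size_t len;
    } slice;

    /* Checked access: the assert is the bounds check that D's compiler
       inserts for you on every indexing operation. */
    static unsigned char slice_at(slice s, size_t i) {
        assert(i < s.len);
        return s.ptr[i];
    }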
This mindset of replacing a language to remove a class of errors is naive at best.
I hate null with a passion and I do think Rust’s memory safety is a valuable feature. But let’s take the biggest problem of that class as an example: the Heartbleed bug. If you look at the vulnerable code, it is a very basic mistake. If you took an introductory course in C, you would learn how not to do that.
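(For reference, heavily simplified and with hypothetical names – this is not the literal OpenSSL code – the pattern was to trust an attacker-supplied length when echoing a heartbeat payload back:)

    #include <string.h>

    /* claimed_len comes straight from the request; the actual payload
       may be far shorter than it claims. */
    void heartbeat_reply(unsigned char *reply,
                         const unsigned char *payload,
                         unsigned int claimed_len) {
        /* BUG: claimed_len is never checked against the real payload
           size, so adjacent heap memory gets copied into the reply. */
        memcpy(reply, payload, claimed_len);
    }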
To argue that the solution was just a matter of using a language that doesn’t allow for that kind of error is to defend an impossible solution. Without doubting the good intentions of whomever wrote that piece of code, let us call a spade a spade: it was objectively poor code with basic flaws.
You don’t solve bad engineering by throwing a hack at it such as changing the language. It will manifest itself in the form of other classes of bugs and there is no evidence whatsoever that the outcome isn’t actually worse than the problem one is trying to fix.
Java doesn’t allow one to reference data by its memory address, precisely to avoid this whole class of problems, why isn’t everyone raving about how that magically solved all problems? The answer is: because it obviously didn’t.
I love curl and use it intensively, but this post goes down that whole mindset. Running scripts to find bugs and so on.
I’m not convinced by this argument. Large C and C++ projects seem to always have loads of memory vulns. Either they’re not caused by bad programming or bad programming is inevitable.
I think the core question of whether memory unsafe languages result in more vulnerable code can probably be answered with data. The only review I’m aware of is this fairly short one by a Rust contributor, but there are probably others: https://alexgaynor.net/2020/may/27/science-on-memory-unsafety-and-security/
Good article; the writer sums it up brilliantly:
There should be a corollary: until you have the evidence, don’t bother with hypothetical notions that rewriting 10 million lines of C in another language would fix more bugs than it introduces.
Agreed. But nuance is deserved on “both sides” of the argument.
It’s fair to say that rewriting 10 million lines of C in a memory-safe language will, in fact, fix more memory bugs than it introduces (because it fixes them all and won’t introduce any).
It’s also fair to acknowledge that memory bugs are not the only security bugs and that security bugs aren’t the only important bugs.
It’s not fair to say that it’s literally impossible for a C program to ever be totally secure.
My tentative conclusion is this: If your C program is not nominally related to security, itself, then it very likely will become more secure by rewriting in Rust/Zig/Go/whatever. In other words, if there are no crypto or security algorithms implemented in your project, then the only real source of security issues is from C, itself (or your dependencies, of course).
If your C program is related to security in purpose, as in sudo, a crypto library, a password manager, etc., then the answer is a lot less clear. Many venerable C projects have the advantage of time: they’ve been around forever and have lots of battle testing. It’s likely that if they stay stable and don’t have a lot of code churn, they won’t introduce many new security bugs over time.
I don’t think this is true. All sorts of programs accept untrusted input, not just crypto or security projects, and almost any code that handles untrusted input will have all sorts of opportunities to be unsafe, regardless of implementation language.
Theoretically yes. But, in practice, if you’re not just passing a user-provided query string into your database, it’s much, MUCH, harder for bad input to pose a security threat. What’s the worst they can do- type such a long string that you OOM? I can be pretty confident that no matter what they type, it’s not going to start writing to arbitrary parts of my process’s memory.
It’s not just databases, tho, it’s any templating or code generation that uses untrusted input.
Do you generate printf format strings, filesystem paths, URLs, HTML, db queries, shell commands, markdown, yaml, config files, etc? If so, you can have escaping issues.
And then there are problems specific to memory unsafety: buffer overruns let you write arbitrary instructions into process memory, etc.
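The printf case doesn’t even need code generation to go wrong; passing untrusted input as the format string is enough. A minimal sketch:

    #include <stdio.h>

    void log_message(const char *user_input) {
        printf(user_input);        /* BUG: "%x %x" in the input leaks
                                      stack contents; "%n" even writes
                                      to memory */
    }

    void log_message_safe(const char *user_input) {
        printf("%s", user_input);  /* format is constant; input is data */
    }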
Did you forget that my original comment was specifically claiming that you should not use C because of buffer overruns? So that’s not a counter-point to my comment at all- it’s an argument for it.
My overall assertion was that if you’re writing a program in C, it will almost definitely become more secure if you rewrote it in a memory-safe language, with the exception of programs that are about security things- those programs might already have hard-won wisdom that you’d be giving up in a rewrite, so the trade-off is less clear.
I made a remark that if your C program doesn’t, itself, do “security stuff”, that the only security issues will be from the choice of C. That’s not really correct, as you pointed out- you can surely do something very stupid like passing a user-provided query right to your database, or connect to a user-provided URL, or whatever.
But if that’s the bar we’re setting, then that program definitely has no business being written in C (really at all, but still). There’s certainly no way it’s going to become less secure with a rewrite in a memory-safe language.
Your argument is essentially a form of “victim shaming” where we slap the programmers and tell them to be better and more careful engineers next time.
It is an escapism that stifles progress by conveniently opting to blame the person making the mistake, rather than the surrounding tools and environment that either enabled the error or failed to prevent it.
It can be applied to all sorts of other contexts including things such as car safety. You could stop making cars safer and just blame the drivers for not paying more attention, going too fast, drink driving, etc…
If we can improve our tools of the trade to reduce or - better yet - eliminate the possibility of mistakes and errors we should do it. If it takes another whole language to do it then so be it.
That’s similar to a car manufacturer switching to a different engine or chassis because its properties somehow reduce accidents.
The way we can make that progress is exactly by blaming our “tools” as the “mistake enablers”. Not the person using the tools. Usually they’ve done their best in good faith to avoid a mistake. If they have still made one, that’s an opportunity for improvement of our tools.
Your argument is essentially “you can’t prevent bad engineering or silly programmer errors with technical means; this is a human problem that should be fixed at the human level”. I think this is the wrong way to look at it.
I think it’s all about the programmer’s mental bandwidth; humans are wonderful, intricate, and beautiful biological machines. But in spite of this we’re also pretty flawed and error-prone. Ask someone to do the exact same non-trivial thing every Wednesday afternoon for a year, and chances are a large number of them will fail at least once to follow the instructions exactly. Usually this is okay because most things in life have fairly comfortable error margins and the consequences of failure are non-existent or very small, but for some things it’s a bit different.
This is why checklists are used extensively in aviation; it’s not because the pilots are dumb or inexperienced, it’s because it’s just so damn easy to forget something when dealing with these complex systems, and the margin for error is fairly low if you’re 2km up in the sky and the consequences can be very severe.
C imposes fairly high mental bandwidth: there are a lot of things you need to do “the right way” or you run in to problems. I don’t think anyone is immune to forgetting something on occasion; who knows what happened with the Heartbleed thing; perhaps the programmer got distracted for a few seconds because the cat jumped on the desk, or maybe their spouse asked what they wanted for dinner tonight, or maybe they were in a bad mood that day, or maybe they … just forgot.
Very few people are in top form all day, every day. And if you write code every day then sooner or later you will make a mistake. Maybe it’s only once every five years, but if you’re working on something like OpenSSL the “make a silly mistake once every five years” won’t really cut it, just as it won’t for pilots.
The code is now finished and moves on to the reviewer(s), and the more they need to keep in mind when checking the code, the more chance there is they may miss something. Reviewing code is something I already find quite hard even with “easy” languages: how can I be sure that it’s “correct”? Doing a proper review takes almost as much time as writing the code itself (or longer!). The more you need to review/check for every line of code, the bigger the chance you’ll miss a mistake like this.
I don’t think that memory safety is some sort of panacea, or that it’s a fix for sloppy programming. But it frees up mental bandwidth: mistakes become harder to make and their consequences less severe. It’s just one thing you don’t have to think about, and now you have more space to think about other aspects of the program (including security problems not related to memory safety).
@x64k mentioned PHP in another reply, and this suffers from the same problem; I’ve seen critical security fixes which consist of changing in_array($item, $list) to in_array($item, $list, true). That last parameter enables strict type checking (so that “1” == 1 is false). The root cause of these issues is the same as in C: it imposes too much bandwidth to get it right, every time, all the time.
NULLs have the same issue: you need to think “can this be NULL?” every time. It’s not a hard question, but sooner or later you’ll get it wrong, and asking it all the time takes up a lot of bandwidth probably best spent elsewhere.
Java does magically solve all memory problems. People did rave about garbage collection: garbage collection is in fact revolutionary.
That was a long time ago so lots of people don’t remember what OP is talking about anymore. The claim wasn’t that Java would magically solve all memory problems. That was back when the whole “scripting vs. systems” language dichotomy was all the rage and everyone thought everything would be written in TCL, Scheme or whatever in ten years or so. There was a more or less general expectation (read: lots of marketing material, since Java was commercially-backed, but certainly no shortage of independent tech evangelists) that, without pointers, all problems would go away – no more security issues, no more crashes and so on.
Unsurprisingly, neither of those happened, and Java software turned out to be crash-prone in its own unpleasant ways (there was a joke about how you can close a Java program if you can’t find the quit button: wiggle the mouse around, it’ll eventually throw an unhandled exception) in addition to good ol’ language-agnostic programmer error.
Yet somehow the cURL person/people made the mistake. Things slip by.
That, actually, was one of the biggest selling points of Java to C++ devs. It’s probably the biggest reason that Java is still such a dominant language today.
I also take issue with your whole message. You say that you can’t fix bad engineering by throwing a new language at it. But that’s an over generalization of the arguments being made. You literally can fix bad memory engineering by using a language that doesn’t allow it, whether that’s Java or Rust. In the meantime, you offer no solution other than “don’t do this thing that history has shown is effectively unavoidable in any sufficiently large and long-lived C program”. So what do you suggest instead? Or are we just going to wait for heartbleed 2.0 and act surprised that it happened yet again in a C program?
Further, you throw out a complaint that we can’t prove that rewriting in Rust (or whatever) won’t make things worse than they currently are. We live in the real world- you can’t prove lots of things, but is there any reason to actually suspect that this is realistically possible?
I’d rather say that your post is, charitably, naive at best (and it continuing to dominate the conversation is an unfortunate consequence of the removal of the ability to flag down egregiously incorrect posts, sadly).
Do you really believe that the OpenSSL programmers (whatever else you can say about that project) lack an introductory knowledge of C? Do you feel the linux kernel devs, who have made identical mistakes, similarly lack an introductory knowledge of C? Nginx devs? Apache? Etc, etc.
This is an extraordinary claim.
Probably one of the most successful fields in history at profoundly reducing error rates, and keeping them low, has been the Aviation industry, and the lesson of its success is that you don’t solve human error by insisting that the people who made the mistakes would have known better if they’d just retaken an introductory course they’d already long since covered, or, in general, by demanding that they just be more perfect.
The Aviation industry realized that humans, no matter how well tutored and disciplined and focused, inevitably will still make mistakes, and that the only thing that reduces errors is looking at the mistakes that are made and then changing the system to account for those mistakes and reduce or eliminate their ability to recur.
When your dogma leads to you making extraordinary (indeed, ludicrous) claims, and the historical evidence points the polar opposite of the attitude you’re preaching being the successful approach, it’s past time to start reconsidering your premise.
I’d like to stress that this is only one part of Aviation’s approach, at least as driven by the FAA and the NTSB in the US. The FAA also strives to create a culture of safety, by mandating investigations into incidents, requiring regular medical checkups depending on your pilot rating, releasing accident findings often, incentivizing record-keeping on both aircraft (maintenance books) and pilots (logbooks), encouraging pilots to share anonymous information on incidents that occurred with minor aircraft or no passenger impact, and many more. This isn’t as simple as tweaking the system. It’s about prioritizing safety at every step of the conversation.
A fair point. And of course all of this flies directly in the face of just yelling “be more perfect at using the deadly tools!” at people.
Yup, I meant this more to demonstrate what it takes to increase safety in an organized endeavor.
Dropping the discussion of “the problem is human nature” in this comment. I’m explicitly not rhetorically commenting on it or implying such.
These “other parts”, and culture of safety - how would we translate that across into programming? Actually, come to think of it that’s probably not the first question. The first question is, is it possible to translate that across into programming?
I think it’s fair to say that in e.g. webdev people flat-out just value developer velocity over aerospace levels of safety because (I presume) faster development is simply more valuable in webdev than it is in aerospace – if the thing crashes every Tuesday you’ll lose money, but you won’t lose that much money. So, maybe it’s impractical to construct such a culture. Maybe. I don’t know.
But, supposing it is practical, what are we talking about? Record-keeping sounds like encouraging people to blog about minor accidents, I guess? But people posting blogs is useless if you don’t have some social structure for discussing the stuff, and I’m not sure what the analogous social structure would be here.
“Prioritizing safety at every step of the conversation” sounds like being able to say no to your boss without worry.
“This isn’t as simple as tweaking the system” sounds like you’re saying “treat this seriously and stop chronically underserving it both financially and politically”, which sounds to me like “aim for the high-hanging fruit of potential problems”, which I don’t think anyone with the word “monetize” in their job description will ever remotely consider.
What are the low-hanging fruit options in this “stop excessively focusing on low-hanging fruit options” mindset you speak of?
Actually, it sounds like that sort of thing would need sort of government intervention in IT security or massive consumer backlash. Or more likely both, with the latter causing the former.
It most certainly is. The “easiest” place to see evidence of this is to look into fields of high-reliability computing. Computing for power plants, aviation, medical devices, or space are all good examples. A step down would be cloud providers that do their best to provide high availability guarantees. These providers also spend a lot of engineering effort + processes in emphasizing reliability.
Think of the postmortem processes posted on the blogs of the big cloud providers. These are a lot like the accident reports that the NTSB releases after an accident investigation. I think outside of the context of a single entity coding for a unified goal (whether that’s an affiliation of friends, a co-op, or a company), it’s tough to create a “culture” of any sort, because in different contexts of computing, different tradeoffs are desired. After all, I doubt you need a high-reliability process to write a simple script.
You’d be surprised how many organizations, both monetizing and not, have this issue. Processes become ossified; change is hard. Aiming for high-hanging fruit is expensive. But a mix of long-term thinking and short-term thinking is always the key to making good decisions, and in computing it’s no different. You have to push for change if you’re pushing against a current trend of unsafety.
There needs to be a feedback mechanism between failure of the system and engineers creating the system. Once that feedback is in place, safety can be prioritized over time. Or at least, this is one way I’ve seen this done. There are probably many paths out there.
This here is the core problem. Honestly, there’s no reason to hold most software to a very high standard. If you’re writing code to scrape the weather from time to time from some online API and push it to a billboard, meh. What software needs to do is get a lot better about prioritizing safety in the applications that require it (and yes, that will require some debate in the community to come up with applications that require this safety, and yes there will probably be different schools of thought as there always are). I feel that security is a minimum, but beyond that, it’s all application specific. Perhaps the thing software needs the most now is just pedagogy on operating and coding with safety in mind.
Google’s SRE program and the SRE book being published for free are poster examples of promoting a culture of software reliability.
Yes, you absolutely do. One thing you can rely on is that humans will make mistakes. Even if they are the best, even if you pay them the most, even if you ride their ass 24 hours a day. Languages that make certain kinds of common mistakes uncommon or impossible save us from ourselves. All other things being equal, you’d be a fool not to choose a safer language.
I wrote a crypto library in C. It’s small, only 2K lines of code. I’ve been very diligent every step of the way (save one, for which I paid dearly). I reviewed the code several times over. There was even an external audit. And very recently, I fixed what was basically dead code: copying a whopping 1KB, allocating and wiping a whole buffer, wasting lines of code, for no benefit whatsoever. Objectively a poor piece of code with a basic flaw.
I’m very careful with my library, and overall I’m very proud of its overall quality; but sometimes I’m just tired.
(As for why it wasn’t noticed: as bad as it was, the old code was correct, so it didn’t trigger any error.)
All programmers are bad programmers then, otherwise why do we need compiler error messages?
Software apparently can’t just be written correctly the first time.
I’m half joking here but, indeed, if language-level memory safety were all it takes for secure software to happen, we could have been saved ages ago. We didn’t have to wait for Go, or Rust, or Zig to pop up. A memory-safe language with no null pointers, where buffer overflow, double-frees and use-after-free bugs are impossible, has been available for more than 20 years now, and the security track record of applications written in that language is a very useful lesson. That language is PHP.
I’m not arguing that (re)writing curl in Go or Rust wouldn’t eventually lead to a program with fewer vulnerabilities in this particular class, I’m just arguing that “this program is written in C and therefore not trustworthy because C is an unsafe language” is, at best, silly. PHP 4 was safer than Rust and boy do I not want to go back to dealing with PHP 4 applications.
Now of course one may argue that, just like half of curl’s vulnerabilities are C mistakes, half of those vulnerabilities were PHP 4 mistakes. But in that case, it seems a little unwise to wager that, ten years from now, we won’t have any “half of X vulnerabilities are Rust mistakes” blog posts…
Language-level anything isn’t all it takes, but in my experience language-level guarantees do help, and they help much more than “a little”, and… I’ll split this in two.
The thing I’ve done that found the largest number of bugs ever was when I once wrote a script to look for methods (in a >100kloc code base) that had three properties: a) the method accepted at least one pointer parameter, b) contained null in the code, and c) did not mention null in the documentation for that method. Did that find all null-related errors? Far from it, and there were several false positives for each bug, and many of the bugs weren’t serious, but I used the output to fix many bugs in just a couple of days.
Did this fix all bugs related to null pointers? No, not even nearly. Could I have found and fixed them in other ways? Yes, I could. The other ways would have been slower, though. The script (or let’s call it a query) augmented my capability, in much the same way as many modern techniques augment programmers.
And this brings me to the second part.
We have many techniques that do nothing capable programmers can’t do. (I’ve written assembly language without any written specification, other documentation, unit tests or dedicated testers, and the code ran in production and worked. It can be done.)
That doesn’t mean that these techniques are superfluous. Capable programmers are short of time and attention; techniques that use CPU cycles, RAM and files, and that save brain time are generally a net gain.
That includes safe languages, but also things like linting, code queries, unit tests, writing documentation and fuzzing (or other white-noise tests). I’d say it also includes code review, which can be described as using other team members’ attention to reduce the total attention needed to deliver features/fix bugs.
Saying “this program is safe because it has been fuzzed” or “because it uses unit tests” doesn’t make sense. But “this program is unsafe because it does not use anything more than programmer brains” makes sense and is at least a reasonable starting assumption.
(The example I used above was a code query. A customer reported a bug, I hacked together a code query to find similar possible trouble spots, and found many. `select functions from code where …`)
PHP – like Go, Rust and many others out there – also doesn’t use anything more than programmer brains to avoid off-by-one errors, for example, which is one of the most common causes of bugs with or without security implications. Yet nobody rushes to claim that programs written in one of these languages are inherently unsafe because they rely on nothing but programmer brains to find such bugs.
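To make that concrete, here is a sketch of the kind of off-by-one that no memory-safe language fixes for you (written in C, but the same logic bug ports verbatim to PHP, Go or Rust; the function is hypothetical):

```c
/* The loop stays in bounds, so there is no memory error, no crash,
   and no panic to catch -- the result is just silently wrong. */
int sum_all(const int *a, int n) {
    int s = 0;
    for (int i = 0; i < n - 1; i++)  /* BUG: should be i < n */
        s += a[i];
    return s;                        /* last element never counted */
}
```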
As I mentioned above: I’m not saying these things don’t matter, of course they do. But conflating memory safety with software security or reliability is a bad idea. There’s tons of memory-safe code out there that has so many CVEs it’s not even funny.
Who is doing this? The title of the OP is explicitly not conflating memory safety with software security. Like, can you find anyone with any kind of credibility conflating these things? Are there actually credible people saying, “curl would not have any CVEs if it were written in a memory safe language”?
It is absolutely amazing to me how often this straw man comes up.
EDIT: I use the word “credible” because you can probably find a person somewhere on the Internet making comments that support almost any kind of position, no matter how ridiculous. So “credible” in this context might mean, “an author or maintainer of software the other people actually use.” Or similarish. But I do mean “credible” in a broad sense. It doesn’t have to be some kind of authority. Basically, someone with some kind of stake in the game.
Just a few days ago there was a story on the lobste.rs front page whose author’s chief complaint about Linux was that its security was “not ideal”, the first reason for that being that “Linux is written in C, [which] makes security bugs rather common and, more importantly, means that a bug in one part of the code can impact any other part of the code. Nothing is secure unless everything is secure.” (Edit: which, to be clear, was in specific contrast to some Wayland compositor being written in Rust).
Yeah, I’m tired of it, too. I like and use Rust but I really dislike the “evangelism taskforce” aspect of its community.
I suppose “nothing is secure unless everything is secure” is probably conflating things. But saying that C makes security bugs more common doesn’t necessarily. In any case, is this person credible? Are they writing software that other people use?
I guess I just don’t understand why people spend so much time attacking a straw man. (Do you even agree that it is a straw man?) If someone made this conflation in a Rust space, for example, folks would be very quick to correct them that Rust doesn’t solve all security problems. Rust’s thesis is that it reduces them. Sometimes people get confused either because they don’t understand or because none are so enthusiastic as the newly converted. But I can’t remember anyone with credibility making this conflation.
Like, sure, if you see someone conflating memory safety with all types of security vulnerabilities, then absolutely point it out. But I don’t think it makes sense to talk about that conflation as a general phenomenon that is driving any sort of action. Instead, what’s driving that action is the thesis that many security vulnerabilities are indeed related to memory safety problems, and that using tools which reduce those problems can in turn eventually lead to more secure software. While some people disagree with that, it takes a much more nuanced argument and it sounds a lot less ridiculous than the straw man you’re tearing down.
I’m more tired of people complaining about the “evangelism taskforce.” I see a lot more of that than I do the RESF.
Sorry, I think I should have made the context more obvious. I mean, let me start with this one, because I’d also like to clarify that a) I think Rust is good and b) that, as far as this particular debate is concerned, I think writing new things in Rust rather than C or especially C++ is a good idea in almost every case:
What, that experienced software developers who know and understand Rust are effectively claiming that Rust is magic security/reliability dust? Oh yeah, I absolutely agree that it’s bollocks, I’ve seen very few people who know Rust and have more than a few years of real-life development experience in a commercial setting making that claim with a straight face. There are exceptions but that’s true of every technology.
But when it comes to the strike force part, here’s the thing:
…on the other hand, for a few years now it feels like outside Rust spaces, you can barely mention an OS kernel or a linker or a window manager or (just from a few days ago!) a sound daemon without someone showing up saying ugh, C, yeah, this is completely insecure, I wouldn’t touch it with a ten-foot pole. Most of the time it’s at least plausible, but sometimes it’s outright ridiculous – you see the “not written in Rust” complaint stuck on software that has to run on platforms Rust doesn’t even support, or that was started ten years ago and so on.
Most of them aren’t credible by your own standards or mine, of course, but they’re part of the Rust community whether they’re representative of the “authoritative” claims made by the Rust developers or not.
Fair enough. Thanks for the reply!
As I mentioned in my comment below on this article, this is a good thing. I want people who decide to write a novel sound daemon in C to see those sorts of comments, and (ideally) rethink the decision to write a novel C program to begin with. Again, this doesn’t necessarily imply that Rust is the right choice of language for any given project, but it’s a strong contender right now.
Even now though, there still is significant tension between “don’t use C” and “make it portable”. Especially if you’re targeting embedded, or unknown platforms. C is still king of the hill as far as portability goes.
What we really want is to dethrone C at its own game: make something that eventually becomes even more portable. That’s possible: we could target C as a backend, and we could formally specify the language so it’s clear what’s a compiler bug (not to mention the possibility of writing formally verified compilers). Rust isn’t there yet.
One of the (many) reasons I don’t use C and use Rust instead is because it’s easier to write portable programs. I believe every Rust program I’ve written also works on Windows, and that has nearly come for free. Certainly a lot cheaper than if I had written it in C. I suppose people use “portable” to mean different things, but without qualification, your dichotomy doesn’t actually seem like a dichotomy. I suppose the dichotomy is more, “don’t use C” and “make it portable to niche platforms”?
I think we (as in both me and the parent poster) were talking about different kinds of portability. One of the many reasons why most of the software I work on is (still) in C rather than Rust is that, while every Rust program I’ve written works on Windows, lots of the ones I need have to work on architectures that are, at best, Tier 2. Proposing that we ship something compiled with a toolchain that’s only “guaranteed to build” would at best get me laughed at.
Yes. The point I’m making is that using the word “portable” unqualified misses the fact that Rust lets you target one of the most popular platforms in the world at a considerably lower cost in lots of common cases. That makes the trade-off more slanted than folks probably intend when they hold up “portability” as a universally good thing. Well, if we’re going to do that, we should acknowledge that there is a very large world beyond POSIX and embedded, and that world is primarily Windows.
For the record, if I were writing desktop GUI applications or games, of course the only relevant platforms are Windows, Linux, and MacOSX. Or Android and iOS, if the application is meant for palmtops. From there “portability” just means I chose middleware that have backends for all the platforms I care about. Rust, with its expanded standard library, does have an edge.
If however I’m writing a widely applicable library (like a crypto library), then Rust suddenly doesn’t look so good any more. Because I know for a fact that many people still work on platforms that Rust doesn’t support yet. Not to mention the build dependency on Rust itself. So either I still use C, and I have more reach, or I use Rust, and I have more safety (not by much if I test my C code correctly).
Of course, my C library is also going to target Windows. Not doing so would defeat the point.
I don’t think I strongly disagree with anything here. It’s just when folks say things like this
I would say, “welllll I’m not so sure about that, because I can reach a lot more people with less effort using Rust than I can with C.” Because if I use C, I now need to write my own compatibility layer between my application and the OS in order to support a particularly popular platform: Windows.
And of course, this depends on your target audience, the problem you’re solving and oodles of other things, as you point out. But there’s a bit of nuance here because of how general the word “portable” is.
Yeah, I was really talking about I/O-free libraries. I believe programs should be organised in 3 layers: I/O-free libraries at the bottom, middleware that performs the actual I/O in the middle, and the application at the top.
I believe C is still a strong contender for the bottom layer. There specifically, it is still the king of portability. For the middleware and the top layer however, the portability of the language means almost nothing, so it’s much harder to defend using C there.
Also note that the I/O free libraries can easily be application specific, and not intended for wider distribution. In that case, C also loses its edge, as (i) portability matters much less, and (ii) it’s still easier to use a safe language than write a properly paranoid test suite.
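A minimal sketch of what that layering can look like in practice (the function and program are hypothetical, just to show the shape): the bottom layer is pure computation and trivially portable, and only the thin top layer touches the OS:

```c
#include <stddef.h>
#include <stdio.h>

/* Bottom layer: no I/O, no allocation, no OS calls -- portable
   anywhere a C compiler exists, and easy to test exhaustively. */
size_t count_lines(const char *buf, size_t len) {
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        if (buf[i] == '\n') n++;
    return n;
}

/* Top layer: all the platform-specific, I/O-touching glue. */
int main(void) {
    char buf[4096];
    size_t len = fread(buf, 1, sizeof buf, stdin);
    printf("%zu\n", count_lines(buf, len));
    return 0;
}
```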
So does C#. Targeting popularity does not make you portable.
I was talking about “runs on a 16-bit microcontroller as well as a 64-bit monster”. The kind where you might not have any I/O, or even a heap allocator. The kind where you avoid undefined behaviour and unspecified behaviour and implementation-defined behaviour.
Hard, but possible for some programs. Crypto libraries (without the RNG) for instance are pure computation, and can conform to that highly restricted setting. I’ll even go a bit further: I think over 95% of programs can be pure computation, and be fully separated from the rest (I/O, system calls, networking and all that).
If you want to print stuff on a terminal, portability drops. If you want to talk to the network or draw pixels on the screen, portability in C is flat out impossible, because the required capabilities aren’t in the standard library. I hear Rust fares far better in that department.
OpenBSD has Tier 3 support for Rust, which basically means no support. You can assess for yourself how “niche” OpenBSD really is, especially in a security context.
Yes, I would absolutely say OpenBSD is a niche platform. I generally don’t care if the stuff I write works on OpenBSD. I just don’t. I care a lot more that it runs on Windows though. If more people used OpenBSD, then I’d care more. That’s the only reason I care about Windows. It’s where the people are.
Niche doesn’t mean unimportant.
To pop up a level, I was making a very narrow point on a particular choice of wording. Namely, that “Rust isn’t as portable as C” is glossing over some really significant nuance depending on what you’re trying to do. If all you’re trying to do is distribute a CLI application, then Rust might not let you target as many platforms as easily as C, but it might let you reach more people with a lot less effort.
What in the world makes you think they haven’t considered that question and concluded that C, for all its shortcomings, was nonetheless their best option?
If the conclusion is C, their thinking is wrong.
Porting Rust to a platform sounds more achievable than writing correct software in C, so the only thing ridiculous is that people think “I haven’t ported it” is a valid excuse.
Lots of people. Search for “heartbleed C” or “heartbleed memory safety” or “heartbleed rust”.
They are not credible to me if they make such absurd claims, but they exist in very large numbers. They won’t claim that all problems would go away, but they all point out that heartbleed wouldn’t have happened if openssl had been written in Rust (for example). Yes, there are hundreds of such claims on the web. Thousands, probably. As if a basic error like the one that led to heartbleed could only take the form of a memory safety problem.
As for credibility: I don’t find your definition very useful. There is a lot of software used by millions, much of it genuinely useful, that is still badly engineered. I don’t think popularity is a good indicator of credibility.
Can you actually show me someone who is claiming that all security vulnerabilities will be fixed by using Rust or some other memory safe language that would meet your standard of credibility if it weren’t for that statement itself?
I tried your search queries and I found nobody saying or implying something like, “using a memory safe language will prevent all CVEs.”
This is a very specific claim though. For the sake of argument, if someone were wrong about that specific case, that doesn’t mean they are conflating memory safety for all security vulnerabilities. That’s what I’m responding to.
So propose a new one? Sheesh. Dispense with the pointless nitpicking and move the discussion forward. My definition doesn’t require something to be popular. I think I was pretty clear in my comment what I was trying to achieve by using the word “credible.” Especially after my edit. I even explicitly said that I was trying to use it in a very broad sense. So if you want to disagree, fine. Then propose something better. Unless you have no standard of credibility. In which case, I suppose we’re at an impasse.
I never made such a claim. You keep insisting on the whole “prevent all CVEs” thing. That is an extreme position that I never took, nor did anyone else in this thread. If you take it to that extreme, then sure, you are right. I never claimed that people say that Rust will magically do their laundry either. Please, let’s keep the discussion at a reasonable level so it stays fruitful.
FWIW, for “heartbleed rust”, Google returns this on the first page:
All these are absurd. It is not my point to shame or blame. I have no idea who the author of the offending heartbleed code is. And in all honesty, we have all made mistakes. But let’s not take relativism to the absurd. Let’s be clear: it was objectively very poorly written code with a trivial error. A seasoned engineer should look at that and immediately see the problem. If you think that level of quality is less problematic when you use a ‘safer’ language, you are in for very bad surprises. It was objectively bad engineering, nothing less. The language had nothing to do with it. A problem of the same severity would have the same probability of occurring in Rust; it would just take another form. The claims in the titles I quoted from my Google search are silly. If you jump from a plane without a parachute, you also prevent the whole class of accidents that happen when the chute opens up. I am sure people understand that that is a silly claim.
Again, no one is claiming that the Rust community is conflating memory safety with “all security vulns”. I have no clue where you got that from. But to the point: it is as specific as it is pointless, as is the parachute example.
I didn’t say you did. But that’s the claim I’m responding to.
I didn’t, no. I was responding to it by pointing out that it’s a straw man. It sounds like you agree. Which was my central point.
No, they did:
The rest of your comment is just so far removed from any reality that I know that I don’t even know how to engage with it. It sounds like you’re in the “humans just need to be better” camp. I’m sure you’ve heard the various arguments about why that’s not a particularly productive position to take. I don’t have any new arguments to present.
Specifically, I (sort of) claimed that and expanded upon it here. And I’m just going to re-emphasise what I said in that comment.
I see from your profile that you’re a member of the Rust library team – I imagine most of the interactions you have within the Rust community are with people who are actively involved in building the Rust environment. That is, people who have the expertise (both with Rust and with other aspects of software development), the skill, and the free time to make substantial contributions to a game-changing technology, and who are therefore extremely unlikely to claim anything of that sort.
So I understand why this looks like a straw man argument to you – but this is not the Rust-related interaction that many people have. I was gonna say “most” but who knows, maybe I just got dealt a bad hand.
Most of “us” (who don’t know/use Rust or who, like me, don’t use it that much) know it via the armchair engineering crowd that sits on the sides, sneers at software like the Linux kernel and casually dismisses it as insecure just for being written in C, with the obvious undertone that writing it in Rust would make it secure. Like this or like this or like this.
They’re not dismissing it as memory unsafe and the undertone isn’t that (re)writing it in Rust would plug the memory safety holes. When they propose that some 25 year-old piece of software be rewritten in Rust today, the idea, really, is that if you start today, you’ll have something that’s more secure, whatever the means, in like one or two years.
That’s why there are people who want RIIR flags. Not to avoid useful discussion with knowledgeable members of the Rust community like you, but to avoid dismissive comments from the well ackshually crowd who thinks about something for all of thirty seconds and then knows exactly where and why someone is wrong about a project they’ve been working on for five years.
Not necessarily. It depends on the day. I am also on the moderation team. So I tend to get a narrow view on library matters and a very broad view on everything else. But I also frequent r/rust (not an “official” Rust space), in addition to HN and Lobsters.
Certainly. I am under no illusion about that. I don’t think you or anyone was saying “core Rust engineers have made ridiculous claim Foo.” That’s why I was asking for more data. I wanted to hear about any credible person who was making those claims.
FWIW, you didn’t just do this with Rust. You kinda did it with Java too in another comment:
I mean, like, really? All problems? I might give you that there were maybe some marketing materials that, by virtue of omission, gave that impression that Java solved “all problems.” But, a “general expectation”? I was around back then too, and I don’t remember anything resembling that.
But, like, Java did solve some problems. It came with some of its own, not all of which were as deeply explored as they are today.
See, the thing is, when you say hyperbolic things like this, it makes your side of the argument a lot easier to make. Because making this sort of argument paints the opposing side as patently ridiculous, and this in turn removes the need to address the nuance in these arguments.
Again. If you see someone with a misconception like this—they no doubt exist—then kindly point it out. But talking about it as a sort of general phenomenon just seems so misguided to me. Unless it really is a general phenomenon, in which case, I’d expect to be able to observe at least someone building software that others use on the premise that switching to Rust will fix all of their security problems. Instead, what we see are folks like the curl author making a very careful analysis of the trade offs involved here. With data.
RE https://news.ycombinator.com/item?id=25921917: Yup, that’s a troll comment from my perspective. If I had seen it, I would have flagged it.
RE https://news.ycombinator.com/threads?id=xvilka: I can’t tell if they’re a troll, but they definitely post low effort comments. I’d downvote most of them if I saw them. I downvote almost any comment that is entirely, “Why didn’t you write it in X language?” Regrettably, it can be a legitimate question for beginners to ask, since a beginner’s view of the world is just so narrow, nearly by definition.
RE https://news.ycombinator.com/item?id=26398042: Kinda more of the above.
I note that the first and third links you gave were downvoted quite a bit. So that seems like the system is working. And that there aren’t hordes of people secretly in favor of comments like that and upvoting them.
FWIW, I don’t recognize any of these people as Rust community members. Or rather, I don’t recognize their handles. And as a moderator, I am at least passively aware of pretty much anyone that frequents Rust spaces. Because I have to skim a lot of content.
I’m not sure why we have descended into an RESF debate. For every RESF comment you show, I could show you another anti-RESF Lobsters’ comment.
It’s just amazing to me that folks cannot distinguish between the zeal of the newly converted and the actual substance of the idea itself. Like, look at this comment in this very thread. Calling this post “RIIR spam,” even though it’s clearly not.
But that’s very different than what you said before. It doesn’t necessarily conflate memory safety with security. That’s a more nuanced representation of the argument and it is much harder to easily knock down (if at all). It’s at least true enough that multiple groups of people with a financial stake have made a bet on that being true. A reasonable interpretation of “more secure” is “using Rust will fix most or nearly all of the security vulnerabilities that we have as a result of memory unsafety.” Can using Rust also introduce new security vulnerabilities unrelated to memory safety by virtue of the rewrite? Absolutely. But whether this is true or not, and to what extent, really depends on a number of nuanced factors.
The “well ackshually” crowd exists pretty much everywhere. Rust perhaps has a higher concentration of them right now because it’s still new. But they’re always going to be around. I’ve been downvoting and arguing with the “well ackshually” crowd for years before I even knew what Rust was.
If you see a dismissive comment that isn’t contributing to the discussion, regardless of whether it’s about RIIR or not, flag it. I have absolutely no problem with folks pointing out low effort RESF comments that express blind enthusiasm for a technology. Trade offs should always be accounted for. My problem is that the anti-RESF crowd is not doing just that. They are also using it as a bludgeon against almost anything that involves switching to Rust. This very post, by Curl’s author, is not some low effort RESF bullshit. (I should say that not all RESF bullshit is trolling. I’ve come across a few folks that are just new to the world of programming. So they just don’t know how to see the nuance in things yet, even if it’s explicitly stated. There’s only so much novel signal a brain can take in at any point. Unfortunately, it’s difficult to differentiate between sincere but misguided beginners and trolls. Maybe there are people other than trolls and beginners posting RESF bullshit, but I don’t actually know who they are.)
Either way, if we get RIIR flags, then we should get anti-RIIR flags. See where that leads? Nowhere good. Because people can’t seem to differentiate between flippant comments and substance.
Sorry I got a bit ranty, but this whole thread is just bush league IMO.
No, I’m aware of that. But the RESF bullshit posters have a bit of a history with Curl: https://daniel.haxx.se/blog/2017/03/27/curl-is-c/ .
I try to keep away from these bush league threads myself but, uh, sometimes you just go with it, and this was one of those cases precisely because of that context.
I’ve been slowly trying to nudge people into using Rust on embedded systems ever since I gave up on Ada, so I’m not trying to dismiss it, I have quite an active interest in it. Yet I’ve been at the receiving end of “you should rewrite that in a safe language” many times, too, like most people writing firmware. And I don’t mean on lobste.rs (which has a level-headed audience, mostly :P), I mean IRL, too. Nine times out of ten these discussions are bullshit.
That’s because nine times out of ten they’re not carried out with people who are really knowledgeable about Rust and firmware development. E.g. I get long lectures about how it’ll vastly improve firmware reliability by eliminating double frees and dangling pointers. When I try to point out that, while this is true in general, and Rust’s memory model is genuinely helpful in embedded systems (the borrow checker is great!), this particular problem is a non-issue for us because all allocations are static and we never get a double free because we don’t even malloc – I get long lectures about how two years from now everything will be AArch64 anyway and memory space won’t be an issue.
(Edit: to be clear – I definitely don’t support “RIIR” flags or anything of the sort, and indeed “the system” works, as in, when one of the RIIR trolls pop up, they get downvoted into oblivion, whether they’re deliberately trolling or just don’t know better. I’m just trying to explain where some of the negativity comes from, and why in my personal experience it’s often try to hold back on it even when you actually like Rust and want to use it more!)
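For anyone who hasn’t worked on firmware, here is a minimal sketch of the static-allocation style described above (all names are illustrative, not from any real codebase): there is no malloc or free anywhere, so double-free and use-after-free have nothing to attach to.

```c
#include <stdint.h>
#include <stddef.h>

#define MAX_PACKETS 8

typedef struct {
    uint8_t payload[64];
    size_t  len;
    uint8_t in_use;
} packet_t;

static packet_t packet_pool[MAX_PACKETS];  /* all storage is static */

static packet_t *packet_acquire(void) {
    for (size_t i = 0; i < MAX_PACKETS; i++) {
        if (!packet_pool[i].in_use) {
            packet_pool[i].in_use = 1;
            packet_pool[i].len = 0;
            return &packet_pool[i];
        }
    }
    return NULL;  /* pool exhausted: handled explicitly, at compile-sized cost */
}

static void packet_release(packet_t *p) {
    p->in_use = 0;
}
```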
Oh, yeah, that was my first exposure to hype, and it gave me a healthy dose of skepticism towards tech publications. I got to witness that as a part of the (budding, in my part of the world) tech journalism scene (and then, to some degree, through my first programming gigs). The cycle went basically as follows:
There were lots of talks and articles and books on Java between ’95 and ’97-‘98 (that was somewhat before my time but that’s the material I learned Java from later) that always opened with two things: it’s super portable (JVM!) and there are no pointers, so Java programs are less likely to crash due to bad memory accesses and are less likely to have security problems.
These were completely level-headed and obviously correct. Experienced programmers got it and even those who didn’t use Java nonetheless emulated some of the good ideas in their own environments.
Then 2-4 years later we got hit by all the interns and lovely USENET flamers who’d grown up on stories they didn’t really understand about Java and didn’t really qualify these statements.
So then I spent about two years politely fending off suggestions about articles on how net appliances and thin clients are using Java because it’s more secure and stable, on why everyone is moving to Java and C++ will only be used for legacy applications and so on – largely because I really didn’t understand these things well enough, but man am I glad I let modesty get the better of me. Most of my colleagues didn’t budge, either, but there was a period during which I read a “Java programs don’t crash” article every month because at least one – otherwise quite respectable – magazine would publish one.
Aye. Thanks for sharing. I totally get your perspective. Talking about your projects with folks only to have to get into an exhausting debate that you’ve had umpteen times already is frustrating. Happens to me all the time too, for things outside of Rust. So I know the feeling. It happens in the form of, “why didn’t you do X instead of Y?” Depending on how it’s phrased, it can feel like a low brow dismissal. The problem is that that line of questioning is also a really great way of getting a better understanding of the thing you’re looking at in a way that fits into your own mental model of the world. Like for example, at work I use Elasticsearch. If I see a new project use SOLR, I’m going to be naturally curious as to why they chose it over Elasticsearch. I don’t give two poops about either one personally, but maybe they have some insight into the two that I don’t have, and updating my mental model would be nice. The problem is that asking the obvious question comes across as a dismissal. It’s unfortunate. (Of course, sometimes it is a dismissal. It’s not always asked in good faith. Sometimes it’s coupled with a healthy dose of snobbery, and those folks can just fuck right off.)
It’s harder to do IRL, but the technique I’ve adopted is that when someone asks me questions like that, I put about as much effort into the response as they did the question. If they’re earnest and just trying to understand, then my hope is that they might ask more follow up questions, and then it might become a nice teaching moment. But most of the time, it’s not.
I guess I would just re-iterate that my main issue with the anti-RIIR crowd is that it’s overbroad. If it were just some groaning about trolls, then fine. But it’s brought up pretty much any time Rust is brought up, even if bringing Rust up is appropriate.
But I suppose that’s the state of the Internet these days. Tribes are everywhere and culture wars can’t be stopped.
What form would it take? Would it take the “all private keys in use on the internet can be leaked” form? Probably not, I think?
Anyway, let’s forget about security for a minute; why wouldn’t you want the computer to automate memory management for you? Our entire job as programmers is to automate things, and the more things are automated the better, as it’s less work for us. This is why we write programs and scripts in the first place.
Traditionally, automated memory management has come with some trade-offs (e.g. runtime performance hits due to GC), and Rust attempts to find a solution which automates things without these drawbacks. This seems like a good idea to me, because it’s just more convenient: I want the computer to do as much work for me as possible; that’s its job.
Back to security: if I have a door that can be securely locked by just pressing a button vs. a door that can be securely locked by some complicated procedure, then the first door is more secure, as it’s easier to use. Sooner or later people will invariably make some mistake in the second door’s procedure. Does that mean the first door guarantees security? No, of course not. You might forget to close the window, or you might even forget to press that button. But it sure reduces the number of things you need to do to lock up securely, and the chances you get it right are higher.
Kinda sorta. PHP is/was largely written in C itself and IIRC had its share of memory-related security bugs from just feeding naughty data to PHP standard library functions.
I don’t know what PHP is like today from that point of view.
So, I take issue when you say:
I still think it’s completely justifiable to be skeptical of a program written in C. Just because another language may also be bad/insecure/whatever does not invalidate the statement or sentiment that C is a dangerous language that honestly brings almost nothing to the table in the modern era.
Interestingly, there’s a different language from a very similar time as C, that has more safety features, and its name is Pascal. And there was a time when they were kinda competing, as far as I understood. This was maybe especially at the time of Turbo C and Turbo Pascal, and then also Delphi. Somehow C won, with the main argument being, I believe, “because performance”; at least that’s how I remember it. My impression is that quite often, when faced with a performance vs. security choice, “the market” chooses performance over security. I don’t have hard data as to whether code written in Pascal was more secure than that in C; I’d be curious to see some comparison like that. I have a purely anecdotal memory that when I felt some software was remarkably stable, it often turned out to be written in Pascal. Obviously it was still totally possible to write programs with bugs in Pascal; I think Delphi code had some characteristic kind of error messages that I saw often enough to learn to recognize them. Notably, it also actually still required manual memory management – but I believe it was better guarded against buffer overruns etc. than C.
I thought the reasons for C’s wide adoption were Unix and the university system, i.e., the universities were turning out grads who knew Unix and C. I’ve only heard good things about the performance of Turbo Pascal.
Pascal is safer, but it was certainly possible to write buggy Pascal. Back in the early 90s I hung out on bulletin boards and played a lot of Trade Wars 2002. That was written in Turbo Pascal, and it had a few notable and widely exploited bugs over the years. One such was a signed overflow of a 16-bit integer. I won a couple Trade Wars games by exploiting those kinds of bugs.
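For anyone who hasn’t met this class of bug, here’s its shape, sketched in C (Trade Wars was Turbo Pascal, whose Integer type was likewise a 16-bit signed value, so the wraparound behaves the same way):

```c
#include <stdint.h>
#include <stdio.h>

int main(void) {
    int16_t credits = 32767;         /* INT16_MAX */
    credits = (int16_t)(credits + 1);
    /* On every mainstream (two's-complement) platform this wraps
       to -32768: a huge positive balance flips sign -- exactly the
       kind of thing players could exploit. */
    printf("%d\n", credits);
    return 0;
}
```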
You are arguing that eliminating one class of bugs doesn’t make sense, because there are other classes of bugs? That reminds me of the mental gymnastics of untyped language proponents.
This cartoon deserves an entire blog post! But I’ll just list out the gymnastic routines of statically-typed language proponents:
I think no one cares about this.
Well, not when they only spend 3 minutes per link. You might have to read and reflect.
The thing is that you are completely missing the point (and your kind of nerd-contrarianism doesn’t make you look as smart as you think it does).
Oh! I don’t do this to look smart. I do this because I see in you the same tribalism that I once had, and I only grew past it because of evidence like the links I shared with you. I’m not trying to say that static typing is wrong; I’m trying to expand and enrich your knowledge about type theory. Once you’ve reached a certain altitude and vantage, then you’ll see that static and dynamic typing are not tribes which live in opposition, but ways of looking at the universal behaviors of computation.
Please, read and reflect. Otherwise this entire thread was off-topic: Your first post is not a reply to its parent, but a tangent that allowed you to display your tribal affiliation. I don’t mind being off-topic as long as it provides a chance to improve discourse.
The (popular) mistake you are making is that you pretend things are equal when they are not, just like shitting your pants (untyped) and trying to not shit your pants (typed) are not positions with similar merit.
No, those are all serious refutations of the claim that statically-typed languages are a panacea, and they are a lot more insightful than a silly comic someone made to find affirmation among their Twitter followers.
Can you show me where I claimed that “typed languages [are] a panacea”?
Anyway, have fun stomping on that strawman.
Here are the only economically viable solutions I see to the “too much core infrastructure has recurring, exploitable memory unsafety bugs” problem (which exists mainly because of C):
Manually rewriting code in Rust or any other language is NOT on that list. It would be nice but I think it’s a fantasy. There’s simply too much code, and too few people to rewrite it.
Moreover with code like bash, to a first approximation there’s 1 person who understands it well enough to rewrite it (and even that person doesn’t really understand his own code from years ago, and that’s about what we should expect, given the situation).
Also, the most infamous bash vulnerability (ShellShock) was not related to memory unsafety at all. Memory unsafety is really a subset of the problem with core infrastructure.
Sometime around 2020 I made a claim that in 2030 the majority of your kernel and your browser will still be in C or C++ (not to mention most of your phone’s low level stack, etc.).
Knowing what the incentives are and the level of resources devoted to the problem, I think we’re still on track for that.
I’m honestly interested if anyone would take the opposite side: we can migrate more than 50% of our critical common infrastructure by 2030.
This says nothing about new projects written in Rust of course. For core infrastructure, the memory safety + lack of GC could make it a great choice. But to a large degree we’ll still be using old code. Software and especially low level infrastructure has really severe network effects.
I agree with you. It’s just not going to happen that we actually replace most of the C code that is out there. My hope, however, is that C (and C++) becomes the next COBOL in the sense that it still exists, there are still people paid to work on systems written in it, but nobody is starting new projects in it.
Along the same lines as your first bullet point, I think a big step forward – and the best “bang for our buck” – will be people doing fuzz testing on all of these old C projects. There was just recently a story making the rounds here and on HN about some bug in… sudo? maybe? that led to the revelation that the commit that caused the regression was a fix for a bug for which there was no test, before or after the change. So not only did the change introduce a bug that wasn’t caught by a test that didn’t exist – we can’t even be sure that it really fixed the issue it claimed to, or that we really understood the issue, or whatever.
My point is that these projects probably “should” be rewritten in Rust or Zig or whatever. But there’s much lower hanging fruit. Just throw these code bases through some sanitizers, fuzzers, whatever.
Yeah that’s basically what OSS Fuzz has been doing since 2016. Basically throwing a little money at projects to integrate continuous fuzzing. I haven’t heard many updates on it but in principle it seems like the right thing.
https://github.com/google/oss-fuzz
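For a sense of how little code such an integration needs, here is a minimal libFuzzer-style harness of the kind OSS-Fuzz runs continuously (`parse_header` is a hypothetical stand-in for whatever input-handling function you want to exercise):

```c
#include <stdint.h>
#include <stddef.h>

/* The function under test -- hypothetical; in a real project this
   would be one of the library's own parsing entry points. */
int parse_header(const uint8_t *buf, size_t len);

/* libFuzzer calls this with mutated inputs; crashes and sanitizer
   reports become findings. Build with something like:
   clang -g -fsanitize=fuzzer,address harness.c parse.c */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    parse_header(data, size);
    return 0;
}
```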
The obvious question is what percent of curl’s vulnerabilities could be found this way. I googled and found this:
https://github.com/curl/curl-fuzzer
which doesn’t look particularly active, and doesn’t seem to show any results (?).
Rather than talk about curl and “RIIR” (which isn’t going to happen soon even if the maintainer wants it to), it would be better to talk about if curl is doing everything it can along the other lines.
Someone mentioned that curl is the epitome of bad 90’s C code, and I’ve seen a lot of that myself. There is a lot of diversity in the quality of C code out there, and often the sloppiest C projects have a lot of users.
Prominent examples are bash, PHP, Apache, etc. They code fast and sloppy and are responsive to their users.
There’s a fundamental economic problem that a lot of these discussions are missing. Possible/impossible or feasible/infeasible is one thing; whether it will actually happen is a different story.
Bottom line is that I think there should be more talk about projects along these lines, more talk about sandboxing and principle of least privilege, and less talk about “RIIR”.
Nah. This kinda assumes that the rewrite/replacement/whatever would happen due to technical reasons.
It certainly wouldn’t. If a kernel/library/application gets replaced by a safer implementation, it’s for business reasons, where it just happens that the replacement is written in e.g. Rust.
So yes, I fully expect that a certain amount of rewrites to happen, just not for the reasons you think.
I mean, there is no silver bullet, right? We can all agree with that? So, therefore, “just apply more sagacious thinking” isn’t going to fix anything just as “switch to <Rust|D|C#|&C>” won’t? The focus on tools seems to miss the truth that this is a human factors problem, and a technological solution isn’t going to actually work.
One of curl’s main advantages is its ubiquity; I can run it on an OpenWRT router, a 32-bit ARMv7 OpenBSD machine, a POWER9 machine, and even Illumos/OpenIndiana. It’s a universal toolkit. It also runs in extremely constrained and underpowered environments.
Do you know of a memory-safe language that fits the bill (portability and a tiny footprint)? Rust fails on the former and Java fails on the latter. Go might work (gccgo and cgo combined have a lot of targets and TinyGo can work in constrained environments), but nowhere as well as C.
There is Java for smartcards…
Nim, ATS.
How would you deal with plan interference? The linked paper requires an entirely new language in order to even talk about this class of bugs!
When somebody pretends that C is fine for security-critical software because “anybody as smart as me doesn’t write ${x} bugs”, there’s no point arguing with them. Their logic is correct, the problem is that their assumptions are wrong; nobody is as good at programming as they think they are.
I don’t disagree that curl would be better off in rust, but curl is really a shining example of 90s super bloaty/legitimately terrible C code. Any rewrite would eliminate half its vulns (probably more), including a rewrite in C.
Bloat:
More code = more bugs, and 160k loc to do not much more than

```
connect(); use_tls_library(); write(); read(); close();
```

is insane. You can implement a 90%-use-case HTTP client in < 100 lines without golfing. TLS, the Location header, keep-alive and websockets make that 99%+ and are also fairly straightforward.

Then let’s pick a file at random and take a look: connect.c
Finally let’s take a look at the API: curl_easy_getinfo
For whatever reason they folded like 20 functions into a single varargs function, so nothing is typechecked, including pointers that the function writes to. So you use `int` instead of `long` to store the HTTP response code by accident, and curl trashes 4 bytes of memory, and by definition the compiler can’t catch it.

I put curl in the same box as openssl a long time ago. Extremely widely used networking infrastructure, mostly written by one guy in their free time 20 years ago, and kneecapped by its APIs. Kinda surprised it didn’t get any attention during the heartbleed frenzy.
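Here is what that trap looks like in practice – a sketch using the real libcurl API, with the bug left in deliberately:

```c
#include <curl/curl.h>
#include <stdio.h>

/* Build with: cc demo.c -lcurl */
int main(void) {
    CURL *h = curl_easy_init();
    if (!h) return 1;
    curl_easy_setopt(h, CURLOPT_URL, "https://example.com/");
    if (curl_easy_perform(h) == CURLE_OK) {
        int code = 0;  /* BUG: CURLINFO_RESPONSE_CODE expects a long* */
        curl_easy_getinfo(h, CURLINFO_RESPONSE_CODE, &code);
        /* Compiles without a peep (varargs), but on an LP64 platform
           curl writes 8 bytes through a pointer to a 4-byte int,
           trashing whatever sits next to it on the stack. */
        printf("HTTP %d\n", code);
    }
    curl_easy_cleanup(h);
    return 0;
}
```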
This is really unfair; running `curl --help` will show you what else curl can do other than just making an HTTPS request. Regardless of whether that makes sense, you can use curl to send and receive emails!!! In an older project, I remember we tried many approaches in order to send and receive emails reliably, talking to a variety of email servers with their particular bugs and quirks, and shelling out to curl turned out to be a very robust method…

Indeed it can, so when you build it as a library you have to use configure flags like

```
--enable-static --disable-shared --disable-ftp --disable-file --disable-ldap --disable-ldaps --disable-rtsp --disable-proxy --disable-dict --disable-telnet --disable-tftp --disable-pop3 --disable-imap --disable-smb --disable-smtp --disable-gopher --disable-manual --disable-libcurl-option --enable-pthreads --disable-sspi --disable-crypto-auth --disable-ntlm-wb --disable-tls-srp --disable-unix-sockets --disable-cookies --without-pic --without-zlib --without-brotli --without-default-ssl-backend --without-winssl --without-darwinssl --without-ssl --without-gnutls --without-polarssl --without-cyassl --without-wolfssl --without-mesalink --without-nss --without-axtls --without-ca-bundle --without-ca-path --without-ca-fallback --without-libpsl --without-libmetalink --without-librtmp --without-winidn --without-libidn2 --without-nghttp2
```
I interpreted the parent comment’s point about 160k LOC as targeting the fact that most uses of curl hit that narrow code path – and therefore most of that 160k LOC is around lesser-used features.
Because curl is semi-ubiquitous, or at least has a stable enough interface and is easy to download and install without major dependency issues, it ends up being used all over the place and relied upon in ways that will never be fully understood.
It’s a great tool, and has made my life so much easier over the years for testing and exploring, but perhaps it’s time for a cut-down tool that does only the bare minimum required for the most common curl use case(s), giving us a way to get the functionality with less risk.
edit: Of course there had to be many such tools already in existence! Here’s one that’s nice and small: https://gitlab.com/davidjpeacock/kurly
This does seem to confirm something I’ve been thinking about lately while making my own PL, which is that array/slice handling should probably be a built-in feature of all languages, since it is so often what we’re doing when handling I/O, and it is error-prone. And not just the concept, but an appropriate selection of utility functions for all of the use-cases one could have with arrays/slices in terms of copying/moving/whatnot. I wouldn’t say that Rust did a stellar job of that last one, since its pace of adding library functions is glacial at best. (Professional Rust programmer, btw)
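In C terms, the idea might look something like this minimal sketch (the type and helper names are hypothetical, just to illustrate the shape of it):

```c
#include <stddef.h>
#include <string.h>
#include <stdbool.h>

/* A slice carries its length with it, instead of the bare
   pointer + separate count that makes C I/O code so error-prone. */
typedef struct {
    unsigned char *ptr;
    size_t len;
} slice_t;

/* Checked copy: refuses instead of silently overflowing dst. */
static bool slice_copy(slice_t dst, slice_t src) {
    if (src.len > dst.len)
        return false;
    memcpy(dst.ptr, src.ptr, src.len);
    return true;
}
```

The language-level win is that every copy/move helper gets the bounds check for free, rather than each call site redoing the pointer arithmetic by hand.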
Some langs are going in this direction, like Alan https://docs.alan-lang.org/about_alan.html
It’s really nice when projects do this sort of historical analysis. For reference-sake, I’ve applied CWE (Common Weakness Enumeration) IDs to the categories Daniel identified. The breakdown, with CWE IDs, is:
(The curl security problems page does provide CWE IDs for each vulnerability, which is excellent)
Is there a filter on Lobsters that allows me to grep away the RIIR spam? It’s starting to get on my nerves.
This is a post by the author of cURL. I think it has a bit more credence than any random “cURL should be rewritten in Rust” post. And from the post:
Regardless of your opinion on “RIIR spam”, this isn’t it.
This article mentions rust once, and it’s really just as an example of an alternative to C. Calling this RIIR spam is missing the point.
You can filter the [rust] tag, but that doesn’t help with drive-by attacks by the RESF…
Having witnessed several “this language will make EVERYTHING secure and reliable” mass hysterias over the years (Java, several C++ standards, Go) I really think it would be a shame to filter things by the [rust] tag. It’s a legitimately neat language and it’s worth keeping up with it even if you don’t use it every day. All technologies have fanboys.
He very specifically mentions this isn’t about a rewrite…
We really need a “thing that triggers interminable C/Rust rants” tag (/snark)
Here are 249 issues with the word null in them. Randomly clicking through many of them, they look like issues that would be resolved by any other language with optionals/nicer null handling.
But that’s over 5 years. So we get this kind of error about once a week, I guess? Though lots of these seem to be really basic things, so why isn’t the tooling the article mentions catching them? Is there not a way for someone to, I dunno, layer the TypeScript type system onto C? Cuz at least TypeScript’s typechecker actually works for the most part.
That stat about bugs being present for like 7 years though… yikes. Who knows what’s out there in various tools.
This but unironically. I want it to be very normal for people to write about specific ways that C causes security vulnerabilities in programs, as this blog post’s author does. I want everyone’s immediate association of “C” in the context of programming languages to be the risk of memory-safety-related security bugs. Rust is a very reasonable C alternative for many use cases, and that’s a major reason Rust is an important programming language. But it’s more important for programmers to avoid using C, than for them to switch to Rust specifically. What I don’t want is for the entire enterprise of talking about C-related security vulnerabilities to be seen as suspect because it invites rants. Rants, in this context, are good, and they will continue to be good as long as substantial portions of the software in widespread use around us is written in C.
Reading this post made me realize some stuff…
The past couple of weeks involved trying to deal with low-level stuff in various contexts, and I ended up realizing that C does have a legit advantage over Rust (in particular) in that it kinda “just works”. You stuff some C files into a location and run compilation and linking (without needing to define a project and a bunch of noise, or needing to pull in 100 dependencies).
It has a very Python feel to me, in that you really can go wild with macros, and not worry too much about details when trying to get a thing working. It starts falling apart at the seams later, but … I dunno. Bit of a rant. Zig feels the closest to having a simple model for all of this, but I do think it’s important to have a “low expectations” systems language.
I’m currently in the middle of porting a fairly niche open-source program written in 2018-2019 from C to Rust, in part because I want to make some changes to it, and I don’t want to deal with C. I don’t have the data to tell you how many people are writing novel C programs today, as a proportion of all programmers writing any kind of program. But people are definitely doing it.
I would not claim that C compilation “just works”. Any C project complex enough to have a Makefile is complex enough to potentially have compilation and linker errors when I download it and try to compile it in my local environment. C also definitely has dependency management, in the sense that if you don’t get your system set up correctly and your `gcc` flags in the right order, your project will fail to compile with a confusing error message. I’ve debugged plenty of such build processes, and I would much rather deal with cargo in the Rust ecosystem (although I don’t want to claim that there are never issues running `cargo build` in any given local environment either – dependency management, linking, etc. are complicated processes that can go wrong in any number of ways in any language).

To be 100% clear on what I was referring to, I was mainly referring to the fixed cost of stuff like cargo. Like, yeah, you’re right, C doesn’t actually just work. Neither does Python. But I install `requests` globally in my Python env and it’s now available everywhere. I kinda like C cuz for really small things you pay very little in fixed cost. You’re basically right that for anything beyond a single file, people should really just pay the cost (unless they have no choice in the matter due to architecture issues or the like).

I think a lot about how C extensions can be built in Python. No fuss at all (you need the right headers installed, but honestly global lib installs are pretty easy). I imagine it might be possible to get similar effects in Rust with `rustc`? But I haven’t seen it. `mypyc` also works by generating a bunch of C files then building them all out, and I think a big part of it being able to work is because you are working on a file-by-file basis.

I guess this is a call to action to see if we can make Rust or variants also work this simply. And looking at the `rustc` docs, I… kinda feel like it’s possible! Will try this out in the future.

Yeah, but Python projects frequently use venvs because they depend on a version of Python or of some package that might be different from the system version, or different from what other packages on the same system use. So having a system-wide installation of requests doesn’t do me much good, since I can’t guarantee that any given Python project I want to run can make use of it (or will try to).
The even tougher one is “how much effort does it take to write a portable `cp` from scratch?” Even OpenBSD’s `cp`, which is a lot lighter than the GNU Coreutils implementation, isn’t trivial at all, and it’s seen plenty of fixes.

Looking at something like Heartbleed, it feels like if we would all just take two years of our lives and rewrite all that stuff in a language that isn’t C, things would be much better. Practical experience shows that it would take at least twice as long to even get things working well enough to consider using them in production – let alone weed out the security bugs.
https://github.com/1ma/CurlAda