Thank you for saying this. I’d hoped people would extend the basic assumption of competence toward me and understand that the bugs as presented were significantly more obscure when manifesting as odd parse behavior far removed from the source, but experience posting this article elsewhere has sadly dashed those hopes.
I totally understand. I hate hate hate bugs that manifest randomly or only long after the damage is done! My defense consists of deploying every available compiler warning, static checker, and runtime sanitizer to flush them out quickly. But sadly these are not on by default.
I see you changed the title to “Two C++ bugs I wrote.” :)
A former manager of mine once compared C++ to “a double-edged razor blade”, and while it’s gotten better since 1991 it still has that nature.
The null-termination thing is of course a legacy from C. Strings are kind of a mess in C++ and I still find myself using the char* variety at times because there are APIs that require it. I’ve just learned to be very vigilant about null-termination when doing such conversions.
The vector thing is IMHO a result of C++ defaulting to performance over safety and not providing a good compile-time toggle switch. Library calls ought to be range-checked in debug builds, with illegal behavior causing immediate termination. Every decent C++ program has build flags for debug vs release, but it’s not really built into the language and libraries — a few things like assert obey the NDEBUG macro but most of the library doesn’t. Libc++ has a little-known flag to enable range checks, and the UB sanitizer helps a lot too, and the Address Sanitizer will catch cases where the code goes off into the weeds. I consider those requirements for developing and testing C++ code.
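A minimal sketch of the kind of debug-only range check described above. The helper name debug_checked_at is hypothetical (it is not a standard library facility); it leans on assert, so the check goes away when NDEBUG is defined. The standard always-on alternative is std::vector::at(), which throws std::out_of_range instead of terminating.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of any standard library.
// In builds without NDEBUG, an out-of-range index terminates the program
// immediately at the point of misuse. With NDEBUG defined, the assert is
// compiled out and this is exactly as unchecked as operator[].
template <typename T>
const T& debug_checked_at(const std::vector<T>& v, std::size_t i) {
    assert(i < v.size() && "index out of range");
    return v[i];
}
```

Something like debug_checked_at(values, i) in place of values[i] turns a silent out-of-bounds read into an immediate abort during development, at no cost in release builds.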
I would love to switch to a safer language, I just haven’t found the ideal one for me yet! Swift seems to come closest although it’s too far from bare-metal for all use cases.
I feel like the pop_back example is a really good example of the ways C++ really suffers from how late std::optional got added to the language and how it isn’t really baked in the way it is in e.g. Rust, where such types are considered a fundamental building block. It also serves as a sign of how C++ really suffers from its poor initial conceptualization as just being C with classes. I don’t think it’s a foot gun that will ever fully go away.
Not really. pop_back() doesn’t return the value it pops, it just deletes it. So there’s no real excuse for this being UB besides “just use it right, nerd”.
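For what it’s worth, a checked wrapper is easy to sketch. The name try_pop_back is hypothetical and nothing like it exists in the standard library; this assumes C++17 for std::optional. It refuses to call pop_back() on an empty vector, which is the case that is undefined behavior.

```cpp
#include <optional>
#include <utility>
#include <vector>

// Hypothetical helper: removes the last element and returns it,
// or returns std::nullopt if the vector is empty.
template <typename T>
std::optional<T> try_pop_back(std::vector<T>& v) {
    if (v.empty()) {
        // Calling v.pop_back() here would be undefined behavior.
        return std::nullopt;
    }
    std::optional<T> last(std::move(v.back()));
    v.pop_back();
    return last;
}
```

The caller then has to look at the result before using it, which is roughly what Rust’s Vec::pop gives you by default.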
From now on I will go to tremendous lengths to only ever write low-level system code in Rust or SPARK Ada
There are tools like Infer which might be able to find problems like what you had. It’s frustrating to have to resort to such things in C++ though, as it further complicates project setup. Rust’s “it works out of the box” wins pretty hands down in that department. Despite investing so late in open source support and being so tiny, the Ada community also manages “it works out of the box”, and my experience with it has me taking seriously the papers describing productivity in the language. I do a lot of C++ work and love the language; it’s incredible the things you can do with it, but there’s also a lot of ways to make mistakes, especially in silent and unexpected ways.
UB in general in programming languages seems to avoid pigeon-holing implementations and avoids answering difficult questions about what an implementation should do. Ada has undefined behavior, they just call it “unspecified behavior.” The FSF GNAT Ada compiler does a pretty good job preventing you from shooting yourself in the foot, I only did so a few times in two years, usually when interfacing with C code. I’ve never knowingly run into issues with Rust on this point, but in general with Rust and Ada, “if it compiles, it works.”
Bug #1: the perils of null-terminated strings
Ada strings are just arrays with a bounds-checked length. Since you can mix Ada and the SPARK subset, you can do stuff like have a simple regex proved to never access out of bounds. Since Ada allows VLAs, string code appears super easy since you can just return them from functions, but it uses Latin-1 and you end up with long names like String, Wide_String, Wide_Wide_String, and really long resizable string types like Unbounded_Wide_Wide_String. I found them ridiculously easy to use, and with built-in pre/post conditions pretty difficult to screw up. The language predates Unicode; there are 3rd party libs for UTF-8.
I always feel like I’m daisy chaining power adapter plugs like an electrical traveling salesman walk when doing string code in Rust, but the UTF-8 support is great and things seem to “just work” when I get it done. The code lines end up a bit shorter than Ada too. I get it to work and then let the ecosystem tools teach me how it really should be done :)
I recently wrote a whole bunch of utf8 string processing code in Rust with a ton of index arithmetic. Ended up defining some “unit” wrapper classes to distinguish byte offsets from codepoint offsets with safe methods of converting between them. I was surprised something like that didn’t already exist; maybe it does but I couldn’t find it. It was certainly a lifesaver though: https://github.com/tlaplus-community/tlauc/blob/main/src/strmeasure.rs
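Since the thread is about C++, here is a rough sketch of the same “unit wrapper” idea in that language. This is not a port of the linked Rust file; the type and function names (ByteOffset, CodepointOffset, to_byte_offset, to_codepoint_offset) are made up for illustration, and the code assumes the input is valid UTF-8.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical unit types: distinct structs, so a byte offset cannot be
// passed by accident where a codepoint offset is expected.
struct ByteOffset { std::size_t value; };
struct CodepointOffset { std::size_t value; };

// True if b is a UTF-8 continuation byte (bit pattern 10xxxxxx).
inline bool is_continuation(unsigned char b) { return (b & 0xC0) == 0x80; }

// Byte offset at which the given codepoint starts (one-past-the-end is
// allowed). Returns std::nullopt if the text has fewer codepoints.
inline std::optional<ByteOffset> to_byte_offset(std::string_view text,
                                                CodepointOffset cp) {
    std::size_t seen = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (is_continuation(static_cast<unsigned char>(text[i]))) continue;
        if (seen == cp.value) return ByteOffset{i};
        ++seen;
    }
    if (seen == cp.value) return ByteOffset{text.size()};
    return std::nullopt;
}

// Codepoint offset for a byte offset. Returns std::nullopt if the byte
// offset is out of range or points into the middle of a codepoint.
inline std::optional<CodepointOffset> to_codepoint_offset(std::string_view text,
                                                          ByteOffset b) {
    if (b.value > text.size()) return std::nullopt;
    if (b.value < text.size() &&
        is_continuation(static_cast<unsigned char>(text[b.value]))) {
        return std::nullopt;
    }
    std::size_t count = 0;
    for (std::size_t i = 0; i < b.value; ++i) {
        if (!is_continuation(static_cast<unsigned char>(text[i]))) ++count;
    }
    return CodepointOffset{count};
}
```

The conversions are linear scans, which is fine for a sketch. The point is only that the two kinds of index have different types and every conversion can fail visibly.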
I’ve written my fair share of bugs in C/C++ as well. It happens fast and what makes them annoying is that they are only randomly observable. Also, often times they disappear when you run your code in a debugger, for some reason. I’ve learned that while the language is not really designed to aid the programmer with these bugs in any way, there is incredible tooling out there to make up for it.
Some useful tools that you could have used to track these bugs down more easily, potentially:
Running your code in Valgrind helps find a lot of issues at runtime and is simple to use. Valgrind hooks into calls such as memory allocation and it emulates your code, so it finds memory unsafety issues that would normally silently work. Another bonus is that Valgrind can emulate CPUs and their caches, so you can also use it to track how well your code performs.
Building your code with LLVM sanitizers (MemorySanitizer, AddressSanitizer, UndefinedBehaviorSanitizer). These add instrumentation at compile time. Bit more involved to set up, but they can detect things like out-of-bounds array reads and writes, undefined behaviour (branches on uninitialized variables, integer wraparound). Incredibly helpful.
For my pet open-source project Passgen, I use both of these in the CI to make sure my unit tests don’t do anything stupid. As long as you have decent test coverage, that seems to be a pretty good strategy. You can try them by running make asan, make msan or make usan in the repo, assuming you have the right dependencies.
I looked up a while ago whether std::string was required to store data null terminated, and I believe it has been since C++11, where data() and c_str() are required to return the same thing. If the string is not stored null terminated then c_str() can trigger memory allocation and invalidate a prior call to data(). Since C++11, c_str() is also declared noexcept and so may not allocate memory (it may not return null, so there is no way for it to handle allocation failure). It’s also required to be constant time, so can’t copy (in general, there’s some leeway there for short strings, but the prior point about memory allocation still applies). I’m not sure why you’d use a vector of char instead of a string for character data.
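A small sketch of those guarantees, assuming a C++11 or later standard library. The function name is_null_terminated_in_place is hypothetical; it just checks that data() and c_str() agree and that the terminator sits right after the last character.

```cpp
#include <string>

// Hypothetical helper: checks the two properties discussed above.
inline bool is_null_terminated_in_place(const std::string& s) {
    return s.data() == s.c_str()           // same pointer since C++11
        && s.c_str()[s.size()] == '\0';    // terminator right after the data
}
```

This should hold for empty strings, short strings, and strings long enough to force a heap allocation alike.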
std::string wasn’t included in the hardcoded emscripten imports, so I couldn’t use it. It certainly would have solved my problems! I suppose you could say a large contributor to that bug was not understanding the difference between std::vector<char> and std::string, particularly as they relate to the behavior of atoi. I lean more toward the interpretation that the design of these languages and their standard library lend them inevitably to this sort of misuse. I make no claim to be a genius, although I feel confident saying I’m a competent programmer. I was bit by these all the same.
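To make the atoi difference concrete: atoi keeps reading until it hits a non-digit or a null terminator, and a std::vector<char> has no terminator, so it can read past the end of the buffer. Below is a sketch of a bounded alternative using std::from_chars from C++17’s <charconv>. The helper name parse_int is hypothetical, and whether <charconv> is usable in a restricted emscripten setup like the one described is an open assumption.

```cpp
#include <charconv>
#include <optional>
#include <system_error>
#include <vector>

// Hypothetical helper: parses a leading decimal integer from a buffer that
// is NOT null-terminated. It never reads past buf.data() + buf.size().
inline std::optional<int> parse_int(const std::vector<char>& buf) {
    if (buf.empty()) return std::nullopt;
    int value = 0;
    const char* first = buf.data();
    const char* last = first + buf.size();
    const std::from_chars_result result = std::from_chars(first, last, value);
    if (result.ec != std::errc{}) return std::nullopt;  // no digits or overflow
    return value;
}
```

Unlike atoi, std::from_chars does not skip leading whitespace or accept a leading '+', and it reports failure instead of quietly returning 0.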
All bugs seem simpler after they’ve been found and clearly described, and they’re not a mystery crash among thousands of lines of code any more.
Huh. So it just exists to screw you. Great…
Welcome to the C and C++ stdlibs!