Is this a provocative thesis? “There are two types of ${tool} users: people who chose ${tool} because they liked various of its properties, or people who used ${tool} because it was their best or only option at the time” - seems true for pretty much any tool?
And the iron law applies; the folks who refine C and evolve it over time are all fans of C, rather than merely tolerating it as one possible tool in a design space.
So so true. I spent about 10 years of my life trying to find something better than C for programming stuff that needed strong control over memory. You can do it in C#, Lisp, etc but it requires incredibly detailed knowledge of the implementation.
C++? The adoption/learning curve is so shallow: for instance, you can keep writing C code but use (one or more of) std::string, std::vector and std::unique_ptr, and most of your memory management code and its bugs go away. And of course use new/delete/malloc/free if you really need to.
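To make that concrete, a minimal sketch (hypothetical names) of C-style code that just swaps manual allocation for those three types; the logic is unchanged, but the frees, and the leaks on early return, disappear:

    // C-style code with malloc/strdup swapped for std::vector,
    // std::string and std::unique_ptr (hypothetical example).
    #include <cstdio>
    #include <memory>
    #include <string>
    #include <vector>

    struct Packet { int id; };

    void process(int n) {
        std::vector<Packet> buf(n);          // was: malloc(n * sizeof(Packet)) ... free(buf)
        std::string name = "sensor-0";       // was: strdup("sensor-0") ... free(name)
        auto p = std::make_unique<Packet>(); // was: malloc(sizeof(Packet)) ... free(p)
        p->id = 42;
        std::printf("%s: %d packets, first id %d\n", name.c_str(), n, p->id);
    }   // buf, name and p are all released here, on every exit path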
After writing an OS in C++, I really hate having to go back to C. I have to write far more code in C, and (worse) I have to think about the same things all of the time. Even just having RAII saves me a huge amount of effort (smart pointers are a big part, but so is having locks released at the end of a scope). For systems code, data structure choice is critical and C++ makes it so much easier to prototype with one thing, profile usage patterns, and then replace it with something else later.
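For the lock half of that, a small sketch (made-up names, a counter guarded by a std::mutex): std::lock_guard releases the mutex at the end of the scope on every path, early return included.

    #include <mutex>

    std::mutex m;
    int counter = 0;

    bool increment_if_below(int limit) {
        std::lock_guard<std::mutex> guard(m);  // acquired here
        if (counter >= limit)
            return false;                      // released here...
        ++counter;
        return true;                           // ...and here, with no unlock calls to forget
    }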
Do you think it would be helpful to have a C w/ Lisp-style metaprogramming to implement higher-level constructs that compile down to C? And then you use only what you’re willing to pay for or put up with?
One that got into semi-usable form was ZL Language which hints at many possibilities in C, C++, and Lisp/Scheme.
Since you wanted destructors, I also found smart pointers for C, whose quality I couldn’t evaluate because I don’t program in C. It looked readable at least. There have been many implementations of OOP patterns in C, too. I don’t have a list of them, but many are on StackOverflow.
Or you could just use a language that’s widely supported by multiple compilers and has these features. I implemented a bunch of these things in C, but there were always corner cases where they didn’t work, or where they required compiler-specific extensions that made them hard to port. Eventually I realised I was just implementing a bad version of C++.
The example that you link to, for instance, uses the same attribute that I’ve used for RAII locks and for lexically scoped buffers. From the perspective of gcc, these are just pointers. If you assign the value to another pointer-typed variable, it will not raise an error, and you get a dangling pointer. Without the ability to overload assignment (and, ideally, move), you can’t implement these things robustly.
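For anyone who hasn’t hit this: a sketch of that failure mode with gcc’s cleanup attribute (hypothetical names; compiles as GNU C or GNU C++). The cleanup function runs when the annotated variable leaves scope, but nothing stops a plain pointer from aliasing it:

    #include <stdlib.h>

    static void free_buf(char **p) { free(*p); }
    #define SCOPED_BUF __attribute__((cleanup(free_buf)))

    char *oops(void) {
        SCOPED_BUF char *buf = (char *)malloc(64);
        char *alias = buf;   // no error: to gcc both are just char *
        return alias;        // buf is freed here; the caller gets a dangling pointer
    }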
C++ metaprogramming got a lot better with constexpr and the ability to use structural types as template arguments. The only thing that it lacks that I want is the ability to generate top-level declarations with user-defined names from templates.
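For instance (a C++20 sketch, made-up names): a structural class type carrying a string now works as a template argument, usable from constexpr code, which unlocked a lot of metaprogramming that used to need macros:

    #include <algorithm>
    #include <cstddef>

    template <std::size_t N>
    struct FixedString {
        char data[N]{};
        constexpr FixedString(const char (&s)[N]) { std::copy_n(s, N, data); }
    };

    template <FixedString Name>   // a class-type value, not a type
    struct Named {
        static constexpr auto name = Name;
    };

    Named<"example"> n;           // deduced as FixedString<8>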
The only thing that it lacks that I want is the ability to generate top-level declarations with user-defined names from templates.
Can you elaborate on this? I wonder if it’s related to a problem I have at work.
We have a lot of std::variant-like types, like using BoolExpr = std::variant<Atom, Conjunction, Disjunction>;. But the concise name BoolExpr is only an alias: the actual symbol names use the full std::variant<Atom, Conjunction, Disjunction>. Some of these variants have dozens of cases, so any related function/method names get reaallly long!
I think I would want a language feature like “the true name of std::variant<Atom, Conjunction, Disjunction> is BoolExpr”. Maybe this would be related to explicit template instantiation: you could declare this in bool_expr.h and it would be an error to instantiate std::variant<Atom, Conjunction, Disjunction> anywhere else.
The main thing for me is exposing things to C. I can use X macros to create a load of variants of a function that use a name and a type in their instantiations, but I can’t do that with templates alone. Similarly, I can create explicit template instantiations in a file (so that they can be extern in the header) individually, but I can’t write a template that declares extern templates for a given template over a set of types and another that generates the code for them in my module.
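A sketch of the X-macro half of that, with made-up names, stamping out named, C-callable wrappers around one template:

    #include <cstddef>

    template <typename T>
    T sum(const T *data, std::size_t n) {
        T total{};
        for (std::size_t i = 0; i < n; ++i) total += data[i];
        return total;
    }

    // One entry per (suffix, type) pair we want exposed to C.
    #define FOR_EACH_SUM_TYPE(X) \
        X(i32, int)              \
        X(f64, double)

    #define DEFINE_SUM(suffix, type)                                    \
        extern "C" type sum_##suffix(const type *data, std::size_t n) { \
            return sum<type>(data, n);                                  \
        }
    FOR_EACH_SUM_TYPE(DEFINE_SUM)   // emits C symbols sum_i32 and sum_f64
    #undef DEFINE_SUM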
The reflection working group has a bunch of proposals to address these things and I’ve been expecting them to make it into the next standard since C++17 was released. Maybe C++26…
My motivation was this. At one point, I was also considering embedded targets which only support assembly and C variants.
What do you think of my Brute-Force Assurance concept that reuses rare, expensive investments in tooling across languages?
I think platforms without C++ support are dying out. Adding a back end for your target to LLVM is cheaper than writing a C compiler, so there’s little incentive not to support C++ (and Rust). The BFA model might work, but I’d have to see the quality of the code it generated. These tools often end up triggering UB, which is a problem, or omit the kind of micro-optimisations that are critical to embedded systems, with no way to add them to the generated code.
Makes sense. Fortunately, there is more work happening for LLVM targets. Thanks for the review!
a) this started in 2002 when half that stuff didn’t exist, and
b) C++ is a hateful morass of bullshit and misdesign, and that won’t change until they start removing things instead of adding them.
Yes, I am biased. Not going to change though.
Pretty sure at least string and vector existed in 2002; not unique_ptr but you can implement that yourself in 10 minutes.
You couldn’t implement unique_ptr in 2002 with the semantics that it has today. unique_ptr requires language support for move semantics in order to give you that uniqueness promise automatically and move semantics came to C++ in 2011.
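A quick sketch of what that language support buys you: the copy constructor is deleted, and ownership can only move:

    #include <memory>
    #include <utility>

    int main() {
        auto a = std::make_unique<int>(7);
        // auto b = a;          // compile error: unique_ptr is not copyable
        auto b = std::move(a);  // OK: ownership transfers, a becomes null
        return (a == nullptr && *b == 7) ? 0 : 1;
    }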
but it requires incredibly detailed knowledge of the implementation.
Not just that - you actively need to work around the problems and limitations of the runtime. E.g. when garbage collection bogs your application down, you need to start creating object pools. Hence, you end up manually managing memory again - precisely the thing you tried to avoid in the first place. Many runtimes do not let you run the garbage collector manually or specify fine-grained garbage collection settings. In addition, an update to the runtime (which you often do not control, because it’s just whatever runtime is installed on the user’s machine) can ruin all your memory optimizations and send you back to square one, which is a heavy maintenance burden. It just doesn’t make any sense to use these languages for anything that requires fine-grained control over the execution. Frankly, it doesn’t make any sense to use these languages at all if you know C++ or Rust, unless the platform forces you to use them (like the web pretty much forces you to use JavaScript if you want to write code that is compatible with most browsers).
It’s been a long time (over a decade) since I had to deal with GC problems being noticeable at an application level. A lot of these problems disappeared on their own, as computers became faster and GC algorithms moved from the drawing boards into the data center. (I was working on software for trading systems, algo trading, stock market engines, operational data stores, and large scale distributed systems. Mostly written in Java.)
In the late 90s, GC pauses were horrendous. By the late aughts, GC pauses were mostly manageable, and we had had enough time to work around the worst causes of them. Nowadays, pauseless GC algorithms are becoming the norm.
I still work in C and C++ when necessary, just like I still repair junk around the house by hand. It’s possible. Sure, it would be far cheaper and faster to order something brand new from China, but there’s a certain joy in wasting a weekend trying to do what should be a 5 minute fix. Similarly, it’s sometimes interesting to spend 40 person years (e.g. a team of 10 over a 4 year period) on a software project in C++ that would take a team of 5 people maybe 3 months to do in Go. Of course, there are still a handful of projects that actually need to be built in C or C++ (or Rust or Zig, I guess), but that is such a tiny portion of the software industry at this point, and those people already know who they are and why they have to do what they have to do.
You said “It just doesn’t make any sense to use these languages for anything that requires fine-grained control over the execution.” But how many applications still require that level of fine grained control?
For literally decades, people have been saying that GC has now improved so much that it’s become unnoticeable, and every single time I return to try it, I encounter uncontrollable, erratic runtime behavior and poor performance. Unless you write some quick and dirty toy program to plot 100 points, you will notice it one way or another. Try writing a game in JavaScript - you still have to do object pooling. Or look at Minecraft - the amount of memory the JVM allocates and then frees during garbage collection is crazy. Show me a garbage collector and I’ll show you a nasty corner case where it breaks down.
Similarly, it’s sometimes interesting to spend 40 person years (e.g. a team of 10 over a 4 year period) on a software project in C++ that would take a team of 5 people maybe 3 months to do in Go.
Okay, I’m not a big C++ fan but this is obviously flamebait. Not even gonna comment on it further.
But how many applications still require that level of fine grained control?
A lot. Embedded software, operating systems, realtime buses, audio and video applications… Frankly, I have a hard time coming up with something I worked on that doesn’t require it. Not to mention, even if the application doesn’t strictly require it, a GC is still intrinsically wasteful, making the software run worse, especially on weaker machines. And even if we say performance doesn’t matter, using languages with GC encourages bad and convoluted design and incoherent lifetime management. So, no matter how you look at it, GC is a bad deal.
Okay, I’m not a big C++ fan but this is obviously flamebait. Not even gonna comment on it further.
I managed a large engineering organization at BigTechCoInc for a number of years, and kept track (as closely as possible) of technical projects, what languages they used, and what results they had. Among other languages we used in quantity: C, C++, Java, C#. (Other languages too, including both Python and JS on the back end, but not enough to draw any clear conclusions.)

The cost per delivered function point was super high in C++ compared to everything else (including C). C tended to be cheaper than C++ because it seemed to be used mostly for smaller projects, or (I believe) on more mature code bases making incremental changes; I think if we had tried building something new and huge in C, it might have been as expensive as the C++ projects, but that never happened. Java and C# are very similar languages, and had very similar cost levels, much lower than C or C++. While I didn’t run any Go projects, I have heard from peers that Go costs significantly less than Java for development (but I don’t know about long-term maintenance costs). One project I managed was implemented nearly simultaneously in C++, C#, and Java, which was quite illuminating.

I also compared notes with peers at Amazon, Facebook, Google, Twitter, Microsoft, eBay, NYSE (etc.), and lots of different financial services firms, and their anecdotal results were all reasonably similar to mine. The two largest code bases for us were Java and C++, and the cost with C++ was an order of magnitude greater than with Java.
Embedded software, operating systems, realtime buses, audio and video applications
Sure. Like I said: “Of course, there are still a handful of projects that actually need to be built in C or C++ (or Rust or Zig, I guess), but that is such a tiny portion of the software industry at this point, and those people already know who they are and why they have to do what they have to do.”
Or look at Minecraft - the amount of memory the JVM allocates and then frees during garbage collection is crazy.
This is absolutely true. The fact that Java works at all is a freaking miracle. The fact that it manages not to fall over with sustained allocation rates of gigabytes per second (mostly all tiny objects, too!) is amazing. That Minecraft succeeded is a bit shocking in retrospect.
Very interesting. Do you have more fine-grained knowledge about the cost per delivered function point with respect to C++? Is the additional cost caused by debugging crashes, memory leaks, etc.? Is it caused by additional training and learning or tooling and build systems? Does the usage of modern C++ idioms make a difference? Or does everything simply take longer, death by a thousand cuts?
Some more data points occurred to me. I was thinking about an old presentation I did at a few different conferences on the topic, e.g. https://www.infoq.com/presentations/Keynote-Lessons-Java-CPlusPlus-History-Cloud/
Specifically, looking at areas that Java was able to leverage:
gc (enabled cross component memory management without RAII)
simpler builds
elimination of header file complexity
binary standard for build outputs
dynamic linking as a concept well-supported by the language
good portability
a more rigidly defined type system
reflection (enabling more powerful libraries)
elimination of pointers, buffer over-runs, etc.
My thinking has evolved in the subsequent decade, but there are some key things in that list that really show the pain points in C++, specifically around the difficulty of re-using libraries and components. But the other thing that’s important to keep in mind is that the form of applications has changed dramatically over time: An app used to be a file (.bin .com .exe whatever). Then it was a small set of files (some .so or .dll files and some data files in addition to the executable). And at some point, the libraries went from being 1% of the app to 99% of the app.
Just like Java/C# ate C++’s lunch in the “Internet application” era, some “newer” platforms (the modern browser plus the phone OSs) show how ill equipped Java/C# are, although I think that stuff like React and Node (JS) are just interim steps (impressive but poorly thought out) toward an inevitable shift in how we think about applications.
Anyhow, it’s a very interesting topic, and I wish I had more time to devote to thinking about this kind of topic than just doing the day job, but that’s life.
I’m going to go into opinion / editorial mode now, so please discount accordingly.
C++ isn’t one language. It’s lots of different languages under one umbrella name. While it’s super powerful, and can do literally everything, that lack of “one true way to do everything” really seems to hurt it in larger teams, because within a large team, no subgroups end up using the same exact language.
C++ libraries are nowhere near as mature (either in the libraries themselves, or in the ease of using them randomly in a project) as in other languages. It’s very common in other languages to drag in different libraries arbitrarily as necessary, and you don’t generally have to worry about them conflicting somehow (even though I guess they might occasionally conflict). In C++, you generally get burnt so badly by trying to use any library other than boost that you never try again. So then you end up having to build everything from scratch, on every project.
Tooling (including builds) is often much slower and quite complicated to get right, particularly if you’re doing cross platform development. Linux only isn’t bad. Windows only isn’t bad. But Linux + Windows (and anything else) is bad. And compile times can be strangely bad, and complex to speed up. (A project I worked on 10+ years ago had 14 hour C++ builds on Solaris/Sparc, for example. That’s just not right.)
Finding good C++ programmers is hard. And almost all good C++ programmers are very expensive, if you’re lucky enough to find them at all. And a bad C++ programmer will often do huge harm to an entire project, while a bad (for example) Python developer will tend to only shit in his own lunchbox.
I think the “death by 1000 cuts” analogy isn’t wrong. But it might only be 87 cuts, or something like that. We found that we could systematize a lot of the things necessary to make a C++ project run well, but the list was immense (and the items on the list more complex) compared to what we needed to do in Java, or C#, etc.
This depends a lot on your baseline. It’s easier to find a good C++ programmer than a Rust programmer of any skill level. Over the last 5 years, it’s become easier to find good C++ programmers than good C programmers. It’s orders of magnitude easier to find a good Java, C#, or JavaScript programmer than a good C++ programmer and noticeably easier than finding C++ programmers of any competence level.
Embedded software, operating systems, realtime buses, audio and video applications…
Yep! In other words, almost all the things I’m most interested in!
So, no matter how you look at it, GC is a bad deal.
…Ok I gotta call you out there. :P There’s plenty of times when a GC is a perfectly fine and/or great deal. The problem is just that when you don’t want a GC, you really don’t want a GC, and most languages with a GC use it as a way to make simplifying assumptions that have not stood the test of time. I think a bright future exists for languages like Swift, which use a GC or refcounting and have a good ownership system to let the compiler optimize the bits that don’t need it.
It’s a bad deal you can sometimes afford to take when you have lots of CPU cycles and RAM to spare ;-)
Don’t get me wrong, I’m open to using any tool as long as it gets the job done reliably. I wouldn’t want to manage memory when writing shell scripts. On the other hand, the use-case for shell scripts is very narrow, I wouldn’t use them for most things. The larger the project, the more of a liability GC becomes.
It’s a bad deal you can sometimes afford to take when you have lots of CPU cycles and RAM to spare ;-)
It’s not always that clear cut. Sometimes the performance gains from being able to easily use cyclic data structures that model your problem domain and lead to efficient algorithms can significantly outweigh the GC cost.
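For example (hypothetical Node type), naive reference counting in C++ leaks the moment you build such a structure, which is exactly where a tracing GC pulls ahead:

    #include <memory>

    struct Node {
        std::shared_ptr<Node> next;  // breaking the cycle would need weak_ptr
    };

    void make_cycle() {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        b->next = a;   // refcounts can never reach zero...
    }                  // ...so both nodes leak when a and b go out of scope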
Ok, fair. :-) Hmmmm though, I actually thought of a use case where GC of some form or another seems almost inevitable: dealing with threads and/or coroutines that have complex/dynamic lifetimes. These situations can sometimes be avoided, but sometimes not, especially for long-running things. Even in Rust it’s pretty common to deal with them via “fiiiiiiine just throw the shared data into an Rc”.
Also, since killing threads is so cursed on just about every operating system as far as I can tell, a tracing GC has an advantage there in that it can always clean up a dead thread’s resources, sooner or later. One could argue that a better solution would be to have operating systems be better at cleaning up threads, but alas, it’s not an easy problem.
Am I missing anything? I am still a novice with actually sophisticated threading stuff.
dealing with threads and/or coroutines that have complex/dynamic lifetimes
The more code I write, the more I feel that having a strong hierarchy with clearly defined lifetimes and ownership is a good thing. Maybe I’m developing C++ Stockholm syndrome, but I find myself drawn to these simpler architectures even when using other languages that don’t force me to. About your point with Rc, I don’t think this qualifies as a garbage collector because you don’t delegate the cleanup to some runtime, you still delete the object inside the scope of one of your own functions (i.e. the last scope that drops the object) and thus on the time budget of your own code. Additionally, often just a few key objects/structs need to be wrapped in a std::shared_ptr or Rc, so the overhead is negligible.
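A small sketch of that timing point with std::shared_ptr (made-up Session type): destruction happens synchronously, in whichever of your own scopes drops the last reference, not later on a collector’s schedule:

    #include <cstdio>
    #include <memory>

    struct Session {
        ~Session() { std::puts("session closed"); }
    };

    int main() {
        auto outer = std::make_shared<Session>();
        {
            auto inner = outer;  // refcount 2
        }                        // refcount 1: nothing printed yet
        outer.reset();           // refcount 0: "session closed" prints exactly here
        std::puts("after reset");
    }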
Also, since killing threads is so cursed on just about every operating system as far as I can tell
Threads are supposed to be joined cooperatively, not killed (canceled). At the point of being canceled, the thread might be in any state, including inside a critical section holding a mutex. That will almost certainly lead to problems down the road. But even joining threads is cursed, because people do stuff like sleep(3), completely stalling the thread, which makes it impossible to terminate the thread cooperatively within a reasonable time frame.

The proper way for threads to wait is to wait on the thing you actually want to wait on plus a cancellation event, which is triggered if the thread needs to be joined. So you wait on two things at the same time (also see select and epoll). It’s not so much the OS that is the problem (though the OS doesn’t help, because it doesn’t provide good, simple-to-use primitives) but the programmer.

Threads should clean up their own state upon being joined. The owner of the thread, the one who called join (usually the main thread), will clean up the remains, like entries in thread lists. There should never be an ownerless thread. Threads must be able to release their resources and be stopped in a timely manner anyway, for example when the system shuts down, the process is stopped, or submodules are detached. Here, a garbage collector does not provide much help.
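For what it’s worth, C++20’s std::jthread and std::stop_token give you exactly this shape; a sketch (hypothetical worker) where the thread waits on work and a cancellation signal at the same time, so it can always be joined promptly:

    #include <condition_variable>
    #include <mutex>
    #include <queue>
    #include <stop_token>
    #include <thread>

    std::mutex m;
    std::condition_variable_any cv;
    std::queue<int> work;

    void worker(std::stop_token st) {
        std::unique_lock lock(m);
        while (true) {
            // Wakes on new work OR on a stop request; returns false on stop.
            if (!cv.wait(lock, st, [] { return !work.empty(); }))
                return;          // cancelled: clean up our own state and exit
            int item = work.front();
            work.pop();
            (void)item;          // ...process item...
        }
    }

    int main() {
        std::jthread t(worker);  // jthread hands worker its stop_token
    }                            // destructor: request_stop(), then join()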
The Go projects I’m (somewhat) involved with still very much have GC related performance issues. Big-data server stuff. Recent releases of Go have helped, though.