This pretty much seems to be D. I think Google should contribute to D instead of starting a new language.
It also seems to me that C++ -> Carbon is very similar to C++ -> D in spirit. But quality of implementation matters as well: the solution is occasionally to do the same thing, but better.
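Looking closer, it seems that Carbon is sufficiently different from D in the details. I would say that D tried to be the union of Go and Carbon, while those two separated the goals of (1) a high-level Java-like language with a simpler, more predictable, more memory-cautious, and more deployable execution model, and (2) a C++-compatible language which mostly just cuts the cruft.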
It seems to me like I could just import and link any C++ library to use it from Carbon, which is currently not so seamless with D.
There was an experiment with D taking this same approach: https://wiki.dlang.org/Calypso. Despite looking promising, its author decided not to follow through.
At least its generics are type checked - one of the more painful parts of D is the template expansion errors. I really don’t like how they are kicking the memory safety issue down the road though. :/
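Carbon can directly include C++ header files. The D documentation suggests this feature will never be added to D:
“Being 100% compatible with C++ means more or less adding a fully functional C++ compiler front end to D. Anecdotal evidence suggests that writing such is a minimum of a 10 man-year project, essentially making a D compiler with such capability unimplementable.”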
Given this, how does Google contribute the feature to D? Sounds like it’s impossible in the current D project, so it would require reimplementing D from scratch with new features the original project can’t match? Embrace and extend, a hostile takeover by Google?
If this is the only consideration, Google should write a binding generator that reads C++ headers and writes D sources with extern(C++) declarations.
I don’t think it’s the only consideration. Here’s a quote from @skyline131313 at https://github.com/carbon-language/carbon-lang/discussions/1448 discussing whether Carbon should use D.
I’ve used D for a number of years and I would not recommend it to anyone. I recently started a new project and reverted back to C++ as D has so many problems that just don’t exist in other languages. You are going to come across bugs all the time, for everything, including interpreting with C++ as the ABI is so complex it isn’t followed correctly. So you can spend hours trying to figure out what the issue is, when you get down to debugging the assembly you realize the compiler is doing something wrong.
There’s no “versioning”. Every release of the frontend is essentially it’s own separate language, meaning your code can stop working between releases (which happens regularly for my D project).
There’s no idiomatic way to do memory safety without the GC. You can’t even create generic containers that are memory safe because of the way D handles copying and moving. Their standard library has no generic containers like you would expect in basically every other language.
It inherits C semantics so it carries baggage that causes problems just so you can copy and paste C code to convert it to D.
Leadership has no concrete plan or goal, that leads to features being added like ImportC that just complicates the language while not even doing it correctly. You can “ImportC” code but you have to use a third party preprocessor. It makes no sense, and now they have a C compiler to maintain (across 3 different D compilers) on top of that.
The frontend of D, that all the compilers use, uses the GC but never collects (by default). It is a huge memory hog and can easily eat 10+ GB. Especially if you use templates (which almost everything does).
tldr; D is a good toy language for small scripts but I wouldn’t use it in a real project as it causes more headaches than it solves.
Which would make C++ still be a second-class citizen in the D ecosystem. Nothing beats having C++ libraries “just work”, the same way that C libraries have mostly “just worked” in C++. And it wouldn’t make the interoperability bidirectional; I don’t know how easy it is to make D libraries which can be used from C++, but I’d guess it’s not as seamless as the Carbon people want it to be.
Also, doesn’t D practically require a garbage collector? A garbage collected language will be very different from C++.
No forward compatibility guarantees means it is a dire mistake for any company other than Google to start writing code in this language.
Also, I highly doubt any company other than Google has the problems this is solving.
The README says that “perfect forward compatibility” is not guaranteed. Note the word “perfect”.
In practice, C++ doesn’t have perfect forward compatibility either, and Rust explicitly doesn’t guarantee perfect forward compatibility. One of the consequences of “perfect forward compatibility” is that you can’t introduce new reserved words, which has happened in both C++ and in Rust.
I think this is where this kind of thinking comes from: https://mobile.twitter.com/tituswinters/status/1188455260702027776
That’s a good point. Personally, I am ready to use a language that offers better performance than C++, but is C++ interoperable at the source level, in exchange for periodic ABI compatibility breaks. If you know this is part of the social contract for the language, you can plan for it.
The project goals mention Performance as a rationale for occasional ABI compatibility breaks and Language Evolution, the ability to correct mistakes in the language design, as a rationale for not guaranteeing perfect forward compatibility at the language level.
I was surprised when Titus said that backwards ABI compatibility was a constraint that the C++ standards impose. I’ve never heard of anyone trying to support a C++ ABI across language versions let alone compiler versions. I’ve always seen people write a C ABI if they want a stable binary interface. But I’m led to believe that in some other parts of the industry (maybe games?) it’s common to receive shared-object + headers C++ libraries.
Both GCC / libstdc++ and Clang / libc++ provide strong C++ ABI guarantees across compiler versions. C++ code compiled with GCC 4.0 and the headers from the bundled libstdc++ will happily link against code compiled with clang 15 and linked to a modern libstdc++. I think this guarantee goes back to GCC 3.0, but I’m not completely sure.
Visual Studio treats the C++ ABI as unstable, but provides COM as an opt-in stable ABI for a subset of C++ (no templates).
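The "write a C ABI if you want a stable binary interface" approach mentioned above can be sketched roughly as follows. This is only an illustration: the Counter class and the counter_* function names are hypothetical and do not come from any library discussed in this thread.

```cpp
// Hypothetical C++ class whose layout and mangled names may change
// between library releases or compilers.
class Counter {
public:
    explicit Counter(int start) : value_(start) {}
    void add(int n) { value_ += n; }
    int value() const { return value_; }
private:
    int value_;
};

// C-compatible surface: callers only ever see an opaque pointer and plain
// ints, so no C++ name mangling or class layout leaks across the boundary.
// A C header would declare `typedef struct Counter Counter;` and nothing more.
extern "C" {
    Counter* counter_create(int start) { return new Counter(start); }
    void counter_add(Counter* c, int n) { c->add(n); }
    int counter_value(const Counter* c) { return c->value(); }
    void counter_destroy(Counter* c) { delete c; }
}
```

A caller would use counter_create, counter_add, counter_value and counter_destroy in that order. A production wrapper would also need to keep exceptions from escaping through the C functions.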
They compare it to TypeScript, but they aren’t doing things that made TypeScript so successful: being close to a syntax superset and compiling to the language it replaces.
If the goal is to gradually migrate from C++, the gratuitous syntax and nomenclature changes don’t provide any value to C++ developers. They’re just a bikeshed-level annoyance for people working with both languages at the same time (e.g. the term “array” is now even more ambiguous than it was in C++).
You can paste JavaScript into a TypeScript file and vice versa with only small fixups, but you can’t do the same between Carbon and C++ even for tiny things.
Being tied to LLVM will be the same hindrance that it’s been for Rust. Even people who don’t actually need GCC and exotic platforms use this as a reason to avoid Rust just in case they’ll ever need more than a single LLVM implementation.
A plausible counter-example here is Kotlin – you could say it gratuitously changes Java’s syntax, and yet it’s quite successful. Amusingly, “You can paste JavaScript into a TypeScript file and vice versa with only small fixups,” works for Kotlin because the IDE just transparently inter-converts Java/Kotlin on paste. And my gut feeling is that that’s the UX Carbon folks are aiming at.
And that makes sense – from what I understand, making a tooling-friendly language is one of their goals. And that’s why they have to change C++ syntax: things like the lexer hack and the impossibility of parsing files in isolation and in parallel do make the tooling pretty painful.
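For readers unfamiliar with the lexer hack: the parser has to know whether an identifier names a type before it can decide what a statement means. A minimal sketch in C++ (the function names are made up for illustration) shows the same token sequence a * b parsed as a multiplication in one function and as a pointer declaration in the other:

```cpp
// Here `a` and `b` are variables, so `a * b` is a multiplication expression.
int as_expression() {
    int a = 6;
    int b = 7;
    return a * b;
}

// Here `a` names a type, so `a * b` declares `b` as a pointer to int.
int as_declaration() {
    using a = int;
    int target = 0;
    a * b = &target;  // a declaration, not a multiplication
    *b = 5;           // writes through the pointer declared above
    return target;
}
```

Because the meaning depends on earlier declarations, a tool cannot parse such a statement without first resolving names, which is part of what makes isolated, parallel parsing hard.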
In other words, JS (esp. after ES6) already has quite nice syntax; it only lacks a type system, which you can add. With C++, though, you have a problematic syntax, and you want to actively remove stuff to get a better language.
I think the second of these matters only in the context of the JavaScript ecosystem. If browsers had first-class TypeScript support then there would be much less reason to convert to JavaScript, but if you want a language that interoperates closely with JavaScript in a browser (or in Node.js or similar) then it must compile to JavaScript. C++, in contrast, lives in an ecosystem with a cross-language linkage model and so does not have this problem.
Based on my experience with LLVM-Rust it’s doable, but not great. You need to jump through extra hoops to get cross-language LTO and PGO, and these are possible only if you use clang. It can be an issue for projects using MSVC and GCC.
A new compiler is also a hassle if you’re doing game development, because consoles have their proprietary toolkits with a bunch of quirks in their OS/libc/linking, but aggressive NDAs get in the way of supporting them properly in the publicly-available compiler.
A compile-to-C++ language could be a drop-in replacement. With a new custom compiler there’s extra friction. It’s not impossible to overcome, but if you’re asking people to switch over, removing friction is important.
I doubt any of that matters for Google. They use clang internally for everything and are happy to upstream the necessary LTO bits to lld.
Lots of companies have massive bases of C++, agglomerated over decades. I’ve seen code bases that are hundreds of millions of lines. There’s a lot of demand for a modern language that can fully interoperate with that legacy code.
We’ll see if Carbon can actually achieve that, and if companies are willing to adopt it.
This is a bit off-topic, but could someone explain why it became common to use -> for return types and colons for parameter types? To me it feels like more to type and more line noise, so harder to read and otherwise really redundant. This has always made me wonder.
There are three questions here:
1. Why did we switch the order from Type var_name to var_name Type?
2. Why did we add a colon, as in var_name: Type?
3. Why, for the return type, did we go with -> Type rather than : Type?
The first one is, I think, primarily due to type inference: if you want to make the Type optional, it’s easier to parse if the type comes second. Most production parsers are left-to-right, and it’s easier to eat var_name and decide between a type or nothing than to eat something and decide whether it is a name (pattern) or a type. To give a concrete example, in a language with tuples and a hypothetical syntax where types come first, in let (T1, T2, T2) (x, y, z) you’d need unbounded lookahead to decide whether T1 should be parsed as a type or as a tuple pattern.
The third one is, I think, the weakest decision here: you could use : instead of an arrow, and, e.g., Kotlin does this. The arrow comes from mathematical notation for functions and, more directly, from the ML family of languages. Even in Kotlin, a function type is spelled (Int, Int) -> Int, with an arrow; it’s only in function declarations that the : is used: fun f(x: Int, y: Int): Int.
The second one is interesting! Indeed, why not just fn hypot(x u32, y u32) u32? I don’t know, but I have a couple of guesses / rationalizations:
- I would guess that this syntax comes from ML, and in ML x y already means function application, so you need some sort of explicit sigil to denote that y is a type, not an expression.
- In C-style languages, I think parsing var_name Type with an optional type would work, but the : makes it easier. “Do I see a : after var_name?” is a much more specific condition for a parser than “is whatever I am seeing a type?”, as types have a complicated grammar and can begin with a variety of symbols. Again, because C-style languages don’t use a space to mean application, I think this would still be unambiguous, just more annoying to parse.
- The : creates syntactic redundancy, which makes error correction easier. In fn f(x y z) with a missing comma, it’s hard to say which is a type and which is a pattern; the : makes that easier: fn f(x y:z).
I also think the arrow comes from ML and functional languages, but that doesn’t really answer the question. I’ve used quite a few languages with this syntax, and indeed do use Kotlin and Swift, which both use it, but I am still unsure why, other than considering it basically a fashion trend.
The type order part makes sense to me. Quite a few languages made that switch. Also it doesn’t add any redundancy.
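On the third: while the other points make sense, this one seems a bit like an excuse:
“The : creates syntactic redundancy, which makes error correction easier. In fn f(x y z) with a missing comma, it’s hard to say which is a type and which is a pattern; the : makes that easier: fn f(x y:z).”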
Couldn’t you say the same about a missing colon, and what if you switch them? Isn’t the important thing to return an error here and bail out anyway? After all, we have two tokens after which we expect a comma.
So it seems to boil down to easier parsing. Thanks for sharing your thoughts. I have to admit, I am not really a fan of the trade made (a simpler parser for more typing and line noise), but then it seems people are happy with it, or else it wouldn’t be so popular.
Thank you for your insights! :)
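The one-token-lookahead argument for the colon, made earlier in this subthread, can be sketched as a tiny parser fragment. This is an illustration under simplifying assumptions: tokens are already split into strings, a type is a single token, and Binding and parse_binding are hypothetical names, not part of any real compiler.

```cpp
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical result type: a binding name plus an optional type annotation.
struct Binding {
    std::string name;
    std::optional<std::string> type;
};

// Parses `name` or `name : Type` starting at tokens[pos] and advances pos.
// Whether a type follows is decided by looking at exactly one token.
Binding parse_binding(const std::vector<std::string>& tokens, std::size_t& pos) {
    if (pos >= tokens.size()) {
        throw std::runtime_error("expected a name");
    }
    Binding b{tokens[pos++], std::nullopt};
    if (pos < tokens.size() && tokens[pos] == ":") {
        ++pos;  // consume ':'
        if (pos >= tokens.size()) {
            throw std::runtime_error("expected a type after ':'");
        }
        // A real parser would run its full type grammar here.
        b.type = tokens[pos++];
    }
    return b;
}
```

With a types-first or colon-free syntax, the equivalent check would have to ask "does a type start here?", which needs knowledge of the whole type grammar rather than a single token comparison.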
So Rust with classes?
Rust is not C++ compatible. The main feature of this language is that you can link to C++ libraries (hopefully with fewer of the problems of straight C++).
It would be interesting if they added Rust compatibility as well, so that you can write a multi-lingual application using both C++ and Rust libraries. I would be quite interested, since there are best of breed libraries in both C++ and Rust that are not duplicated in the other ecosystem.
I’m really hopeful this experiment goes well. I use C++ all the time but really want a cleaned-up version of the language that is willing to delete old stuff and improve.
Where does it say that this is a Google project? I don’t doubt that it is, but I looked through the README and FAQ and found only one (ambiguous) mention of Google…
https://news.ycombinator.com/item?id=32152261