I had an unfortunate experience this week regarding the “static compilation” section of this post.
I was browsing Twitch looking for coders when I came upon a guy writing a Game of Life in Rust (compiled to Wasm).
His computer was quite slow and it was painful watching him recompile with minor changes; genuinely, the majority of the time I was watching was spent in various compilations. (Which I thought was weird, because Rust builds should be incremental, but somehow the incremental build kept randomly running from near scratch.)
Anyway, from what I understand from the documentation, cargo check does all the steps up to, but not including, linking.
So I suggested that in chat. The fella made a mock run of cargo check and, due to the recompilation of everything, it was slower than his incremental full build including linking. So I was labelled a troll and banned from his chat.
Anyway, my point is: I wonder how slow that linking really is, and is it hurting wasm compilation? Maybe I was wrong here.
It is a miracle that optimizing compilers ever terminate at all, and that their resulting code is so amazingly fast. For humans, predicting how to organize their code to find the right balance of compile time and run time is pretty much impossible.
I strongly disagree with this. I find nothing more annoying than an editor or IDE doing this while I’m typing. Of course it doesn’t typecheck, I’m only halfway through writing the line!
There’s a usability balance here. Pre-.NET VB was spectacularly bad at this: if you moved the cursor off a line (for example, to select something to copy and paste, or to fix another line where you’ve just spotted a bug while writing the current one) and the line didn’t fully pass semantic analysis, it would beep, colour the line red, and bring your cursor back. It’s quite common for C editors to do some fairly simple highlighting using existing token definitions in a line until you reach a semicolon, and then send the new text to the LSP for complete processing.
This ‘language server’ crap exists only to paper over how terrible the Rust compiler is. C doesn’t have language servers because why would it be necessary?
Huh? C and C++ both have LSP implementations. The clang-based one (clangd) is the most popular. C needs something like LSP because the complex interaction between the preprocessor and the command-line flags in the compiler invocation results in very different post-tokenisation output, which means that anything short of a full compiler is unable to correctly do even basic syntax highlighting for C (is this token a macro? A local variable? A global? A function name? Good luck figuring that out without running the preprocessor and building a full AST).
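As a made-up illustration of that ambiguity (the identifiers and the USE_FAST_PATH flag below are invented, not from any real project), the same token can be a macro, a function, or a local depending on what the preprocessor saw and which flags were passed:

```c
#include <stdio.h>

/* Illustrative only: try compiling with and without -DUSE_FAST_PATH.  */
#ifdef USE_FAST_PATH
static int process_fast(int x) { return x * 2; }
#  define process(x) process_fast(x)        /* 'process' is a macro here */
#else
static int process(int x) { return x + 2; } /* ...and a function here    */
#endif

static int helper(int process)              /* ...and a parameter here   */
{
    return process + 1;                     /* plain variable use        */
}

int main(void)
{
    /* Macro invocation or function call?  An editor cannot classify
     * this token without running the preprocessor with the real
     * command-line flags, which is why C highlighting ends up needing
     * a full compiler front end behind something like an LSP server.  */
    printf("%d\n", process(3) + helper(4));
    return 0;
}
```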
A lot of the code for cached prefix headers in Clang exists primarily to support the LSP and similar protocols. A typical C compilation unit has (at least) tens of thousands of lines of code before the first non-preprocessor line in the main file. Last time I looked, the default set of system headers on macOS expanded to around 8MiB of text; other platforms vary. Parsing all of this every time you want to see if a new line is correct is a very significant overhead.
Using LLVM in the first place was lazy
LLVM is well over a million lines of code at this point. Reimplementing it would be a colossal waste of developer resources unless you expected to get a very significant benefit. It’s well over a hundred engineer-years of effort to get something equivalent. Funding that level of investment requires a very clear expected return.
Relying almost entirely on LLVM’s optimiser was lazy.
No, it was a pragmatic start. Rust has a few things that are different from C in the codegen path, but not that many. LLVM sorts out the low-hanging fruit. That lets you quickly identify the things that are difficult to do after lowering to LLVM IR and add those later.
Very lazily designed, throughout, for lazy programmers that want the compiler to do everything for them because they can’t pay attention to the safety of their code themselves.
Unlike those non-lazy C/C++ programmers, whose memory safety bugs have been responsible for around 70% of CVEs published every single year over the last decade?
And I’d hazard to guess that 99% of C programmers don’t use it.
I mostly just use ctags and grep. ctags is pretty good, except that multiple definitions are often in the “wrong order” (e.g. the definition I want is item 40 or something). The “find all users of X” problem I solve with grep, but ideally there would be some way to have an index.
especially when you follow the practice of headers not including other headers.
I have never seen a C project which declared all types and functions in each header without including other headers.
Plan9 follows this policy.
Realizing the long-term problems with the use of #ifndef guards, the designers of the Plan 9 libraries took a different, non-ANSI-standard approach. In Plan 9, header files were forbidden from containing further #include clauses; all #includes were required to be in the top-level C file.
(source)
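For anyone who hasn’t seen the two conventions side by side, here is a minimal sketch (the file names and declarations are invented): a conventional header guards itself and includes its own dependencies, while a Plan 9 style header contains no #include lines at all and relies on the including .c file to pull everything in, in order.

```c
/* point.h, conventional style: #ifndef guard, and the header freely
 * includes whatever it depends on.                                    */
#ifndef POINT_H
#define POINT_H
#include <stddef.h>                       /* for size_t                 */

typedef struct { double x, y; } Point;
double point_dist(const Point *a, const Point *b);
size_t point_count(void);

#endif /* POINT_H */

/* point.h, Plan 9 style: no guard and no nested #includes; it assumes
 * its dependencies were already included by the .c file.              */
typedef struct Point Point;
struct Point { double x, y; };
double point_dist(Point *a, Point *b);

/* main.c, Plan 9 style: every .c file lists all of its #includes,
 * exactly once, at the top, in dependency order.                      */
#include <u.h>
#include <libc.h>
#include "point.h"
```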
I find nothing more annoying than an editor or IDE doing this while I’m typing. Of course it doesn’t typecheck, I’m only halfway through writing the line!
I use a language server to typecheck my Rust code, and I just have it set to not update and display new errors until I save the file.
I had an unfortunate experience this week regarding the “static compilation” section of this post.
I was browsing Twitch looking for coders when I came upon a guy writing a Game of Life in Rust (compiled to Wasm).
His computer was quite slow and it was painful watching him recompile with minor changes; genuinely, the majority of the time I was watching was spent in various compilations. (Which I thought was weird, because Rust builds should be incremental, but somehow the incremental build kept randomly running from near scratch.)
Anyway, from what I understand from the documentation, cargo check does all the steps up to, but not including, linking.
So I suggested that in chat. The fella made a mock run of cargo check and, due to the recompilation of everything, it was slower than his incremental full build including linking. So I was labelled a troll and banned from his chat.
Anyway, my point is: I wonder how slow that linking really is, and is it hurting wasm compilation? Maybe I was wrong here.
cargo check only recompiles every dependency on the first run. After that it’s fast.
Neat, I never thought about it this way:
[Comment from banned user removed]
I have never seen a C project which declared all types and functions in each header without including other headers.
hahaha
Plan9 follows this policy.
Realizing the long-term problems with the use of #ifndef guards, the designers of the Plan 9 libraries took a different, non-ANSI-standard approach. In Plan 9, header files were forbidden from containing further #include clauses; all #includes were required to be in the top-level C file.
(source)
Interesting. Compilers can often avoid reading headers multiple times, so this trick might not be as useful as it once was. That said, I have seen some work in this direction on projects I’ve been involved with recently, though I think that series is mostly for semantic purposes.
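For what it’s worth, the mechanism usually involved is the classic include-guard pattern (or #pragma once): GCC and Clang notice that the whole header body sits inside one #ifndef guard and skip a second include of the same file without even reopening it. A tiny sketch with made-up file names:

```c
/* vec.h -- the entire body sits inside a single guard, the shape the
 * preprocessor recognises for its multiple-include optimisation.      */
#ifndef VEC_H
#define VEC_H

typedef struct { float x, y, z; } Vec3;

#endif /* VEC_H */

/* user.c -- the header ends up included twice (say, directly and via
 * another header); only the first include does any real work.         */
#include "vec.h"
#include "vec.h"          /* guard macro already defined: skipped      */

Vec3 origin = { 0.0f, 0.0f, 0.0f };
```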
[Comment from banned user removed]
You’ve never done any work in compilers, have you?
Rust doesn’t “need” the LSP either. I don’t use it, and likely won’t.
I use a language server to typecheck my Rust code, and I just have it set to not update and display new errors until I save the file.
Besides this, it’s probably eating a non-negligible part of your CPU time.
And that’s just more battery usage, which means less battery life over time too…