I still think they should investigate doing the LISP solution to the problem where you combine several options:
1. Interpreter (e.g. REPL) that executes it instantly with dynamic checks for rapid prototyping of logic.
2. Fast, non-optimizing compile that at least does the safety checks.
3. Fully-optimizing compile that can be done in background or overnight while programmer works on other things like tests or documentation.
4. Incremental, per-function version of 2 or 3 for added boost. Rust is already adding something like that.
This combo maximizes mental flow of programmer, lets them tinker around with algorithms, quickly produces executables for test deployments, and will often do full compiles without interrupting flow with wait times. It’s the workflow available in a good LISP. It doesn’t have to be exclusive to LISP. Other languages should copy it. I’ll go further to say they should even design the language and compiler to make it easy to pull that off.
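As a rough sketch of how that combination could be wired into a build tool, the mapping from "what the programmer is doing" to "which kind of build to run" might look like the following. The names `Task`, `BuildTier`, and `pick_tier` are invented for illustration and do not correspond to any real compiler or Cargo API.

```rust
/// What the programmer is doing right now (hypothetical categories).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Task {
    /// Tinkering with an algorithm, wants instant feedback.
    Exploring,
    /// Ordinary edit-compile-test loop.
    EditTestCycle,
    /// Producing the final, fully optimized artifact.
    Release,
}

/// The kinds of build from the list above (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BuildTier {
    /// Interpreter/REPL with dynamic checks.
    Interpreter,
    /// Fast compile that still runs the safety checks.
    FastUnoptimized,
    /// Full optimization; `background` means nobody waits on it.
    FullyOptimized { background: bool },
}

/// Picks the cheapest tier that is good enough for the task.
pub fn pick_tier(task: Task) -> BuildTier {
    match task {
        Task::Exploring => BuildTier::Interpreter,
        Task::EditTestCycle => BuildTier::FastUnoptimized,
        Task::Release => BuildTier::FullyOptimized { background: true },
    }
}
```

The only design point here is that the slow tier is scheduled so that the programmer is not blocked on it; everything else is a placeholder.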
This is the default when cargo build runs.
This won’t work in practice. We should strive as best we can to make optimized builds fast to produce. A significant portion of the work I do in Rust is benchmarking and performance tuning. Given Rust’s target audience, I suspect I’m not alone. I need optimized builds to do that. If the turnaround time on a change is longer than a few minutes even, then this process becomes difficult.
I don’t want to live in a world where the most optimized builds take an entire evening to compile. I’m personally looking forward to incremental compilation.
quickly produces executables for test deployments
Not all tests are appropriate to run on debug builds. If you’re quickchecking a mildly expensive operation for example, then you might want to turn optimizations on for tests. I don’t do it all of the time, but sometimes? Sure.
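One way to live with both kinds of build, sketched here under the assumption that the expensive part is the number of random cases, is to scale the workload by build profile. `cfg!(debug_assertions)` is a real Rust macro check that is true by default for unoptimized (dev/test) builds and false for release builds; the helper name `case_count` and the two numbers are made up. Cargo also allows setting `opt-level` for the test profile in `Cargo.toml`, which is the more direct way to turn optimizations on for tests.

```rust
/// Number of random cases to run per property (hypothetical helper).
/// The concrete numbers are illustrative, not recommendations.
pub fn case_count() -> u32 {
    if cfg!(debug_assertions) {
        // Unoptimized build: keep the loop short so tests stay quick.
        100
    } else {
        // Optimized build (e.g. `cargo test --release`): afford more cases.
        10_000
    }
}
```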
I don’t want to live in a world where the most optimized builds take an entire evening to compile.
Not saying that. I’m saying full builds of a module in languages like C++ or Rust can take enough time to interrupt flow if there’s no incremental option. So, a quick compile for the edit-compile-test cycle is useful. Then, the full compile should be able to run in the background so developer can continue flowing onto the next task while it happens. Minimal waiting on developer’s side. You can make that full build as fast as you want. Although, the biggest projects might still build overnight if it’s a clean build.
“If you’re quickchecking a mildly expensive operation for example, then you might want to turn optimizations on for tests.”
Sure. Most tests I see are for correctness, though. Running them through a REPL/interpreter works until there are a lot of them or they’re time consuming. The fast, non-optimized build helps with that. The most exhaustive ones will use an optimized build.
Most tests I see are for correctness, though.
QuickCheck tests are for correctness. If you haven’t heard of QuickCheck before, it’s a combination of a fuzzer and a shrinker, so it can run a number of tests. It’s often referred to as “property based testing,” although I find that description a bit restrictive. If your tests are slow, QuickCheck can exacerbate it.
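To make the "fuzzer plus shrinker" idea concrete, here is a minimal standard-library-only sketch. It is not the API of the real `quickcheck` crate, whose generators and shrinkers are far more capable; `Rng`, `gen_vec`, `shrink`, and `check` are all names invented for this example. It also shows why slow properties hurt: the property is called once per random case and again for every shrink candidate.

```rust
/// Tiny xorshift generator so the sketch needs no external crates.
pub struct Rng(u64);

impl Rng {
    pub fn new(seed: u64) -> Self {
        // xorshift must not start from zero.
        Rng(seed.max(1))
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// "Fuzzer" half: a random vector of 0..=19 bytes.
fn gen_vec(rng: &mut Rng) -> Vec<u8> {
    let len = (rng.next_u64() % 20) as usize;
    (0..len).map(|_| (rng.next_u64() % 256) as u8).collect()
}

/// "Shrinker" half: every input obtained by dropping one element.
fn shrink(v: &[u8]) -> Vec<Vec<u8>> {
    (0..v.len())
        .map(|i| {
            let mut smaller = v.to_vec();
            smaller.remove(i);
            smaller
        })
        .collect()
}

/// Runs `prop` on `cases` random inputs. On failure, returns a
/// counterexample that no single-element removal can shrink further.
pub fn check(prop: impl Fn(&[u8]) -> bool, cases: u32, seed: u64) -> Option<Vec<u8>> {
    let mut rng = Rng::new(seed);
    for _ in 0..cases {
        let mut failing = gen_vec(&mut rng);
        if prop(&failing) {
            continue;
        }
        // Keep replacing the input with a smaller one that still fails.
        'shrinking: loop {
            for candidate in shrink(&failing) {
                if !prop(&candidate) {
                    failing = candidate;
                    continue 'shrinking;
                }
            }
            return Some(failing);
        }
    }
    None
}
```

For example, checking the (false) property "every byte is below 128" with `check(|v| v.iter().all(|&x| x < 128), 100, 42)` should report a one-element vector, because the shrinker strips everything except a single offending byte.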
I don’t think you really addressed my central point: if it’s OK for optimized builds to be slow, then it becomes harder to work on performance sensitive code.
It’s pretty amazing. I regularly recommend it. :)
“I don’t think you really addressed my central point: if it’s OK for optimized builds to be slow, then it becomes harder to work on performance sensitive code.”
I did address it even though it’s not actually countering anything I say. It’s a fact that ultra-optimizing compilers will always take more time than interpreters or barely-optimized compiles. Gets worse the more optimizations you do. The worst in time are probably those that compile the program with instrumentation, run it against the test suite for profiling, and do a second compile to optimize the whole program for that profile.
As I said before, make your optimizing compilers as fast as you want. They’ll never be as fast as other types, though. The delay will always be higher. That speed difference is the reason many people won’t do source-based Linux distros. Compile-to-usable system takes forever, whereas a Golang-fast compile to a usable system followed by a GCC-optimized compile in the background and an update on reboot would be much more usable.
In the general case, I said I’m preserving flow. Someone might lose that waiting even a minute or maybe less. C++ and Java compiles used to slow me down since they took too long. LISP’s low-optimization, incremental compiles took 0.25 seconds on CPUs from 10+ years ago. The Oberon/Pascal family, which Go fits in, are known to do 100,000+ lines a second with good but not ideal running speeds. That most apps are built in PHP, C#, Java, Python, Go, etc. with developers preferring rapid development over raw performance indicates this is a meaningful capability to have.
So, once again, I’m not arguing optimized builds should be avoided or remain slow. I’m saying less or non-optimized builds that happen as instantly as possible should be available to speed up iterations and maintain focus. If optimized is that fast in that project, then default on that. If not, it can be done later when necessary or just beneficial. On non-performance-critical apps, it can be done in batches throughout the day or overnight by the build system. You come in next morning to find the same code you specced, built, and wrote tests for yesterday now runs significantly faster today with a testing report. Might have even done that profile-based compilation I mentioned that takes so long since you didn’t personally have to wait for it.
I did address it even though it’s not actually countering anything I say.
What I’m interpreting you saying is that we should care a lot less about how long an optimized build takes. I simply strongly disagree with that emphasis. I don’t really disagree with much of your comment and I don’t understand why you feel it necessary to explain that optimized builds will take longer than non-optimized builds. I also don’t understand why you’re trying to tell me how important flow is. Both of those points are uncontroversial and I’m not disputing them.
Allow me to re-focus my point. You said:
From where I’m standing, this goal is an explicit acknowledgment that not only will optimized builds be slower than non-optimized builds (which is utterly uncontroversial) but that it’s OK for optimized builds to be so slow that it breaks flow. I want optimized builds to be fast enough that they don’t break flow, hence the reason I’m talking about working on performance critical code.
For a lot of the code I write, I don’t care at all about optimized builds until I actually need to run something. But I’m not talking about that. I’m talking about my flow when I need to do development with optimized builds. If it becomes “okay” for an optimized build to take as long as an entire evening, then you’ve completely destroyed that flow that we both recognize as important for folks that need optimized builds.
I’m not saying any of the following:
In summary: we both agree flow is important, but I want to put a stronger emphasis on the importance of flow when optimized builds are necessary for development.
I think the problem here is that your position could be easily interpreted as “I care more about flow than about getting every possible optimization into a build.”
What nickpsecurity is suggesting is that we should make sure to allow for full optimizing builds to take however long they need so we aren’t tempted to ignore optimization chances simply because they take longer than we would prefer–which is of course a background artifact of your position.
I’m not really sure how to clarify my position further, but your interpretation is certainly not what I intended to imply. My entire issue is with the emphasis placed on it being “okay” for optimized builds to take a very long time. Instead, I think we should strive to produce optimized builds in a way that doesn’t break flow. I recognize that this may be a seemingly competing goal with producing optimized builds in the first place, but I don’t see it that way. It just means we shouldn’t be complacent about optimized builds being horrendously slow. (I’m not saying anyone is today, or that this is the reason why optimized builds are slow today, I’m specifically talking about @nickpsecurity’s vision and disagreeing with their stated goals.)
That’s also an angle in my post that I could’ve clarified better. The goals of people wanting optimization are (a) max performance at runtime and (b) fast compile during development. The time taken for (b) will usually go up as extra results are obtained on (a). The best optimizations available, which a number of companies use, do whole-program or staged analysis. Nobody has gotten these to barely perceptible speeds since they do a lot of analysis or even runtime tests. So, the developer wanting fast compiles will be forced to choose a lower level of optimizations to get them. Or buy a cluster with low-latency interconnect plus rewrite the compiler in question for parallel runs in MPI or something.
So, today’s tech is never bringing us ultra-fast iterations with ultra-optimized compiles outside HPC clusters. Even if you’re “optimizing,” you’re still going to be sub-optimal if compile times matter. Just a question of how much performance you want to sacrifice to get what upper-bound on compilation speed. My model implicitly sacrifices as follows: speed limit to run it in a LISP interpreter for exploratory programming, the limit of Go compile/runtime as baseline for average app that isn’t performance-critical, and with a batch (or parallel) solution for time-consuming builds with high-optimization. That is, until someone figures out how to get the latter to speed of a non-optimized compile.
Appreciate your further clarification. Then we’re mostly in agreement about things. The problem is:
“I’m talking about my flow when I need to do development with optimized builds. ”
That’s actually where the vast majority of compiler development has gone other than supporting language features. The result is that the most optimizing compilers do perform faster than they used to. They still break flow on a lot of projects. Further, if you’re using whole-program analysis or profile-based ones… highest forms of optimization outside assembly… then flow might be pretty hopeless given all the analysis involved, esp runtime ones. The solutions many used back in the day were Beowulf clusters where a whole wall of machines analyzed, compiled, and tested each file or whatever. A big problem…
So, it’s worth continuing to improve on. Ideally, we could crank out highly-optimized builds at imperceptible speed. It just may not be possible depending on what level of optimization you want. So a critical assumption of my original comment was that we can’t currently and may never be able to do that. Hence, several speeds of compilation for various use cases.
However, there are two other possibilities outside us just getting lucky. I noticed looking at optimization papers over the years that the speedups you get on specific types of problems often happen in specific ranges that might allow estimates. In hardware, they also have the problem that optimizing compiles (physical synthesis) take way too long. What they came up with were models that let you estimate speed, power, size, etc. based on the code. They were accurate enough that you could assess and iterate your HW code pretty well before any physical synthesis. Similarly, researchers might apply highly-optimizing compilers on as many FOSS components as possible to see what unoptimized vs O1 vs O3 does to each. Then, a compiler plugin can be created that combines analysis with your annotations or tests to show you how much better version B was than A or vice versa along specific metrics with performance estimates from likely to worst case.
Again, this is all in addition to continuing trying to make full optimization as fast as possible. Thought you might find it interesting given it was already done in hardware. Software would likely have to be constrained to pull it off, though, like with WCET analysis in embedded. My favorite route for now is something like the QBE backend where we find what few optimizations get most bang for the buck to keep the compiler simple and fast. Someone also wrote a genetic algorithm once for Haskell compiler options to find the ideal combination for a specific program. Basically, doing more with less CPU work given that all optimizations aren’t equal in cost-benefit. Maybe also allowing the programmer to customize it on a per-module basis to ignore what didn’t matter. IBM’s PL/S language had a primitive form of that.
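The "bang for the buck" idea can be sketched as a simple greedy selection: rank passes by estimated runtime gain per unit of compile time and take them until a compile-time budget runs out. Everything here is an assumption for illustration: the `Pass` struct, `pick_passes`, and any cost or speedup numbers fed to it are hypothetical, real optimization passes interact rather than adding up independently, and greedy selection is a heuristic, not what QBE or the Haskell genetic-algorithm work actually does.

```rust
use std::cmp::Ordering;

/// A hypothetical optimization pass with made-up estimates.
#[derive(Debug, Clone)]
pub struct Pass {
    pub name: &'static str,
    /// Estimated extra compile time in milliseconds.
    pub compile_cost_ms: u32,
    /// Estimated runtime improvement in percent.
    pub speedup_pct: f64,
}

/// Greedily picks the passes with the best speedup per millisecond of
/// compile time that still fit into `budget_ms`.
pub fn pick_passes(mut passes: Vec<Pass>, budget_ms: u32) -> Vec<Pass> {
    // Best ratio first.
    passes.sort_by(|a, b| {
        let ratio_a = a.speedup_pct / f64::from(a.compile_cost_ms);
        let ratio_b = b.speedup_pct / f64::from(b.compile_cost_ms);
        ratio_b.partial_cmp(&ratio_a).unwrap_or(Ordering::Equal)
    });

    let mut remaining = budget_ms;
    let mut chosen = Vec::new();
    for pass in passes {
        // Skip anything that no longer fits; cheaper passes may still fit.
        if pass.compile_cost_ms <= remaining {
            remaining -= pass.compile_cost_ms;
            chosen.push(pass);
        }
    }
    chosen
}
```

A per-module version of this would simply call `pick_passes` with a different budget for each module, which is one way to read the "ignore what didn’t matter" suggestion.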
NanoPass techniques are one of the most promising for implementing these fast, optimizing compilers with user-preferred optimizations. I include the intro paper plus a dissertation showing they’re already at max 2x slower. If the original is within flow range, then stuff built that way might be too. Especially once they write it in hand-optimized C, Ada, or Rust. ;) They also have a GitHub for experimentation.
That’s actually where the vast majority of compiler development has gone other than supporting language features.
I know this, and I was not disputing it. In fact, for what they do, I’m pretty happy with the performance of optimizing compilers today. I was disputing your stated goals as things I disagree with. My apologies, but I don’t know how to clarify my position any further, and I suspect we agree more than we disagree.
“My apologies, but I don’t know how to clarify my position any further, and I suspect we agree more than we disagree.”
Yeah, I think we’re mostly on the same page. That the disagreement kept getting more narrow supports it. Have a good night. :)
Beyond all the things /u/burntsushi mentions, rustc also has a mode that just typechecks and does no translation. As the author notes, a lot of time is spent in LLVM, but this avoids LLVM completely. It’s available through cargo using cargo-check.
I appreciate the two of you mentioning it. I’ll keep it in mind for future comments as that’s a good feature.
It would have been nice if he timed compiling rustc with this new version vs the old version.