1. 106
  1.  

    1. 30

      Rust’s compile times are slow. Zig’s compile times are fast.

      TBH, I’d take this with a grain of salt. Zig’s compile times are not that fast yet (as of Zig 0.13.0). For example, rebuilding TigerBeetle after a trivial change in Debug takes 7 seconds:

      matklad@ahab ~/p/tb/work ((1af79195))
      λ ./zig/zig build
      matklad@ahab ~/p/tb/work ((1af79195))
      λ vim /src/vsr/replica.zig
      matklad@ahab ~/p/tb/work ((1af79195))
      λ git diff
      diff --git a/src/vsr/replica.zig b/src/vsr/replica.zig
      index cbd54adfc..529d09e00 100644
      --- a/src/vsr/replica.zig
      +++ b/src/vsr/replica.zig
      @@ -7357,7 +7357,7 @@ pub fn ReplicaType(
                   // Using the pipeline to repair is faster than a `request_prepare`.
                   // Also, messages in the pipeline are never corrupt.
                   if (self.pipeline_prepare_by_op_and_checksum(op, checksum)) |prepare| {
      -                assert(prepare.header.op == op);
      +                assert(prepare.header.op != op);
                       assert(prepare.header.checksum == checksum);
      
                       if (self.solo()) {
      matklad@ahab ~/p/tb/work ((1af79195) *)
      λ /usr/bin/time ./zig/zig build
              7.42 real         7.33 user         0.37 sys
      

      That feels like it’s way too high. And, of course, this is mostly due to LLVM (and, upstream of that, due to monomorphisation).

      I think, as of today, the bigger difference for a typical project would be not the language choice, but rather how carefully, with respect to compile times, the project code is written.

      Though, Zig is definitely positioned to compile snappily eventually, both culturally (compilation speed is the main concern, and folks are not afraid of replacing LLVM) and architecturally (Zig intentionally doesn’t try to fit the compilation model from the 70s, where the main problem was that the source code doesn’t fit in RAM).

      1. 7

        I think Richard tasted the Zig custom backend juice already and wants more.

        Or, more seriously put, he does mention custom backends in the Gist, so I’m guessing he’s using that as the point of comparison.

        1. 6

          Yeah, I revised the wording to clarify that I’m talking about Zig features that aren’t stable yet. 😄

          We only have like 10K lines of Zig for the standard library, and the new compiler stuff is even less than that (so far). The bet is that those features will have stabilized before our Zig code base gets so big that we’d have painful compile times without them.

          In contrast, there isn’t anything comparable on the Rust roadmap in terms of performance improvements. The Cranelift backend has been WIP since 2019, when Roc had its first line of code written. Although I do expect it will land eventually, it’s not like Zig’s x86_64 backend, which is (as I understand it) close enough to done that it may even land in the next release in about a week, and which of course will be significantly faster than a Cranelift backend would be. (And as I understand it, an aarch64 backend is planned after x86_64 lands - to say nothing of incremental compilation etc.)

        2. 3

          Yes, I haven’t understood the take that Zig compiles faster than Rust. For now, a Hello, world! comparison favors Rust, for instance. People probably need to be a bit more specific than that.

          1. 13

            It’s absolutely clear why Zig should be significantly faster to compile (assuming non-distributed builds), once all the things in the pipeline are finished:

            • The big one is a sane linking model. The Rust and C++ approach to compiling generic containers is absolutely an emperor-without-clothes situation. It’s a gigantic waste to translate Vec<usize> in every compilation unit just to have all but one copy eliminated by the linker in the end.
            • I think lazy compilation would likely help a lot, but I don’t have hard numbers here.
            • Parsing Zig is embarrassingly parallel, while parsing Rust requires name resolution, macro expansion, and, in the case of procedural macros, arbitrary code execution.
            • I think comptime reflection fundamentally creates less work for the compiler than syntactic macros.
            • The compiler is architected with performance as a primary goal, rather than a nice-to-have.
            • Things like hot-patching of binaries are on the roadmap.

            It’s just that, until you get rid of LLVM, you can’t actually measure the speed of everything else. Maybe codegen is fundamentally so slow that you don’t have to really optimize everything else, besides the linking model, but I am 0.7 sure that Zig+native backend will run circles around Rust+cranelift.

            1. 4

              I have read your blog articles since you started working in Zig and have been super curious about the process — hence why I embraced it as well. I agree that it’s expected that Zig should be faster (I think I mention it in my article). In the end, Rust has much more work to perform (whether we talk about proc-macro or not; think all the static checks it does regarding ownership, for instance).

              As mentioned in the first sentence of my article, it’s a love-hate relationship, and it’s pretty hard to fully appreciate the language right now because of all the holes it has everywhere.

          2. 2

            Our rust compiler implementation definitely could compile a lot faster. A huge contributor to the compile times is the fact that it grew organically. If we were to rewrite it again in rust, I’m sure it would compile a lot faster.

            These times are from an M1 mac. They are approximately the same as an intel i7 linux gaming laptop (it used to be that the m1 was way faster; not sure when they became even). All of the below is just building the roc binary. Building tests and other facilities is much, much worse (and we already combine many test binaries to reduce linking time, though we could do it more).

            After changing something in the cli (literally zero dependencies and the best case possible):

            Finished dev [unoptimized + debuginfo] target(s) in 4.15s
            

            After changing something insignificant in the belly of the compiler:

            Finished dev [unoptimized + debuginfo] target(s) in 16.95s
            

            And for reference, clean build:

            Finished dev [unoptimized + debuginfo] target(s) in 1m 58s
            

            And for reference, rebuilding tests (just build, not execution) after the same belly of the compiler change:

            Finished test [unoptimized + debuginfo] target(s) in 33.55s
            
            1. 10

              I don’t think you’d necessarily need to rewrite it. I faced similar problems in the past with the compiler for Inko and found that moving a bunch of code into separate crates helped improve compile times dramatically. I also aggressively reduced the amount of dependencies (as much as that’s possible in a Rust project at least), which also helps.

              To provide some numbers: a clean debug build on my X1 Carbon (which has an i5-8265U) takes about 24 seconds, while a release build takes 45 seconds. An incremental build for a trivial change takes about 2-3 seconds at most. That’s for about 80 000 lines of Rust code, including tests.

              In short, a rewrite can help but I suspect there are easier ways to improve compile times that don’t require a full rewrite.

              1. 3

                While compile times are a factor, I think that:

                1. We already know we need to rewrite much of the compiler for other reasons (correctness, robustness, maintainability)
                2. We generally find zig makes better tradeoffs for writing a compiler like roc than rust, so we would prefer to use it.
                3. We have tried to move the needle on compile times multiple times in the past and it has not been fruitful.

                Also, we have 300k+ lines of rust currently. So it may partially be a scale thing.

                1. 2

                  Last I checked, the Rust compiler was much larger than that.

                  (I have no opinion on a rewrite other than I think it would be cool.)

                2. 2

                  I would love to know what’s causing this, but that’s probably a lot of work to profile. Are you using a faster linker than the default one, like mold?

                  1. 2

                    Linking is definitely a bottleneck when building tests. We have seen no gains with mold over lld, but see significant gains with lld over system linkers.

                    That said, even ignoring linking, we still have a pretty heavy rust compilation loop. I think part of it is bad structure leading to a lot of pieces recompiling for minor changes (though some of it is likely more fundamental).

                    Also, I would assume the biggest gains will be when we can use zig’s self hosted backends.

                    1. 1

                      We tried mold and unfortunately it didn’t move the needle noticeably; we didn’t even bother to add it to CI or to write a readme note about how to configure mold for local development.

                      1. 1

                        I can somewhat echo these findings: using lld or mold over the Linux system linker can have a dramatic improvement, but the difference between e.g. lld and mold is quite small and seems to depend greatly on the project in question. I also could’ve sworn the difference used to be bigger (as in mold being faster than lld), so perhaps lld’s performance has improved recently. For example, if I compile Inko’s test suite without optimizations (producing 1300 object files in the process), the link timings are as follows:

                        • GNU ld: 0.92-1.0 seconds (it varies slightly between runs)
                        • lld: 0.21 seconds
                        • mold: 0.26 seconds
                  2. 1

                    Zig intentionally doesn’t try to fit the compilation model from the 70s, where the main problem was that the source code doesn’t fit in RAM

                    I’d like to understand this better. How does Zig differ in this regard, as compared to, say, Rust or Go?

                    1. 24

                      The way C compilation works is that you compile each individual .c file, in isolation, into an object file containing machine code, and then combine the resulting set of object files into a single executable using a linker.

                      This compilation model is incompatible with generic programming / templates / monomorphization. If you have a function with a type parameter in a.c, you can’t compile this function down to machine code until you see the actual type it is being used with. E.g., if you have fn mystery<T>(t: T), you can’t really compile mystery, you can only compile mystery::<i32>.

                      Languages like Rust and C++ want to re-use the C compilation model, but they also have monomorphization. So, the way they do this is via redundant compilation and linker-stage pruning. If two compilation units use mystery::<i32>, then it will be compiled twice, in the context of each compilation unit. When linking the object code of the two corresponding units together, the redundant copy is eliminated.

                      Zig solves this by not compiling things in isolation. Everything is compiled at the same time, so monomorphizations are globally deduplicated.
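
                      A minimal Rust sketch of that (the function is made up for illustration): the generic function itself can’t be lowered to machine code, only its concrete instantiations can.

                      ```rust
                      // `mystery` itself cannot be lowered to machine code: its body depends
                      // on the concrete `T`. Each use with a new type (mystery::<i32>,
                      // mystery::<&str>, ...) stamps out a separate compiled copy.
                      fn mystery<T>(_t: T) -> &'static str {
                          std::any::type_name::<T>()
                      }
                      ```

                      In a multi-crate build, every compilation unit that calls mystery::<i32> compiles its own copy of that instantiation, and the linker later throws all but one away.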

                      1. 2

                        Very clear, thank you.

                      2. 16

                        In my understanding, which may be wrong, it’s basically that zig does two major things differently in this area.

                        The first is that it doesn’t support separate compilation. You compile your whole program, rather than compiling some code into libraries that you then link against other code.

                        The second is that they do what the JavaScript folks call “tree shaking,” which is like dead code elimination but in reverse. That is, they look for live code, and compile only that, rather than compiling everything and then looking for things that aren’t referenced and can be removed.
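
                        As a toy sketch of that live-code idea (the call-graph representation here is my own, not Zig’s actual data structures): reachability from the entry point decides what gets looked at in the first place.

                        ```rust
                        use std::collections::{HashMap, HashSet};

                        // Toy "tree shaking": walk the call graph from the entry point and
                        // collect only the functions that are actually reachable. Anything
                        // not in the result is never even analyzed, let alone compiled.
                        fn live_functions(calls: &HashMap<&str, Vec<&str>>, entry: &str) -> HashSet<String> {
                            let mut live = HashSet::new();
                            let mut work = vec![entry.to_string()];
                            while let Some(f) = work.pop() {
                                if live.insert(f.clone()) {
                                    for callee in calls.get(f.as_str()).into_iter().flatten() {
                                        work.push(callee.to_string());
                                    }
                                }
                            }
                            live
                        }

                        // Tiny demo graph: `dead` is defined but unreachable from `main`.
                        fn demo_live() -> HashSet<String> {
                            let mut calls: HashMap<&str, Vec<&str>> = HashMap::new();
                            calls.insert("main", vec!["parse"]);
                            calls.insert("parse", vec!["lex"]);
                            calls.insert("dead", vec!["deader"]);
                            live_functions(&calls, "main")
                        }
                        ```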

                      3. 1

                        I think, as of today, the bigger difference for a typical project would be not the language choice, but rather how carefully, with respect to compile times, the project code is written.

                        I do think though that for compilers and stuff like that Zig makes it easier to write code that it compiles first, compared to Rust.

                        1. 7

                          I would think so, as, it seems to me, Zig would be more economical with its abstractions.

                          Though, both Rust and Zig make it rather easy to step into the excess-monomorphisation trap. At least with TigerBeetle we definitely are already in the territory where we monomorphise a bit too much, and we need to pare that back a bit (not because we’ve hit excessive compile times/binary size already; it’s just that it makes sense to virtualize a couple of things, we haven’t gotten to it yet, and what lay on the path of least resistance originally was the bloaty solution).

                          1. 1

                            I do think though that for compilers and stuff like that Zig makes it easier to write code that it compiles first, compared to Rust.

                            Could you elaborate?

                            1. 1

                              I meant to say “compiles fast”. As for why I think Zig makes it easier: Zig does not care much about memory safety but gives you explicit control over memory instead. It makes writing very generic code less convenient than Rust, so you do less of that.

                              But as matklad pointed out, you can also fall into a monomorphisation trap in Zig. From my limited exposure to Zig I find that to be slightly less trappy though.

                              1. 4

                                Seems unlikely that upholding memory safety rules (data flow analysis + type constraints) would be the biggest distinguisher in compile times between the two. Usually, it’s processes that require slow/re-attempted evaluation, like macros, comptime, and generics. Or uncached/serialized code generation (solved by codegen-units, incremental compilation, and faster linkers).

                                It’s also pretty common to write generic code in Zig. For example, std.io contains type-based stream wrappers similar to Rust Iterator composition. And almost all TigerBeetle components are generic for DST/mocking.
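
                                For instance, an ordinary iterator chain in Rust (nothing project-specific, just the standard library) nests a new generic type per adapter, and every distinct combination is monomorphized into its own code:

                                ```rust
                                // The chain's full type is roughly
                                // Map<Filter<Copied<slice::Iter<i32>>>, _>; each distinct stack
                                // of adapters and closures gets its own specialized machine code.
                                fn sum_even_times_ten(values: &[i32]) -> i32 {
                                    values
                                        .iter()
                                        .copied()
                                        .filter(|x| x % 2 == 0) // Filter<Copied<...>>
                                        .map(|x| x * 10)        // Map<Filter<...>>
                                        .sum()
                                }
                                ```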

                                1. 3

                                  I think it’s pretty easy to avoid overusing generics in Rust too. And if you need something heavily generic from a library, you can use the ugly trick of instantiating all the copies you need in a separate crate, so they can’t accidentally affect your incremental builds too much.

                                  Since using lots of generics is something you have to choose to do, rather than something you might do by accident, I feel like you can always just choose to not use lots of generics. Or choose to use more trait objects and fewer monomorphized generics.

                                  1. 1

                                    Or choose to use more trait objects and fewer monomorphized generics.

                                    I don’t know all that the Rust compiler is capable of, so I’m hoping somebody can confirm my understanding. My understanding is that trait objects are kind of like a Box<>, hence it’s kind of like a vtable pointer, so calling a method on a trait object is “chasing a pointer”. Is that always the case? Or are there cases where the compiler makes other choices? Just curious.

                                    1. 4

                                      yes, a &dyn Foo or Box<dyn Foo> has one pointer to the object and one pointer to a vtable. Calling a method needs to load the address of the function from the vtable. It’s not too bad, though, at least in my opinion.
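
                                      A small sketch of both flavors (trait and type names invented for illustration): the generic version is monomorphized and statically dispatched, while the dyn version goes through the vtable.

                                      ```rust
                                      trait Shape {
                                          fn area(&self) -> f64;
                                      }

                                      struct Square {
                                          side: f64,
                                      }

                                      impl Shape for Square {
                                          fn area(&self) -> f64 {
                                              self.side * self.side
                                          }
                                      }

                                      // Monomorphized: one compiled copy per concrete T, call
                                      // resolved statically and eligible for inlining.
                                      fn area_generic<T: Shape>(s: &T) -> f64 {
                                          s.area()
                                      }

                                      // Single compiled copy: `&dyn Shape` is a fat pointer
                                      // (data pointer + vtable pointer), and the call loads the
                                      // function address from the vtable at runtime.
                                      fn area_dyn(s: &dyn Shape) -> f64 {
                                          s.area()
                                      }
                                      ```

                                      The fat pointer is visibly two words wide: size_of::<&dyn Shape>() is twice size_of::<&Square>().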

                          2. 19

                            The Roc FAQ explains that, earlier in the project,

                            Roc’s standard library was briefly written in Rust, but was soon rewritten in Zig.

                            There were a few reasons for this rewrite.

                            • We struggled to get Rust to emit LLVM bitcode in the format we needed, which is important so that LLVM can do whole-program optimizations across the standard library and compiled application.
                            • Since the standard library has to interact with raw generated machine code (or LLVM bitcode), the Rust code unavoidably needed unsafe annotations all over the place. This made one of Rust’s biggest selling points inapplicable in this particular use case.
                            • Given that Rust’s main selling points are inapplicable (its package ecosystem being another), Zig’s much faster compile times are a welcome benefit.
                            • Zig has more tools for working in a memory-unsafe environment, such as reporting memory leaks in tests. These have been helpful in finding bugs that are out of scope for safe Rust.
                            1. 13

                              Zig is such a productive language. Maybe it’s because there are relatively few moving parts, but I feel I spend a lot more time solving my actual problem when writing in it.

                              1. 11

                                Very interesting! Gleam went through a similar process, with a full rewrite from Erlang to Rust. It was a lot earlier in the development process though.

                                I wonder how long they expect the rewrite to take, it looks like they have about 350k lines of Rust on their main branch.

                                1. 11

                                  We didn’t really try to make predictions, honestly - the discussion was more about “here are the things we’re planning to do anyway, and this seems like the most sensible way to do them.”

                                  How long will that take? We’ll find out! 😄

                                  1. 7

                                    I do hope you continue to post updates! I love finding out what’s happening in the Roc world, but I’m not involved enough to stay up to date without them.

                                    Break a leg!

                                    1. 6

                                      Thanks! I’m also not super involved in Gleam things, but it comes up regularly in language design discussions on Roc Zulip.

                                      You’re doing awesome things and it’s really cool to see Gleam seeing more and more use - here’s to both languages achieving their goals! 😃

                                2. 8

                                  Nice writeup, I also can’t deal with slow compile times – it takes me out of the zone. It seems like with corporate codebases, this is to some extent unavoidable – the successful projects are usually large and have long compile times.

                                  But open source should be fun, so I totally get the motivation

                                  Though I also wonder if there are ways to speed up the Rust build, and the tests. I also wonder if they will have a hybrid Rust-Zig program for a while? Or is it going to be just an all-new Zig compiler?


                                  My solution to this dilemma back in 2019 was to write a Python-to-C++ translator :-) [1] Some people understandably thought that was weird, but in retrospect I think it was a good tradeoff, although it certainly took a long time.

                                  • In development, there are zero compile times, because it’s Python.
                                    • But MyPy is expressive and strict, so you still get strong types.
                                    • Sometimes I just run the program, and sometimes I type check it first. You get to choose! And this is actually useful, otherwise you might get bored while waiting to “reverse engineer” bash :-)
                                  • In production, it’s fast C++ code.
                                    • We pay a cost for garbage collection, but that also makes it memory safe

                                  This also avoided a big rewrite! Although it actually took quite a while to add Python type annotations, partly because we had to rewrite parts that used dynamism into “compile time textual metaprogramming”. And then whittle down the typed program to a subset that can be translated to C++.


                                  So I didn’t think of it this way initially, but the same translator achieved a bunch of goals

                                  1. Avoid rewriting by hand – even though we have very good tests, I would still be paranoid about all the corner cases we painstakingly figured out
                                  2. Avoid the typical tradeoff with slow compile times and fast runtime execution
                                  3. Speed up the Oils interpreter by 2x - 50x. Python is definitely not fast enough! :)

                                  The way I initially thought of it was as “spec driven”. The Python implementation is the executable spec for the language. In retrospect I think this also worked. When writing an interpreter, the problem is that you often get host language/runtime semantics “leaking” through, e.g. with integers, floats, and strings.

                                  So having 2 complete implementations forces you to figure out all those issues, and not just depend on the host language. When the host language is C/C++, there is undefined behavior in integer math, so you don’t want that leaking through. Shell/awk/etc. all do that – they just borrow C integers from whatever compiler/platform you’re using.
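
                                  A tiny sketch of what pinning the semantics down looks like (the function is hypothetical, not Oils’ actual code): the spec decides what overflow means, instead of inheriting whatever the host compiler does.

                                  ```rust
                                  // In C, signed overflow is undefined behavior, so an interpreter
                                  // written in C silently inherits whatever its compiler does. An
                                  // executable spec has to choose explicitly; here we
                                  // (hypothetically) define addition as wrapping two's-complement
                                  // arithmetic, identical on every host.
                                  fn spec_add(a: i64, b: i64) -> i64 {
                                      a.wrapping_add(b)
                                  }
                                  ```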

                                  [1] Old post here - https://www.oilshell.org/blog/2022/05/mycpp.html - it is no longer as hacky, and I got to learn about type systems and compilers

                                  1. 7

                                    It’s a bold decision, but it’s well-argued. I wish them luck!

                                    That opening argument does seem to miss out an important point about compiler rewrites though…

                                    When a language gets rewritten in itself, it’s not just a proof that rewrites can be successful, or that “older + wiser = better”. It’s also a chance to put the language’s design and implementation to the test.

                                    Think you’ve designed the best language for writing programs? Prove that design by using it to write your compiler! Maybe you’ll find it’s not as pleasant to use as you thought, and you’ll have to rework the design.

                                    Think your compiler’s fast? Prove it by having to live with the compile times yourself! Maybe you’ll find it’s not as fast in practice as it is in benchmarks, and you’ll have to optimise the implementation.

                                    In writing your language in itself, you’ll inevitably find problems that you missed, and that will force you to make improvements to your language. It’s classic dogfooding, with every improvement you make for yourselves translating into benefits for your users too.

                                    Or to put that another way, a project this large will inevitably find problems with Zig, and any feedback will go into making Zig better. That’s good for programming in general, but is it a lost opportunity to have made Roc better? Perhaps… 🤔

                                    1. 9

                                      There’s a flip side to this. By dogfooding your language this way, you will inevitably make its design more catered towards writing compilers. Which may not be what you want, depending on your target audience.

                                      1. 2

                                        True, true. 😅

                                        1. 3

                                          Yeah, there are some use cases that I’m just intentionally not targeting with Roc, and compilers are one of them.

                                          That’s not to say I would discourage someone from writing a compiler in Roc if they want to, but rather that it’s a non-goal to make Roc be a language that’s great for writing a compiler in (except perhaps unintentionally). 😄

                                    2. 4

                                      Maybe that means I can build it and possibly contribute. I’ve had more luck with Zig projects than Rust, when it comes to running out of disk or RAM, but of course it’s difficult to compare.

                                      1. 2

                                        re: compile times, sccache and (in docker) cargo-chef work, and work well. I think they should be a safe default for anyone using rust daily.

                                        Re: the LLVM passes/steps/IR/etc etc etc. Yeah, this feels like writing an LLVM layer. I’ve only written one LLVM pass (in C) and it was pretty hacky and gnarly. Doing a whole language IR seems to be sort of something they have thought of (rust now has 2? or more intermediate representations) https://github.com/roc-lang/roc/issues/6498 but it seems like bypassing all IR and API to go straight to bitcode w/ zig is an interesting choice. I’m not up on compilers enough to know if other languages do this, but … godspeed!

                                        1. 2

                                          re: compile times, sccache and (in docker) cargo-chef work, and work well. I think they should be a safe default for anyone using rust daily.

                                          Yeah, they are great tools. I have used them all while working on roc at various points. They still don’t solve the root rebuild slowness, sadly.

                                          1. 1

                                            Yea, they are definitely more useful for app development than library development, let alone language development. Luckily Roc has a very unique corner case it’s trying to squeeze into. Given the out-of-fashion-again state of pure functional languages, I hope the zig bet works out.

                                        2. 2

                                          If I were to put my “incident” hat on for this kind of decision, it’d be interesting to know things like:

                                          • What did the authors know about Rust compile times before starting
                                          • How the compile times evolved over time, and/if how people intervened based on their knowledge, and how those interventions helped or didn’t
                                          • What risks do they anticipate after the transition to Zig (e.g. challenges in exposing APIs for third-party tooling while documenting ownership contracts, reduced safety when writing concurrent code etc) and how they plan to derisk those
                                          1. 3

                                            All really good questions. I can give some answers.

                                            What did the authors know about Rust compile times before starting

                                            Very little. When Richard started the project I’m pretty sure Rust was chosen cause he wanted to make sure he had a low level enough language to do any perf tricks necessary, but most of his background is in web backends.

                                            Many other contributors have learned rust as they started to contribute. Rust is actually one of our major friction points to getting new contributors. Many people want to contribute but find rust really confusing (yay lifetimes).

                                            How the compile times evolved over time, and/if how people intervened based on their knowledge, and how those interventions helped or didn’t

                                            They have been worked on a number of times. I never would have called them good, but we have done interventions that have significantly sped up compile times. One example is merging tons of testing executables to minimize link time. Another is switching everyone over to lld. Another is going through and removing tons of dependencies and, while doing so, making other dependencies always use the same version. There has also been a mix of work reorganizing crates at various points. Also, simply upgrading rust has made it faster on some machines.

                                            I would say the team is much, much more knowledgeable of rust now, and many interventions have helped significantly. I think a rewrite back into rust would also help significantly. I do think essentially a rewrite is required at this point. A lot of the structure of the compiler and design of the passes grew organically in a way that does not help compile times.

                                            What risks do they anticipate after the transition to Zig

                                            Absolutely none. We are going to write the compiler flawlessly in a new language with no issues at all.

                                            Clearly not that. I think a large part of our learnings from the rust compiler is that a well designed compiler just uses large arenas and simple memory strategies. As such, a lot of ownership and memory concerns basically don’t apply to the compiler. Rust made us document a ton of lifetimes that were really unimportant due to practically being global. As for reduced safety, zig has a lot of great defaults in its debug and release-safe modes. On top of that, we plan to start fuzzing the compiler (including for memory leaks) right from the beginning. Fuzzing has traditionally been problematic in our rust compiler cause we started too late. It has so many edge cases it can crash in that fuzzers essentially instantly crash.
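
                                            A toy sketch of that arena style (my own made-up node types, not Roc’s): nodes live in one flat allocation and refer to each other by index, so there are hardly any lifetimes or ownership edges to document.

                                            ```rust
                                            // All AST nodes live in one Vec; "references" between nodes
                                            // are plain indices, so the only lifetime in play is the
                                            // arena's own.
                                            enum Expr {
                                                Num(i64),
                                                Add(usize, usize), // indices of the two operands
                                            }

                                            struct Arena {
                                                nodes: Vec<Expr>,
                                            }

                                            impl Arena {
                                                fn new() -> Self {
                                                    Arena { nodes: Vec::new() }
                                                }

                                                // Allocation is a bump: push and hand back the index.
                                                fn alloc(&mut self, e: Expr) -> usize {
                                                    self.nodes.push(e);
                                                    self.nodes.len() - 1
                                                }

                                                fn eval(&self, id: usize) -> i64 {
                                                    match self.nodes[id] {
                                                        Expr::Num(n) => n,
                                                        Expr::Add(a, b) => self.eval(a) + self.eval(b),
                                                    }
                                                }
                                            }
                                            ```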

                                            Lastly, a large reason for the rewrite is that we want to rewrite most of the compiler stages focused on maintainability and correctness first. As such, they have much simpler and more consistent designs (these were all rewrites we were originally planning to do in rust), starting even with a single-threaded version of the compiler. Starting there and slowly scaling out will help us build something that is tested fully correct and slowly expanded to be fully feature-rich and performant. This, with lots of testing and fuzzing, should really help build strong anchors around correctness.

                                            Also, it helps that we are good friends with the zig folks and they are able to help guide us some.

                                          2. 1

                                            I wish there were more pragmatism in these projects, which are indeed attempts at good ideas.

                                            Roc’s design dictates that they desperately need a good set of platforms (as they call them) for users to pick from. The language is designed in such a manner that if what you want to do is not covered by the use case of a platform, then you simply can’t use the language for that purpose. But they chose to play around optimizing and even rewriting the compiler. I guess it’s fun; being an open source project, the developers are free to do whatever they want. It is just that, from a practical point of view, it is a horrible sense of priorities.

                                            1. 7

                                              Actually we already have plenty of widely-used platforms (basic-cli for CLIs, basic-webserver for servers, both of which are being used in production at work in a very small number of cases, plus roc-ray for native graphical apps, plume for generating static HTML/JS/CSS data visualizations from Roc code, roc-wasm4 for WebAssembly retro games, etc.) and the biggest piece of feedback I’ve heard from people using those platforms for building actual applications is that they’ve been running into problems that can only be fixed by changes to the compiler.

                                              We’ve talked extensively about fixing those compiler problems incrementally instead of doing a rewrite, and still ended up with this being the direction that made the most sense overall. It’s not like the idea of doing other things hadn’t occurred to us! 😄

                                              1. 5

                                                It seems a bit presumptuous to say that the developers of a language that hasn’t had a numbered release yet have horrible priorities. The language’s surface syntax isn’t even stabilized yet; it’s proposed to change substantially with the upcoming 0.1.0/rewrite version. Roc’s platform model, where effects are provided by swappable target runtimes, is the thing about it that I find most interesting, and I agree that a few cool and usable platforms beyond the basic-cli and basic-web they have now will be important for them to have to secure adoption. But programming languages and their implementations regularly have decades-long lives; exploring the fundamental possibility space of design and implementation before seeking widespread adoption is a perfectly reasonable priority.

                                              2. 0

                                                Not knowing anything about either language [1], if zig is so great then why does Roc exist?

                                                Also when I read “Roc rewrites” I automatically assume it meant roc the person, which confused me for a while.

                                                [1] I do asm, C, and lisp

                                                1. 11

                                                  Because the previous N attempts to build one universal language to rule them all, starting with Algol, failed, and it is generally believed that different use-cases need different languages.

                                                  And Roc in particular takes this insight further by explicitly designing for a two-language system, via the platform abstraction.