1. 86
  1. 19

    We have just published the Zig Roadmap 2023 talk that Andrew gave at the recent Zig meetup in Milan. It talks about what’s next for the self-hosted compiler. (spoilers: tons of speedups)

    https://youtu.be/AqDdWEiSwMM

    1. 13

      stage1 (release) build stage2 (debug) (LLVM backend):

      • wall clock: 47.80 seconds
      • peak rss: 8.6 GiB

      stage2 (release) build stage3 (debug) (LLVM backend):

      • wall clock: 43.26 seconds
      • peak rss: 2.3 GiB

      That’s a nice reduction in memory usage.

      Gives me hope that I’ll be able to build even large projects on my laptop, something I struggle to do with Rust.
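To put the two measurements quoted above side by side, here is a small arithmetic sketch. The variable and function names are made up for illustration; only the four figures come from the comment.

```python
# Figures quoted in the comment above (wall clock in seconds, peak RSS in GiB).
stage1_wall_s, stage1_rss_gib = 47.80, 8.6  # stage1 (release) building stage2 (debug)
stage2_wall_s, stage2_rss_gib = 43.26, 2.3  # stage2 (release) building stage3 (debug)


def percent_reduction(before: float, after: float) -> float:
    """Return how much smaller `after` is than `before`, in percent."""
    return (before - after) / before * 100.0


wall_saving = percent_reduction(stage1_wall_s, stage2_wall_s)
rss_saving = percent_reduction(stage1_rss_gib, stage2_rss_gib)
rss_ratio = stage1_rss_gib / stage2_rss_gib

print(f"wall clock: {wall_saving:.1f}% shorter")
print(f"peak rss:   {rss_saving:.1f}% lower ({rss_ratio:.1f}x less memory)")
```

On these numbers the wall-clock gain is under 10%, while peak memory drops by nearly three quarters, which matches the comment's emphasis on memory.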

      1. 10

        Those are some fantastic times. Not many languages have compilers that can be built that fast!

        1. 15

          The big wins are still yet to come! If you’re curious to learn more, I went over these details in the video linked by @kristoff.

      2. 7

        How will you ensure that you can still build zig from sources in the future?

        1. 27

          By forever maintaining two implementations of the compiler - one in C, one in Zig. This way you will always be able to bootstrap from source in three steps:

          1. Use system C compiler to build C implementation from source. We call this stage1. stage1 is only capable of outputting C code.
          2. Use stage1 to build the Zig implementation to .c code. Use system C compiler to build from this .c code. We call this stage2.
          3. Use stage2 to build the Zig implementation again. The output is our final zig binary to ship to the user. At this point, if you build the Zig implementation again, you get back the same binary.
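The three steps above can be sketched as a toy model. This is not how the Zig build works internally. It assumes that a compiler's output depends only on what that compiler implements, not on how the compiler itself was built. The names `Binary`, `system_cc`, `run_compiler`, `c-impl`, and `zig-impl` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binary:
    implements: str  # which compiler source this binary was built from
    built_via: str   # which code generator produced this binary


def system_cc(source: str) -> Binary:
    """Stand-in for the system C compiler, the bootstrap seed."""
    return Binary(implements=source, built_via="system-cc")


def run_compiler(compiler: Binary, source: str) -> Binary:
    """Build `source` using `compiler` (toy model)."""
    if compiler.implements == "c-impl":
        # The C implementation only emits C code, so the system C
        # compiler has to finish the job.
        return system_cc(source)
    if compiler.implements == "zig-impl":
        # The Zig implementation generates the final binary itself.
        return Binary(implements=source, built_via="zig-backend")
    raise ValueError(f"unknown compiler: {compiler.implements}")


stage1 = system_cc("c-impl")               # step 1
stage2 = run_compiler(stage1, "zig-impl")  # step 2
stage3 = run_compiler(stage2, "zig-impl")  # step 3: the binary to ship
stage4 = run_compiler(stage3, "zig-impl")  # building once more

print(stage2 != stage3)  # stage2 went through C, stage3 did not
print(stage3 == stage4)  # fixed point: rebuilding changes nothing
```

In this model stage2 and stage3 behave identically as compilers but are different binaries, and the output stops changing from stage3 onward.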

          https://github.com/ziglang/zig-bootstrap

          1. 7

            I’m curious, is there some reason you don’t instead write a backend for the Zig implementation of the compiler to output C code? That seems like it would be easier than maintaining an entirely separate compiler. What am I missing?

            1. 2

              That is the current plan as far as I’m aware

              1. 1

                The above post says they wanted two separate compilers, one written in C and one in Zig. I’m wondering why they don’t just have one compiler written in Zig that can also output C code as a target. Have it compile itself to C, zip up the C code, and now you have a bootstrap compiler that can build on any system with a C compiler.

                1. 2

                  In the above linked Zig Roadmap video, Andrew explains that their current plan is halfway between what you are saying and what was said above. They plan to have the Zig compiler output ‘ugly’ C, then they will manually clean up those C files and version control them, and as they add new features to the Zig source, they will port those features to the C codebase.

                  1. 2

                    I just watched this talk and learned a bit more. It does seem like the plan is to use the C backend to compile the Zig compiler to C. What interests me though is there will be a manual cleanup process and then two separate codebases will be maintained. I’m curious why an auto-generated C compiler wouldn’t be good enough for bootstrapping without manual cleanup.

                    1. 7

                      Generated source code usually isn’t considered to be acceptable from an auditing/chain of trust point of view. Don’t expect the C code generated by the Zig compiler’s C backend to be normal readable C, expect something closer to minified js in style but without the minification aspect. Downloading a tarball of such generated C source should be considered equivalent to downloading an opaque binary to start the bootstrapping process.

                      Being able to trust a compiler toolchain is extremely important from a security perspective, and the Zig project believes that this extra work is worth it.

                      1. 2

                        That makes a lot of sense! Thank you for the clear and detailed response :)

                      2. 2

                        It would work fine, but it wouldn’t be legitimate as a bootstrappable build because the build would rely on a big auto-generated artifact. An auto-generated artifact isn’t source code. The question is: what do you need to build Zig, other than source code?

                        It could be reasonable to write and maintain a relatively simple Zig interpreter that’s just good enough to run the Zig compiler, if the interpreter is written in a language that builds cleanly from C… like Lua, or JavaScript using Fabrice Bellard’s QuickJS.

                        1. 1

                          Except that you can’t bootstrap C, so you’re back where you started?

                          1. 2

                            The issue is not to be completely free of all bootstrap seeds. The issue is to avoid making new ones. C is the most widely accepted and practical bootstrap target. What do you think is a better alternative?

                            1. 1

                              C isn’t necessarily a bad choice today, but I think it needs to be explicitly acknowledged in this kind of discussion. C isn’t better at being bootstrapped than Zig, many just happen to have chosen it in their seed.

                              A C compiler written in Zig or Rust to allow bootstrapping old code without encouraging new C code to be written could be a great project, for example.

                              1. 5

                                This is in fact being worked on: https://github.com/Vexu/arocc

                  2. 1

                    Or do it the way Go does. To bootstrap, you need to:

                    1. Build Go 1.4 (the last one made in C)
                    2. Build the latest Go using the compiler from step 1
                    3. Build the latest Go using the compiler from step 2

                  3. 3

                    Build the Zig compiler to Wasm, then run it to cross-compile the new compiler. Wasm is forever.

                    1. 11

                      I certainly hope that’s true, but in reality Wasm has existed for 5 years and C has existed for 50.

                      1. 2

                        The issue is building from maintained source code with a widely accepted bootstrapping base, like a C compiler.

                        The Zig plan is to compile the compiler to C using its own C backend, once, and then refactor that output into something to maintain as source code. This compiler would only need to have the C backend.

                        1. 1

                          I mean, if it is, then it should have the time to grow some much needed features.

                          https://dl.acm.org/doi/10.1145/3426422.3426978

                        2. 1

                          It’s okay if you don’t know because it’s not your language, but is this how Go works? I know there’s some kind of C bootstrap involved.

                          1. 4

                            The Go compiler used to be written in C. Around 1.4 they switched to a Go compiler written in Go. If you were setting up an entirely new platform (and not using cross-compiling), I believe the recommended steps are still: get a C compiler working, build Go 1.4, then update from 1.4 to latest.
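The chain described above can be written down as a small dependency walk. This is a sketch under assumptions: the table only encodes what the comment says (C compiler, then Go 1.4, then latest), and the labels `go-latest`, `go-1.4`, `c-compiler`, and `selfhost-only` are placeholders rather than real package names.

```python
# Hypothetical table: each toolchain maps to the toolchain needed to build it.
# None marks the bootstrap seed, i.e. something we accept as already present.
BUILT_WITH = {
    "go-latest": "go-1.4",
    "go-1.4": "c-compiler",
    "c-compiler": None,
}


def bootstrap_chain(target, built_with):
    """Return the build order from the seed up to `target`."""
    chain = []
    seen = set()
    current = target
    while current is not None:
        if current in seen:
            # A toolchain that needs itself, directly or indirectly,
            # can never be reached from a seed.
            raise ValueError(f"no seed reachable: cycle at {current!r}")
        seen.add(current)
        chain.append(current)
        current = built_with[current]
    chain.reverse()
    return chain


print(" -> ".join(bootstrap_chain("go-latest", BUILT_WITH)))

# A compiler that can only be built by itself has no path back to a seed.
try:
    bootstrap_chain("selfhost-only", {"selfhost-only": "selfhost-only"})
except ValueError as err:
    print(err)
```

The second call shows why a purely self-hosted compiler needs either an older implementation in another language or a prebuilt binary somewhere in its history.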

                        3. 2

                          How do we build C compilers from source?

                          1. 3

                            Bootstrapping a C compiler is usually much easier than bootstrapping a chain of some-other-language compilers.

                            1. 4

                              Only if you accept a C compiler in your bootstrap seed and don’t accept a compiler for some other language in your seed.

                              1. 3

                                Theoretically. But from a practical point of view? Yes, there are systems like Redox (Rust), but in most cases the C compiler is an inevitable piece of the puzzle (the bootstrapping chain) when building an operating system. And in such cases, I would (when focused on simplicity) rather prefer a language that depends just on C (that I already have) instead of a sequence of previous versions of its own compilers. (and I say that as someone, who does most of his work in Java – which is terrible from the bootstrapping point-of-view)

                                However, I do not object much to depending on previous versions of your compiler. It is often the way to go: you want to write your compiler in a higher-level language instead of old-school C, and because you are creating a language and believe in its qualities, you also use it to write the compiler. What I do not understand is why someone (not in this particular case; I have seen this pattern many times before) presents “self-hosted” as an advantage…

                                1. 2

                                  The self-hosted Zig compiler provides much faster compile times and is easier to hack on, allowing language development to move forward. In theory the same gains could be achieved in a different language, but some of the optimizations used are exactly the kind of thing Zig is good at. See this talk for some examples: https://media.handmade-seattle.com/practical-data-oriented-design/.

                                2. 1

                                  But you could make a C compiler (or a C interpreter) from scratch relatively easily.

                          2. 4

                            What did it mean for the zig self hosted compiler to be self hosted if it wasn’t previously able to build itself?

                            (I couldn’t figure out how to make this not sound snarky, but it’s really not. I’m not a compiler/PL person.)

                            1. 3

                              It means previously it was a compiler written in Zig, which accepted some subset of Zig programs, which included many Zig programs but not itself.
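One way to picture this answer is as a subset check. The feature names below are invented placeholders and say nothing about which Zig features were actually missing. The sketch only shows how a compiler can handle many programs while still being unable to compile its own source.

```python
def can_compile(supported: set, program_uses: set) -> bool:
    """A program compiles only if every feature it uses is supported."""
    return program_uses <= supported


# Hypothetical feature sets, for illustration only.
early_compiler_supports = {"feature_a", "feature_b", "feature_c"}
later_compiler_supports = early_compiler_supports | {"feature_d"}

some_user_program_uses = {"feature_a", "feature_b"}
compiler_source_uses = {"feature_a", "feature_b", "feature_c", "feature_d"}

# Early on: many programs compile, but not the compiler itself.
print(can_compile(early_compiler_supports, some_user_program_uses))
print(can_compile(early_compiler_supports, compiler_source_uses))

# Later: the supported subset grows to cover the compiler's own source.
print(can_compile(later_compiler_supports, compiler_source_uses))
```

The compiler can build itself once the supported set covers everything its own source uses.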

                              1. 1

                                Thanks!

                            2. 3

                              Really good work getting put into Zig

                              1. 3

                                Take off every Zig for great glory!

                                1. 1

                                  I don’t get it. :D Is this a reference to something?

                                2. 1

                                  Excellent stuff. Zig and Odin are two languages that I’m keeping a very close eye on.

                                  1. 1

                                    Congrats! I have enjoyed watching this project progress. I don’t (currently) have a use case for the language, but I love to hear about all of the cool design choices Zig has made and all of the success it has had.