1. 37
    1. 14

      I have somewhat mixed feelings about Zig’s build system, at least in its current state. Roughly, as of today it is a generic two-step build system: running a build.zig produces a graph of objects in memory, which represent build steps, and that graph is subsequently executed by the built-in build runner.

      I am not a fan of this setup, because it is already quite complex and opinionated, but doesn’t provide many nifty features in return.

      It is complex because, when you need to do x, you can’t just write the code that does x. Rather, you need to write code that describes how to do x. That is, instead of

      fn do_foo() {

      You need to do something like

      var foo_step = b.addStep("foo", DoFooStep{});
      var bar_step = b.addStep("bar", DoBarStep{});

      This might be a worthwhile price to pay if the build system does something smart with the resulting graph (auto-parallelizes it, makes it incremental with early cut-off, makes it distributed, etc), but this isn’t actually happening. I think that until very recently Zig was just running the nodes in order, with a tiny check to not re-run a task twice (which is trivial to do in the fn do_foo() case by hand).
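For what it’s worth, the “don’t run a task twice” check is just memoization over the step graph, which a generic runner gets almost for free, along with an obvious place to parallelize. A minimal sketch in Python (all names hypothetical, not Zig’s actual API):

```python
# Two-step build in miniature: steps are described as data first,
# then a generic runner walks the graph, running each step once.

def run(graph, target, done=None):
    """Execute `target` and its dependencies, skipping already-run steps."""
    if done is None:
        done = set()
    if target in done:
        return done
    for dep in graph[target]["deps"]:
        run(graph, dep, done)
    graph[target]["action"]()          # the actual work happens here
    done.add(target)
    return done

log = []
graph = {
    "foo": {"deps": [],             "action": lambda: log.append("foo")},
    "bar": {"deps": ["foo"],        "action": lambda: log.append("bar")},
    "baz": {"deps": ["foo"],        "action": lambda: log.append("baz")},
    "all": {"deps": ["bar", "baz"], "action": lambda: log.append("all")},
}
run(graph, "all")
print(log)  # "foo" runs once, even though two steps depend on it
```

Independent siblings like "bar" and "baz" are exactly where a smarter runner could fan out to threads.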

      Additionally, this build system is pretty much hard-coded. I think I’d rather have a generic task runner instead, where build.zig looks like

      const std = @import("std");

      pub fn main() void {
          // optionally call into the existing declarative infra:
          // build(b);
      }

      fn build(b: std.build.Build) void {}

      That is, I would love to have the option of using the current build infra if I need the logic contained therein, while also having the option to just manually shell out to zig from main if I need something more direct.

      What is exciting about build.zig, though, is the plan to separate build graph generation from execution. Generation is done by a pure, sandboxed user process, while execution is done by a trusted host process. This adds an extra restriction: the build graph is now data, rather than a graph of closures. And that would make it possible to do various fancy things. In particular, combined with multibuilds, it would be a boon for IDE support, as the IDE would have a full, complete picture of the world.
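The “graph as data” point is what makes the sandbox split possible: plain data can cross a process boundary, while closures cannot. A hedged sketch of the idea, using a made-up wire format rather than anything Zig actually emits:

```python
import json

# If the build graph is data rather than closures, the (untrusted)
# generator can hand it across a process boundary to the trusted runner,
# and an IDE can read the very same description.
graph = [
    {"id": "compile", "cmd": ["zig", "build-obj", "main.zig"], "deps": []},
    {"id": "link",    "cmd": ["zig", "build-exe", "main.o"],   "deps": ["compile"]},
]
wire = json.dumps(graph)           # what the sandboxed generator would emit
decoded = json.loads(wire)         # what the runner / IDE reconstructs
print(decoded[1]["deps"])
```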

      Though, even in the world with Wasm build graphs, I’d still want a task runner, because that’s so much less fragile than a shell :-)

      1. 14

        IDE integration is my main motivation for the two step / declarative build system. The idea being, of course, that an IDE could understand how the build works semantically, rather than blindly executing a sub-process and hoping that it did something useful. This will allow IDEs to communicate with the compiler via an interactive protocol and do language server things across the entire build graph, and potentially across multiple builds, like you mentioned. I think I’m just repeating what you said in my own words, but suffice to say that IDE integration is exactly the point.

        A secondary motivation is security: the ability to compile something without trusting it. That’s not implemented yet, but the two step / declarative system is poised to go down that path.

        1. 7

          Yup, that makes sense and is a key fact for grokking the design. I was very puzzled that Zig is “doing Gradle”, until I saw that issue about sandboxed Wasm builds.

          Really awaiting the day when there’s an IPC boundary between the code that generates the build graph and the code that executes it. In Cargo, that was the original intended design, but the separation didn’t happen and, as a result, the IDE’s view of the code is not full-fidelity. I also worry that not having this boundary might make people accustomed to extra capabilities. Like, Rust didn’t sandbox macros, and sandboxing them now would be a tall order, because folks love their compiler talking to Postgres. Good thing that Zig has an option to keep making breaking changes!

          I guess my main worry right now is that build.zig mixes two things:

          • abstract, pure description of compilation graph, for IDEs and compilation, which is transitive (you build your deps)
          • impure tasks which do things in the current project (shelling out to gh to put out a release, running custom project-specific lints, running particular test workflows), which are not transitive (deps don’t have tasks, because that would be insecure)
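One way to picture that split: each step carries a purity flag, and only pure steps propagate from dependencies. This is a hypothetical sketch of the policy, not anything Zig implements:

```python
# Hypothetical policy: pure compilation steps are transitive (you build
# your deps), impure local tasks are not (deps don't get to run tasks).
def visible_steps(package, is_root):
    pure = [s for s in package["steps"] if s["pure"]]
    if is_root:
        return package["steps"]   # the root project keeps its impure tasks
    return pure                   # dependencies contribute only pure steps

dep = {"steps": [{"name": "build-lib",     "pure": True},
                 {"name": "release-to-gh", "pure": False}]}
print([s["name"] for s in visible_steps(dep, is_root=False)])
```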
      2. 6

        I have lots of good things to say about Zig, and lots of good things to say about Rust, but I think the homogeneity implied in their build system designs doesn’t match reality.

        For example, Rust was initially aimed at Firefox, and it’s a big mix of C++ and Rust now (not to mention lots of JS and more). It probably won’t be all Rust for a decade or two, if ever. (7-8 years after Rust 1.0 and I would guess it’s <50% Rust, though I could be wrong). Someone should write a blog post about their build system – I’m guessing it’s not Cargo.

        And I admire Zig for the great lengths it goes to preserve existing investment in code, i.e. zig cc. So it got that part of the heterogeneity problem right.

        But I saw your recent posts on Zig. So now what happens to all the Rust you wrote? Are you just never going to use any of it in a Zig program?

        That seems unrealistic to me. It’s a waste of effort in the open source ecosystem to be reimplementing the same thing over and over again in different languages [1].

        OK well I’m sure the Zig build system can invoke the Rust compiler, and I don’t know the details (having never used it). But I see this tendency to put those concerns “off the happy path” and it can break a lot of the intended design properties of the system. (Off the top of my head I’d guess that cross compilation is messed up in that case, and probably ASAN too.)

        To make a concrete suggestion instead of just criticism – Ninja works great on Windows and is apparently even bundled with Visual Studio (?). It’s dirt simple, fast, with tons of parallelism, and any language can generate it. The main thing it lacks is sandboxing to help you find missing deps, but that’s OS specific, so very few build systems have it. So there could just be a Zig generator for Ninja, which composes with other tools that generate it, for polyglot builds.
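To make the suggestion concrete: a Ninja generator really is just string emission, which is why any language can produce one. A toy sketch for a hypothetical two-file C project:

```python
# Emit a minimal build.ninja for a C project: one compile edge per source
# file, plus a link edge. File and rule names here are made up.
def ninja_for(sources):
    lines = [
        "rule cc",
        "  command = cc -c $in -o $out",
        "rule link",
        "  command = cc $in -o $out",
    ]
    objs = []
    for src in sources:
        obj = src.replace(".c", ".o")
        lines.append(f"build {obj}: cc {src}")
        objs.append(obj)
    lines.append(f"build app: link {' '.join(objs)}")
    return "\n".join(lines) + "\n"

text = ninja_for(["main.c", "util.c"])
print(text)
```

Because the output is plain text, generators from different languages can be concatenated into one polyglot build.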

        So one language isn’t privileged over another.

        I think there should be name for this very common belief – or I would even call it a wish – i.e. the Monoglot Fallacy or something like that.

        People want monoglot systems because they think in a single language, and a language often gives you important global properties. But real systems are composed of multiple languages.

        Here’s a good example that came up – the Roc language has a lineage from Elm, and its compiler is in Rust, and its runtime is in Zig.


        Why does Roc use both Rust and Zig?

        Roc’s compiler has always been written in Rust. Roc’s standard library was briefly written in Rust, but was soon rewritten in Zig.

        The split of Rust for the compiler and Zig for the standard library has worked well so far, and there are no plans to change it.

        After writing both compilers and language runtimes, this makes perfect sense to me. Rust has a lot of properties that are good for compilers, but not as good for language runtimes and garbage collectors (e.g. the need to bridge static memory management and dynamic/automatic, safe and unsafe).

        Likewise I’d say Zig is weaker at expressing a typed IR. The “data-oriented design” talk by Andrew Kelley last year was great, and I took some influence from it, but type safety is a downside (i.e. T* is type safe but an index into T[] isn’t).
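The T* vs. index point: a raw integer index into T[] will happily index into the wrong table. A thin wrapper type recovers some of the lost safety, illustrated here in Python with hypothetical NodeId/EdgeId types:

```python
from dataclasses import dataclass

# In data-oriented designs, entities become indices into arrays, and a
# raw integer carries no type information. Distinct wrapper types make
# mixing up two kinds of index a detectable error.
@dataclass(frozen=True)
class NodeId:
    raw: int

@dataclass(frozen=True)
class EdgeId:
    raw: int

nodes = ["a", "b", "c"]

def node_name(nodes, i: NodeId) -> str:
    assert isinstance(i, NodeId), "wrong kind of index"
    return nodes[i.raw]

print(node_name(nodes, NodeId(1)))   # indexing with the right kind of id
try:
    node_name(nodes, EdgeId(1))      # caught: an EdgeId is not a NodeId
except AssertionError as e:
    print(e)
```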

        My overall point is that all languages are domain-specific. Even two pieces of code as related as a compiler and its runtime can deserve different metalanguages.

        And shell is what you lets bridge domain-specific languages. I think of Make/Ninja/Bazel/Buck as kinds of shells, since they all use the process interface. And I take the point about shell being fragile – that’s mainly a versioning and deployment problem, which needs to be solved, and I have prototypes for. Actually Bazel/Buck solve that problem in their own context, to great user satisfaction, with a monorepo.

        [1] There is some irony here, but honestly in ~7 years, approximately zero people have asked me why I’m not building Oils on top of bash. That seems obvious to most people.

        1. 5

          I guess the tl;dr take is that Rust people want everything to be Rust, Zig wants everything to be in a language that can be built by its build system.

          But in 10 years we’ll have BOTH Rust and Zig (and C and C++ and Go and Swift …), glued together with shell scripts :-P

          1. 4

            Let’s consider further your example of an application that uses both Rust and Zig among other languages, glued together with some third party build system or shell scripts.

            In this example, we have a dependency tree that is partially composed of Rust packages, and partially composed of Zig packages. Presumably, most, if not all, of these dependencies are generally and independently useful. If they were only useful for the main application, then their code would simply be in the main application and not packaged separately.

            For each of these separate, independent packages, they only depend on Rust, or only on Zig. This allows an entirely different subset of contributors to collaborate on these packages, without having to deal with the environments of the other language. For example, I could be a Zig programmer on Windows, and the only thing I need to install for my workflow is Zig. That’s a real benefit which could mean the difference between whether some contributors find it convenient enough to even bother with, which in turn could mean the difference between whether certain bugs are found and fixed or not. On the other side of the fence, a Rust developer working on one of the Rust crates need not concern themselves with Zig versions, installing Zig, or even be aware that Zig exists at all. For them, using cargo alone is a real simplification.

            So, perhaps at the top level, the main application, we have some shell code gluing things together (or Rust build system could invoke zig build, or zig build system could invoke cargo). Regardless, the development processes of most of the dependency tree benefit from the language-specific package managers.

            1. 3

              I don’t really disagree with that, but I’d also say there’s just a lot more diversity in build systems in open source and inside organizations. That workflow will be one of them, but there will probably be a whole lot more.

              IMO it’s desirable to avoid the problem where you have the “happy path” for the monoglot build, and then you “fall off a cliff” and have to rewrite the whole build system when you need to integrate with other software and especially other ecosystems, which have their own inertia.

              I agree there should be a happy path, and there will be many cases where Zig is the main().

              But what about when some other tool/language is the main() – is the toolchain design modular / composable / a la carte?

              I’m not sure if Zig actually has that problem, I don’t know enough about it… If there’s a separate build graph stage that’s good, and if there’s some kind of zig build-unit that can plug into a Make/Ninja build system that’s good. And if it can export the equivalents of compile_commands.json or gcc -M build graphs, that also helps.
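For reference, compile_commands.json is a small format: a JSON array with one entry per translation unit, so emitting it costs a build tool very little. A sketch with hypothetical paths and flags:

```python
import json

# The JSON compilation database that clang-based tooling consumes:
# one {directory, file, command} entry per translation unit.
entries = [
    {"directory": "/project", "file": "main.c",
     "command": "cc -c main.c -o main.o"},
    {"directory": "/project", "file": "util.c",
     "command": "cc -c util.c -o util.o"},
]
text = json.dumps(entries, indent=2)
print(text)   # this is the whole compile_commands.json
```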

              Here are some examples of what I’m thinking about – 8 different ways to package Rust in Nix.



              I went through a lot of the same thing with Bazel at Google – they basically rewrite the build system for EVERY open source package. This “boiling the ocean” is A LOT of work – it’s millions of lines of just build configuration. (I wonder how big nixpkgs is? Are there a million lines of Nix expressions?)

              It’s also a big issue for Python and R – all those languages have their OWN build systems for shared library extensions. And they are portable to Windows too!

              It’s nice to just use the upstream tools and not rewrite the whole thing, which I have painful experience with.

              And Julia, PHP, node.js, Perl, etc. all have some variant of this issue. It would be nice to be able to easily use Zig to enhance those ecosystems, in a piecemeal way, without “framework shear”.

              The Bazel stuff doesn’t even count Android or Chrome, which have entirely separate and huge build systems:

              Integrating Rust into Android: https://security.googleblog.com/2021/05/integrating-rust-into-android-open.html

              Rust seems to have realized this problem pretty late: https://users.rust-lang.org/t/rust-in-large-organizations-meeting/32059

              I don’t have a link handy, but I imagine that integrating Rust into the Linux kernel, with its custom build system, is also a similar issue. Basically every big project has its own build system that it wants to use.

              I could definitely have some misconceptions since I’m only looking at things from afar, but I watched the recent “GoTo” talk on Zig (which was very good BTW), and my impression was that the build system is meant as a big “onramp” to Zig.

              But to me that is the case when Zig is the main(), and C/C++ are libraries you want to reuse.

              I’d also be interested in the story when Zig isn’t the main() – that seems like an equally important on-ramp. Are there any docs about that?

              Like say if SerenityOS wants to use it, which apparently has a huge CMake build system:


              So I think there should be a balance. It also seemed to me that the build side of things could take up a very large portion of the project’s effort – i.e. how much work is writing a package manager / build system vs. writing the compiler + designing the language?

              I think most people think the latter is the real hard problem, but my experience is that the former is also pretty hard :) It’s hard in a different way – there are so many things to integrate with that you end up with a never-ending pile of work.

              There is a combinatorial explosion of

              • (3+ desktop operating systems + embedded systems) times …
              • (languages with their own build tools that want to use Zig + polyglot repos with custom build systems like Nix/Bazel) times …
              • (languages that Zig could use)

              It’s a big nightmare, and potential never-ending time sink, and I’d say that stabilizing a small set of modular, polyglot Unix-like tools can go a long way, in addition to having something more monolithic.

              1. 2

                Thanks for the broad enumeration of specifics around problems I’m currently interested in.

                Layered use of languages is to be expected. The Linux OS bindings are in C and a common C++ idiom is to provide ABI stability by an hourglass design. So, you’re likely to end up with C++ calling C calling C++ calling C. As soon as you refactor functionality to Zig or Rust, you have C++ calling C calling Rust calling C and you need a quality polyglot build system.

                Building ecosystems like package managers and build systems is hard if it needs broad coverage of existing software with existing design decisions made by other people. So, I’d say that another genuinely hard part is getting buy-in and growing enough community velocity to keep pace with:

                1. The rate of new software
                2. For the platforms relevant for that software
                3. With the design decisions like build systems or languages that software is maintained with
                4. Under the constraints of those use cases (HPC vs web vs military)

                To this end, my own projects focus on components that I see as meeting the minimum requirements for composing with, or leveraging, the existing ecosystem work that has maximal adoption. You mentioned Bazel, which has the largest community in the polyglot build space, and you’d want a popular package manager that composes with it.

                Small, modular, bootstrappable, auditable standard tooling would be idyllic, but choosing harder tech problems to avoid people problems is more pragmatic in my view.

                1. 1

                  Yes, the rate of new software is something I think about … if you’re going to invent a solution that requires “boiling the ocean” to be useful, it should be boilable at a rate that exceeds the growth in new software :-)

                  Otherwise you’re just contributing to a combinatorial mess. That is, burning open source contributor cycles on parochial variants of problems that have already been solved.

                  And interestingly, Docker was the most recent thing in the build space that managed to do that. People make Docker images for OCaml binaries, etc.

                  And that’s because it didn’t require rewriting all your build tools and package managers. You just use apt-get, pip, whatever.

                  You lose a lot of good properties by doing that, and there are lots of downsides, but I think we can do better than Docker while still retaining that property.

        2. 5

          So there could just be a Zig generator for Ninja, which composes with other tools that generate it, for polyglot builds.

          A generated Ninja build would not be a thin waist, because it is lossy: the generated build lacks the semantic, language-specific structure of the project. You can’t feed a Ninja build into an IDE and have refactors work. I think the sweet spot is to generate a static description of dependencies using language-specific concepts (e.g. the crate graph for Rust and the class path for Java), and then use that static model either to drive the build, or to lower it to something like Ninja. Which I think is exactly the plan with Zig’s two-phase build.
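To illustrate the “static model first, then lower” idea: the language-level model (here a toy crate graph) stays the source of truth for IDEs, and the Ninja file is derived from it. Rule and file names below are made up:

```python
# A tiny semantic model: crates and their dependencies. An IDE reads
# this; the executable build is *lowered* from it, not the other way round.
crates = {
    "utils": {"deps": []},
    "app":   {"deps": ["utils"]},
}

def lower_to_ninja(crates):
    lines = ["rule rustc", "  command = rustc --crate-name $name $in -o $out"]
    for name, info in crates.items():
        deps = " ".join(f"lib{d}.rlib" for d in info["deps"])
        lines.append(f"build lib{name}.rlib: rustc {name}.rs {deps}".rstrip())
    return "\n".join(lines)

text = lower_to_ninja(crates)
print(text)
```

Nothing semantic is lost: refactoring tools consult `crates`, while Ninja only ever sees the lowered edges.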

          1. 2

            I don’t think Ninja and IDE are mutually exclusive … That was just a suggestion I threw out there, to have something constructive instead of just criticism. It could be something else, but Ninja is very flexible / Unix-y / a la carte and will give you a lot of leverage.

            I think a narrow waist protocol like the language-server is needed to solve the (M languages) x (N operating systems) explosion in build systems (and package managers).

            It’s bad enough if every language implements its own dependency graph and solver – but even worse when you start importing foreign code like C/C++, and need their dependency graphs. Then you end up basically going down the Nix/Bazel path – boiling the ocean.

            Narrow waists are composed of other ones – e.g. I noted that the LSP is built on top of JSON-RPC, which is built on top of JSON.
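Concretely, LSP frames each JSON-RPC message with a Content-Length header, HTTP-style, so the whole waist is bytes, a length, and JSON:

```python
import json

# LSP base-protocol framing: a Content-Length header, a blank line,
# then the JSON-RPC body.
def frame(msg):
    body = json.dumps(msg)
    return f"Content-Length: {len(body)}\r\n\r\n{body}"

wire = frame({"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {}})
print(wire.split("\r\n")[0])   # just the header line
```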

            So you can build on top of Ninja. For many use cases, you also likely want the gcc -M format that has evolved, mentioned here:


            Yes it seems ugly but it’s also better than doing combinatorial amounts of work.
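The gcc -M output is Makefile-style “target: prerequisites” lines with backslash continuations, so consuming it takes only a few lines. The sample input here is made up:

```python
# Parse a gcc -M style dependency file: join continuation lines,
# then split "target: prerequisites".
depfile = "main.o: main.c \\\n  util.h config.h\n"

def parse_depfile(text):
    text = text.replace("\\\n", " ")       # join continuation lines
    target, _, deps = text.partition(":")
    return target.strip(), deps.split()

target, deps = parse_depfile(depfile)
print(target, deps)
```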

            Similar discussion from a few days ago: https://lobste.rs/s/b0fkuh/build_faster_with_buck2_our_open_source#c_chn1fu

            I think language build systems and package managers are trying to be OSes, and that’s not a good thing. For example, dynamic linking is an operating system feature more than a language feature, and works differently on Linux, OS X, and Windows.

            FWIW Steve Yegge has been working on the polyglot IDE problem on and off for ~15 years.

            He started the code search platform inside Google, which basically instrumented a bunch of compilers to emit a common protobuf data format for symbols.

            For jump to definition, and find all uses across C++/Java/Python.

            It would run Bazel on the entire monorepo (i.e. a unified, hand-boiled dependency graph) at every version of the repo (i.e. a task that requires a distributed system running on thousands of cores). I’m pretty sure it must have evolved some incremental computation system, though I don’t know the details.

            It worked across generated protocol buffer code too. (Google largely uses the cross-language type system of protocol buffers, more so than the types of C++/Java/Python.)

            It was open sourced as this project, but maybe it’s dormant since he left Google a long time ago?



            The company SourceGraph was started by somebody that used code search at Google. Now Steve is a leader there working on the same thing – but for an even harder problem, which is IDE-like information without a consistent global version number.


            i.e. lots of little git repos, rather than a monorepo.

            So anyway, I guess my overall point is that the language server solved an (M editors) x (N languages) problem (although, sure, it’s a big compromise compared to, say, JetBrains).

            And likewise I think there should be some way to avoid doing combinatorial amounts of build system work to extract information useful to an IDE. It will probably involve some broader collaboration / protocols / spelunking inside other language ecosystems.

            I never looked too closely at the Kythe/SourceGraph stuff, but I’d be surprised if there isn’t something to learn from there.

            1. 1

              Then you end up basically going down the Nix/Bazel path – boiling the ocean.

              One observation here is that Nix did boil the ocean. nixpkgs contains many more packages than traditional unixy repositories.


              To me, it signals that the problem is not so much diversity, as unstructuredness.

        3. 1

          Why are you not building Oils on top of bash?

          1. 2

            :-) Basically because the bash codebase has run out of steam, and the maintainer has pretty much said that. It’s from 1989 or so, and if you compare it to CPython from 1990, it’s much more brittle and has less room to grow.

            I think after 3 decades nobody really has qualms about a rewrite – it’s more the Zawinski “CADT” syndrome that we want to avoid.

            CPython has a pretty healthy ecosystem of interpreter forks and improvements, which eventually get merged back. bash has no such thing; it’s pretty much the maintainer, and he exports all his private changes to VCS without descriptions, rather than using VCS the normal way.

            (That said, dgsh is one of very few projects that have built on top of bash. Though it started as an academic project and I’m not sure if many people use it.)

            Another big fundamental reason is that Oils has a completely different architecture – it’s written like a programming language, not a terminal-based UI with a language bolted on afterward.

            In the last 3 decades this architecture has become more or less standard – the parser can be reused for linters and formatters, so you can have things like ShellCheck / shfmt, but also extended to YSH/Oil, etc. That is impossible with bash’s architecture.

            I wrote a lot about parsing shell ( https://www.oilshell.org/blog/tags.html?tag=parsing-shell#parsing-shell ), but the runtime is also a big issue, e.g. with respect to handling every error.

            In other words, it’s a total overhaul of every part of shell!

      3. 2

        This might be a worthwhile price to pay if the build system does something smart with the resulting graph (auto-parallelizes it, makes it incremental with early cut-off, makes it distributed, etc), but this isn’t actually happening.

        If you haven’t seen what’s in latest master branch, I have some very good news about this :^) Zig build now runs tasks in parallel and the new API lets you define expected maxRSS for each task, so that the runner doesn’t overload the system.
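A maxRSS budget turns scheduling into a bin-packing-style problem: only start a task if its declared peak memory fits in what’s left. A deliberately serial sketch of that policy (not the actual Zig build runner; all names and numbers are made up):

```python
# Group tasks into batches so that the declared peak RSS of each batch
# stays within a total memory budget; batches would run concurrently.
def schedule(tasks, budget):
    batches = []
    batch, used = [], 0
    for name, rss in tasks:
        if used + rss > budget and batch:
            batches.append(batch)       # current batch is full: flush it
            batch, used = [], 0
        batch.append(name)
        used += rss
    if batch:
        batches.append(batch)
    return batches

tasks = [("compile-a", 3), ("compile-b", 4), ("link", 6)]
print(schedule(tasks, budget=8))   # each batch fits within the budget of 8
```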

        I have an M1 Mac Studio and it takes 6 minutes to run zig build test -Dskip-non-native (of the self-hosted compiler). This is the usage graph: https://ibb.co/FW9kpxT

    2. [Comment removed by author]