1. 53
  1. 30

    Some great quotes throughout the article, like this one:

    There’s no like Linus for Bazel [because] ultimately at the end of the day [Bazel] is some people trying to get promoted at Google.

    The problem [with Bazel] is it’s like been like seven years [and] Bazel is getting better, but it’s a Google open-source product. And Google’s basically like: ‘Hey, just do it our way and you’re probably an idiot, and you don’t have our scale. Your problems are kind of - like any intern at Google can sketch a solution to your problems. They’re not real problems’ and that’s not true.

    And this:

    You need people who are who can think on their own. If you have people who are used to, you know, copy-paste build engineering from StackOverflow, it might not be optimal.

    1. 6

      I met some people at Uber who were trying to adopt Bazel (the only ones I’ve ever met outside of Google), and they had come to Uber from Google… so there’s that.

    2. 25

      Author here. I wanted to wrap my head around when Bazel is an excellent build solution, so I interviewed six people with a lot of Bazel experience and picked their brains.

      This somewhat long article hopes to answer questions about Bazel for future people in a position similar to mine. If you already know Bazel well you might not learn much, but if you’ve only vaguely heard of it and aren’t sure when it’s the tool to reach for, I’m hoping this will help.

      1. 17

        When Bazel works, it’s bliss, and the development cycle is fast. Caching is nice too.

        However, when you need to make changes to your pipeline and your pipeline breaks, it’s the worst kind of hell. Bazel uses Starlark, so editor support is poor. You can’t really script in Starlark; you’re forced to implement interop with bash/Python/your favourite scripting language via command-line arguments or inputs/outputs. So when your script breaks, you’re left debugging multiple layers of abstraction. I had to deal with Starlark layered on top of a Go extension on top of Python scripts executing bash scripts at the bottom. Not fun.

        And then you have gazelle, or whatever BUILD-file generation machinery. Because Bazel is meant to be declarative, you have to use those if you need more control over your project. So on our Haskell project we’re using gazelle to configure Bazel to call Haskell Stack to generate cabal files, and in the end generate BUILD files from the cabal files so the Haskell rules can run GHC. This whole cake is very brittle. In the end the build is a bit faster than with just the Haskell toolchain, but there’s a lot of overhead when changing dependencies, and it’s hard to avoid. And then we have simple bash targets to start a database or do some automation, and for some reason Bazel quite often redownloads and recompiles the whole Go toolchain just to run a bash script. That last bit is probably a misconfiguration on our side, but debugging that isn’t easy in Bazel, especially when it reinvents a bunch of abstractions.

        My conclusion is pretty much the same as in the article - YMMV. Bazel solves a lot of problems, but the learning curve is steep and you are opening a new can of worms.

        1. 3

          I have a bunch of issues with Bazel, but Starlark isn’t one of them. It’s basically a cut-down version of Python, so any Python tooling support can help. As for it not being Turing-complete, that’s on purpose, to make it decidable. By making the language primitive recursive, Bazel can actually analyse the work that needs to be done; if it were more general, it couldn’t. It’s a feature, not a bug. The only thing that irks me is that sometimes sets would be nice to have.
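
          For flavour, here’s roughly what a small Starlark helper looks like (a sketch; it happens to be valid Python too, which is why Python editor support mostly carries over, and the dict trick stands in for the missing sets):

          def dedup(names):
              # Starlark has no set type; a dict comprehension is the usual
              # stand-in, since dict keys are unique and preserve order.
              return {n: True for n in names}.keys()

          def collect_deps(dep_lists):
              # Only bounded for-loops over existing collections, no while
              # loops or recursion: that keeps evaluation primitive recursive,
              # so the analysis phase is guaranteed to terminate.
              out = []
              for deps in dep_lists:
                  for d in deps:
                      out.append(d)
              return dedup(out)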

        2. 7

          I had a coworker ask me to explain some Make macro yesterday, and like.. it’s pure Greek. I wrote it! I wrote it as simple as I could make it, I really tried to make it comprehensible, but it’s still something you have to stare at for a bit to make sense of.

          I don’t mind that, because.. well, I don’t know, I don’t mind having to stare at things sometimes. What I get back out of Make makes the occasional need to stare worthwhile.

          But I wish there was something that had what Make has inside it - plus distributed caching - with a surface interface that was more friendly to beginners.

          I was bummed - and continue to be bummed - that Bazel wasn’t that.

          1. 6

            I feel like Bazel is soooooo cool in principle, but the thing that would be even more amazing is if someone extracted the sandboxing so that people could like… just write Python build scripts with caching.

            Ultimately the sandboxing is magic, the caching is pretty amazing, but Starlark and a lot of Bazel restrictions are built around people having so much stuff that even reading a bunch of config files is costly. But there are loads of people who have “reasonably”-sized codebases…. but have relatively simple needs.
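
            Roughly the kind of thing I mean (a toy sketch with hypothetical names; it skips the sandboxing, which is the genuinely hard part Bazel solves):

            import hashlib
            import json
            import os
            import shutil
            import subprocess

            CACHE_DIR = ".build-cache"  # hypothetical location

            def cached_step(inputs, output, cmd):
                # Key the step on the command line plus the content of every
                # declared input; any change yields a different cache key.
                h = hashlib.sha256(json.dumps(cmd).encode())
                for path in sorted(inputs):
                    with open(path, "rb") as f:
                        h.update(f.read())
                key = os.path.join(CACHE_DIR, h.hexdigest())
                if os.path.exists(key):
                    shutil.copy(key, output)  # cache hit: reuse the stored artifact
                    return
                subprocess.run(cmd, check=True)  # cache miss: actually run the step
                os.makedirs(CACHE_DIR, exist_ok=True)
                shutil.copy(output, key)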

            We used Bazel at $FORMERJOB to implement test caching stuff. Cutting the CI bill 60-70% (and of course improving throughput) was amazing! Things like “readme updates cause a full CI run” disappear, but without having to hack your way to that.

            1. 3

              Tup might fit what you’re looking for if you don’t mind writing your build scripts with Lua instead of Python.

            2. 2

              I’m not sure about this article. It sounds like a sales pitch. It doesn’t mention any shortcomings of Bazel (not counting the “steep learning curve” argument), and I don’t really trust any article that’s one-sided. Example: the C++ tips for migration. What is the argument here? “Bazel is easier to use for us.” Easier than what? What did they use before? How many projects are there? What are the operating systems? What IDEs do the developers use? How does the CI work? I don’t see anything other than a “Bazel is good” argument.

              1. 11

                The article doesn’t take a strong stance, but it has some good quotes. To distill it a bit, I’d say:

                • Bazel is quite good if your org writes a lot of C++ (like Google), and somewhat less good for Java
                • It’s pretty bad for Python and JavaScript, as mentioned in the article
                • Bazel is smoother if you write and structure all your own code! Which I don’t think most companies do these days. That is, it has the “rewriting upstream” problem, which Nix shares to an extent. You get some nice guarantees for your rewriting, but it’s also an ongoing source of what seems like busy work (quotes in the article about this)
                • A related problem that Bazel and Nix share is “It doesn’t work with data/tools from language X ecosystem; therefore we will write code generators to bridge the gap”. YMMV on the code generators, but they often add a lot of complexity and reduce debuggability. It can inhibit users from forming a mental model and they’re left copying and pasting. (aside: personally I like Bazel’s Python-based language Starlark, though again this has nothing to do with whether you’d want to use it to build Python apps)
                • In case it weren’t clear from the above, Bazel is really for big companies with a monorepo. I don’t think it is useful for open source, which is more heterogeneous and lacks global versioning.

                I mentioned some of these issues in the past (drawn from many years of using it, admittedly a long time ago, but the issues raised in the article are extremely familiar):

                https://lobste.rs/s/ypwgwp/tvix_we_are_rewriting_nix#c_9wnxyf

                https://lobste.rs/s/virbxa/papers_i_love_gg

                1. 15

                  The longer I’m in this business, the more immediately skeptical I am of any attempt to bring HugeCo tools/practices/processes out into any other environment. They are no doubt well-adapted to their particular niches in their original homes, but they also are highly evolved to solve problems which tend to only exist in their original homes, in ways that only work in their original homes.

                  This extends to Bazel (which I’ve had promoted at me and managed to avoid), to k8s, to monorepos, to interview practices, to just tons and tons of things that just do not seem to work well at all once transplanted out of the specific places where they evolved.

                  1. 4

                    One thing that I think is relevant to the Python + JS thing (which I had to struggle through)… is that Python + JS tooling itself is just super bad at being hermetic by default. For example, when you install Python scripts through pip in a virtualenv, it rewrites the shebangs to point to the “right” python.

                    This is good in one sense, but it means that suddenly those scripts aren’t portable!
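
                    You can see it by printing the first line of any console script pip installed into a venv (the path here is just an example):

                    # Print the shebang pip wrote into a console script: it is an
                    # absolute path into this specific venv, something like
                    # #!/home/alice/proj/.venv/bin/python3
                    with open(".venv/bin/pip") as f:
                        print(f.readline(), end="")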

                    I think that this is the core thing, really. Can I take the build artifact and plop it elsewhere and have it still work? Lots of tooling does not handle that use case well, and it makes stuff like Bazel harder.

                    There is a trick with Bazel, at least, and it’s to get around the hermeticity by just adding stuff to $PATH. What we were doing for a while was making an artifact off of our requirements.txt, but actually managing the python instance outside of Bazel. Same for Dockerfile stuff. Just throw it in there so you get cache invalidation, but don’t try to build off of it.

                    1. 4

                      Yeah the open source tools and Bazel just clash in every conceivable way. I’d also say that Bazel makes Python and JS feel more like Java and C++, which sort of defeats the purpose of using those languages.

                      A big mismatch is that both Python and JS have dynamic dependencies (unsurprisingly for dynamic languages :) ) and Bazel is very much about static, pre-declared dependencies.
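
                      A toy example of what I mean: nothing in the source says which module will be needed until the program runs, so there’s nothing for Bazel to pre-declare:

                      import importlib
                      import os

                      def load_backend():
                          # Which module gets imported depends on the environment at
                          # runtime; static analysis cannot enumerate the dependencies.
                          name = os.environ.get("APP_DB", "sqlite3")  # hypothetical config knob
                          return importlib.import_module(name)

                      db = load_backend()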

                      I believe it’s especially bad for JS in the browser, where you want to iterate on UI details.

                      Although I guess a lot of JS is more static now, with all those big build tools. But I’d also say that the history is suggestive: Bazel was developed for C++, and most other languages, especially dynamic ones, are grafted on (including R build rules). And Google doesn’t use any of the JS tools developed since node.js was created in 2009 (not that people have good things to say about those either – build systems and software reuse are hard). Bazel basically creates a whole separate world that people aren’t used to, and you need big problems to justify “paying for” that world

                      1. 3

                        I’d also say that Bazel makes Python and JS feel more like Java and C++, which sort of defeats the purpose of using those languages.

                        When I worked at Google, they used blaze, and I guess bazel is a successor to that, or something. Anyway this checks out. My first thought when I used blaze was “wow, y’all went and added a long edit-compile-debug cycle to Python, just like C++ has. Why?”

                        I kind of get the reasoning behind it, but most folks aren’t really trying to solve the same problems at the same scale as Google. I haven’t heard anyone telling the software industry that “you ain’t Google and you probably wouldn’t want to be if you knew what was good for you”. But I think that’s a good message.

                        1. 1

                          Although I guess a lot of JS is more static now, with all those big build tools

                          Yes. ECMAScript modules and bundlers have made it so that JS dependencies are much more often statically analysable now. Dynamic require is discouraged, and there is no synchronous dynamic import at all (though the async import() form is fine).

                      2. 1

                        And it mentions that it’s not good for libraries; it’s meant for artifact builds.

                      3. 7

                        If anything, it’s supposed to be a counter-sales-pitch, as Earthly, the startup that runs the blog, offers an alternative build tool to Bazel. Despite that, I think the article did a great job of staying neutral and laying out both sides of the coin.

                        I don’t think you could do Bazel (or similar tools) justice in one blog article. There are just a ton of nuances, aspects, and dimensions to consider. Not only is the tool complicated, your business logic and requirements are as well. I often show folks how to measure whether they need it or not, rather than blindly recommending it as the silver bullet for everyone.

                      4. 1

                        Slightly off topic, but does anyone have a suggestion for a build system that doesn’t suck? For our CHERI microcontroller project we have a CMake build system, but we need to be able to build compartments and libraries, and then combine them into firmware images that also define a set of threads. Most of these concepts are new to CMake, and its extension mechanism is pretty much nonexistent. I want to isolate users from all of the details of the build, so they just specify the set of source files for a library (and maybe some extra flags if they need them).

                        I tried rewriting it in xmake. This was a bit better. We could pass thread descriptions through as Lua objects with a decent set of properties and add rules for building compartments and libraries. There were a few annoyances though:

                        • 90% of what I wanted to do involved undocumented interfaces (the docs are really lacking for xmake)
                        • xmake really likes building in the source directory, which is unacceptable in a secure build environment (source will be mounted read only, the only writeable FS is a separate build FS) and you have to fight it every step if you don’t want to do this.
                        • I hit a lot of bugs. A clean rebuild always tries to link the firmware image before it has linked some of the compartments that it depends on and I have no idea why. Specifying the build directory usually works but then xmake sometimes forgets and starts scattering things in the source tree again.
                        • Overall, it feels like a 0.2 release and I’m not sure I’d want to handle problems users will have with it.

                        That sounds really negative, but I liked a lot about xmake, and I’d probably be very happy with the project in a couple of years; it just isn’t there yet. For example, the build process in xmake is a map-and-fold sequence for every target (apply some transform to every input independently, then apply a transform to all of the sources). There is no doc with high-level concepts explaining this; you need to figure it out yourself.

                        1. 2

                          The concept of build systems is sucky. Have you tried Nix?

                          1. 2

                            Given that we want to build firmware images on Windows, Linux, FreeBSD and Mac hosts, it’s not clear how Nix would help.

                          2. 1

                            Have you looked at SCons? It’s written in Python, and so are the build scripts, so it’s quite extensible. It has also been around a while (well over 10 years).

                            1. 1

                              I know you’ve tried build2 and ran into some rough edges, but if you want to give it another try, I would be happy to sketch some things out for you (our documentation story, especially when it comes to the lower-level parts, is similarly lacking). Based on your description, I would first start with the higher-level ad hoc recipes/rules (these are like improved make recipes/pattern rules) and see how far I can get with that before considering re-implementing things as a build system module. Here is some introductory material for that if you want to take a look:

                              https://build2.org/release/0.13.0.xhtml#adhoc-recipe

                              https://build2.org/release/0.14.0.xhtml#adhoc-rules

                              https://build2.org/release/0.15.0.xhtml#dyndep

                              1. 1

                                I was put off Build2 for this for two reasons:

                                First, we have a lot of custom things. We need to construct a linker script based on the set of things in the final image, we need to modify the build of our loader and scheduler based on the number of threads added, and we need a structured mechanism for passing thread metadata from a consumer to the build rule for our firmware. xmake’s use of Lua is really nice for all of these:

                                • We can use a rich string library to construct the things that go in the linker script.
                                • We can pass Lua objects from the consumer to our rule (the xmake UI for doing this could be improved).
                                • We can inspect one target from a dependent one and modify it.

                                I believe Build2 would require us to write C++ code for all of these. Lua is a much nicer choice.

                                Second, and this might be my misunderstanding, Build2 feels a lot more bottom-up in its approach. Part of this means that it’s hard for me to see how you do context-dependent builds. For example, when we compile a C or C++ source file, it needs to know the name of the library or compartment that it will end up in, so we don’t want a generic .c / .cc rule, we want per-target compile rules. This is exactly the structure that xmake exposes as a compositional abstraction: there are compile rules, but they expose extension points per target. We need to change how a file is compiled depending on the kind of target that it’s ending up in, and I couldn’t find anything in Build2 that looked even vaguely like this shape.

                                My ideal system would give users the ability to write something like:

                                # Somehow we need to tell it to pick up our build infrastructure.
                                include("cherimcu")
                                
                                # Okay, now we want a library built from a couple of source files.
                                library(helpers, [ "foo.c", "bar.cc" ])
                                # And a couple of isolated compartments.  Real projects will have a load more of these.
                                compartment(example, [ "example.cc" ])
                                compartment(example2, [ "example2.cc" ])
                                
                                # And now we want to assemble it into a firmware image, with a load of other metadata.
                                firmware(myDeviceFirmware,
                                         [ helpers, example, example2 ],
                                         "threads" = [
                                           {
                                             stackSize : 0x400,
                                             priority : 1,
                                             entryPoint : "entry",
                                             compartment : "example"
                                           },
                                           {
                                             stackSize : 0x200,
                                             priority : 32,
                                             entryPoint : "entry_point",
                                             compartment : "example2"
                                           }
                                         ])
                                

                                With xmake, I’m pretty close to this. I’m not sure that I could get Build2 to this state without a lot of custom C++ (and, even though C++ is normally my go-to language, I’d prefer a scripting language like Lua for this; I’d prefer a strongly typed scripting language where files and paths were first-class types even more).

                                1. 1

                                  I believe Build2 would require us to write C++ code for all of these. Lua is a much nicer choice.

                                  Not necessarily. While build2’s language is probably not as powerful as Lua (yet), we do provide quite a few string/path/regex functions. It also has types (bool, [u]int64, string, [dir_]path) and lists of those. Things like arithmetic are a bit ugly at the moment, but doable. There are also a lot of things inside that are not yet exposed to the buildfile language but could be, should there be a need. For example, internally we have sets, key-value maps, etc. (all those things are available to build system modules written in C++, so you can define a variable of, say, type std::map<std::string, std::string> and users will be able to access/modify it from the buildfile).

                                  Second, and this might be my misunderstanding, Build2 feels a lot more bottom-up in its approach. Part of this means that it’s hard for me to see how you do context-dependent builds. For example, when we compile a C or C++ source file, it needs to know the name of the library or compartment that it will end up in, so we don’t want a generic .c / .cc rule, we want per-target compile rules. This is exactly the structure that xmake exposes as a compositional abstraction: there are compile rules, but they expose extension points per target. We need to change how a file is compiled depending on the kind of target that it’s ending up in, and I couldn’t find anything in Build2 that looked even vaguely like this shape.

                                  This would be pretty easy to do manually (just mark each source file with a target-specific variable that contains the compartment) but doing this sort of back-propagation automatically would require implementing your rules in C++ (where this is definitely doable).

                                  Another idea that we are considering is a macro facility. This would likely allow you to approximate your desired language of “magic incantations” pretty closely. But I am still on the fence about this.

                                  I’m not sure that I could get Build2 to this state without a lot of custom C++

                                  I think it would be interesting to try to prototype something without C++ (that is, using ad hoc pattern rules) and see how close we can get. I would be happy to sketch something out, just need to get a better understanding of the build steps/outputs.

                                  But I think in the long run, what you are trying to do (i.e., polished build system support for an unusual development environment) is a great fit for writing your rules in C++ and packaging them as the build system module. There will be few things that you won’t be able to handle optimally.

                                  1. 1

                                    Thanks for the details. Some of your understanding about build2 is correct but there is also quite a bit that is off (likely due to the lack of documentation). I am going to respond and clarify those points but before (or while) I do that, could you sketch the command lines that would be used to build the example you have shown if I were to do it manually from the shell? I want to see if I can sketch a prototype based on that.

                                    1. 1

                                      Compiling a C/C++ file is more or less the same as in a normal build (in xmake and CMake, we use their existing infrastructure), but we need to add a flag that tells the compiler which compartment the file is in. In xmake, we default this to the name of the target that it’s being compiled for. In CMake we have an add_compartment function that takes the name of the compartment as an argument and will then create a real target and set the properties on it.

                                      We then do an initial linking step for each compartment. This uses a custom linker script and produces a .compartment / .library file, linked with an extra linker flag telling the linker that it’s linking a compartment (and so it doesn’t do a complete link).

                                      The final firmware link step combines a load of compartments / libraries and a couple of .o files that we’ve built separately. There are a few things that make this fun:

                                      • When we define a firmware target, we specify the threads. These are used to create some -D arguments to the compiler for a couple of other targets. In CMake, we add those targets in the add_firmware function, in xmake we modify the targets in the after_load step of the rule that the firmware uses.
                                      • The final firmware link uses a linker script that is generated from the set of other targets, so needs to do some substitutions on a text file with a load of string processing.

                                      Aside from that, it’s a fairly normal linker invocation. The main thing is that we need to communicate the threads to this step.

                                      1. 1

                                        Thanks, I think I am starting to get the picture. A few clarifying questions:

                                        When we define a firmware target, we specify the threads. These are used to create some -D arguments to the compiler for a couple of other targets.

                                        I assume those couple of targets are from the corresponding compartment?

                                        Generally, it looks like there is a 1:1 relationship between threads and compartments. If that’s correct, would it be more natural to specify the thread information on the compartment, especially seeing that you need it when compiling the compartment’s source code? Something along these lines:

                                        compartment(example,
                                                    [ "example.cc" ],
                                                    "thread" = {
                                                        stackSize : 0x400,
                                                        priority : 1,
                                                        entryPoint : "entry",
                                                    })
                                        
                                        compartment(example2,
                                                    [ "example2.cc" ],
                                                    "thread" = {
                                                        stackSize : 0x200,
                                                        priority : 32,
                                                        entryPoint : "entry_point",
                                                    })
                                        
                                        firmware(myDeviceFirmware, [ helpers, example, example2 ])
                                        

                                        Or am I missing some details here?

                                        1. 1

                                          No, threads are orthogonal to compartments. A compartment defines code and globals; a thread is a scheduled entity that owns a stack and can invoke compartments. Each thread starts executing in one compartment but can invoke others. Two threads can start in the same compartment.

                                          A firmware image is built out of a set of compartments and libraries (libraries do not own state and so provide code that can be simultaneously in multiple security contexts), a few core components, and a set of threads.

                                          We specialise the loader and scheduler with the definitions of the threads. We don’t allow dynamic thread creation (we’re targeting systems with 64-512 KiB of RAM) and so we want to pre-allocate all of the data structures for the thread state.

                                          1. 1

                                            Ok, this makes sense but then I am confused by your earlier statement:

                                            When we define a firmware target, we specify the threads. These are used to create some -D arguments to the compiler for a couple of other targets.

                                            If there are only compartments and libraries and both are independent of threads, which targets does this statement refer to?

                                            1. 1

                                              The scheduler and loader. In the CMake and xmake versions each of these are separate targets. The scheduler is built like a compartment (it’s basically untrusted). The loader is just built as a pair of .o files (assembly stub that calls into C++ to do most of the work).

                                              To clarify: the compartments provided by the user are independent of the threads; the scheduler is not (threads cannot start in the scheduler, so there’s no circular dependency here). Users don’t provide the targets for the scheduler and loader; they are created by the build system for the firmware. They don’t need to be separate targets if there’s a way of doing it better in build2.

                                            2. 1

                                              Ok, here is my initial take: https://github.com/build2/cherimcu

                                              It only uses Buildscript rules (no C++) but doesn’t yet cover threads (still waiting on some clarifications in the sibling reply). Let me know if this looks potentially interesting, in which case I will develop it a bit further.

                                              1. 1

                                                Thanks. I think it’s a bit harder to use than the xmake version currently. This is my writeup of using xmake for the same project. The snippet listed there is the user code to create our simple tutorial example, which has two compartments that communicate, allocate and free some memory, and talk to a UART.

                                                Your example has this:

                                                compart{example}: objc{example}: cxx{example.cc}
                                                compart{example}: objc{details}: cxx{details.cc}
                                                objc{example details}: compartment = example
                                                

                                                I think the last line is setting the compartment name (it took me a while to realise that objc meant ‘object file for a compartment’ and not ‘Objective-C’: see my previous comment about the terse syntax of build2). I guess the lines above are a chain of rules: you compile a C++ file, you get an object file (? or is this doing something special), and link them into a compartment. If I forget the last line, then there’s a dynamic check in the rule.

                                                In contrast, the xmake version is correct by construction:

                                                compartment("example")
                                                    add_files("example.cc", "details.cc")
                                                

                                                You can’t do anything with the files unless you add them to a compartment (or a library).

                                                The xmake interface for specifying the threads is less nice (and I think build2 might be nicer here). These two arrays of objects are used to set three pre-defined macros:

                                                • The number of threads.
                                                • A C++ array literal with the sizes of the stacks and trusted stacks.
                                                • A C++ array literal with the names of the compartment entry points (mangled, but with an encoding which is basically {prefix}{identifier length}{identifier}{suffix}; see the sketch below).
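
                                                Roughly this, as a toy sketch (checked against the -D flags I paste further down: __export_example__Z11entry_pointv is entry_point() in the example compartment):

                                                def export_symbol(compartment, entry_point):
                                                    # {prefix}{identifier length}{identifier}{suffix}: an Itanium-style
                                                    # _Z<len><name>v (a function taking no arguments), prefixed with
                                                    # the compartment name.
                                                    return "__export_{}__Z{}{}v".format(compartment, len(entry_point), entry_point)

                                                assert export_symbol("example", "entry_point") == "__export_example__Z11entry_pointv"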

                                                I actually like the xmake version of this but I have a lot more confidence in build2 than xmake as a project that will be maintained long-term for users to depend on.

                                                1. 1

                                                  I think the last line is setting the compartment name. I guess the lines above are a chain of rules: you compile a C++ file, you get an object file (? or is this doing something special), and link them into a compartment. If I forget the last line, then there’s a dynamic check in the rule.

                                                  Yes, that’s all correct. At the core this is (improved) make, with targets, prerequisites, pattern rules, etc.

                                                  it took me a while to realise that objc meant ‘object file for a compartment’ and not ‘Objective-C’: see my previous comment about the terse syntax of build2

                                                  Yeah, good point. Those are actually all custom names (in build/cherimcu.build) and we can rename them to something like compartment_object_file if you prefer ;-) (but also see the next point).

                                                  In contrast, the xmake version is correct by construction

                                                  I don’t know if you’ve noticed, but at the end of that example I’ve shown what it could look like if we re-implemented the compartment/firmware linking rules in C++, which would allow us to synthesize intermediate dependencies (so we don’t need to manually spell out the obj*{} stuff) and back-propagate some variables (so we don’t need to manually set the compartment name):

                                                  compart{example}: cxx{example.cc details.cc}
                                                  

                                                  The xmake interface for specifying the threads is less nice (and I think build2 might be nicer here)

                                                  My current idea is to represent threads as non-file-based targets (similar to PHONY targets in make) which will allow us to “connect” them (via the dependency relationships) to multiple things, namely the firmware and the scheduler/loader:

                                                  firmware{mydevice}: compart{example example2} library{helpers} thread{example example2 uart}
                                                  
                                                  thread{example}:
                                                  {
                                                    compartment = "example",
                                                    priority = 1,
                                                    entry_point = "entry_point",
                                                    stack_size = 0x400,
                                                    trusted_stack_frames = 2
                                                  }
                                                  
                                                  thread{example2}:
                                                  {
                                                    compartment = "example2",
                                                    priority = 2,
                                                    entry_point = "entry_point",
                                                    stack_size = 0x400,
                                                    trusted_stack_frames = 2
                                                  }
                                                  
                                                  thread{uart}:
                                                  {
                                                    compartment = "uart",
                                                    priority = 31,
                                                    entry_point = "entry_point",
                                                    stack_size = 0x400,
                                                    trusted_stack_frames = 2
                                                  }
                                                  
                                                  1. 1

                                                    I don’t know if you’ve noticed, but at the end of that example I’ve shown what it could look like if we re-implemented the compartment/firmware linking rules in C++, which would allow us to synthesize intermediate dependencies (so we don’t need to manually spell out the obj*{} stuff) and back-propagate some variables (so we don’t need to manually set the compartment name):

                                                    compart{example}: cxx{example.cc details.cc}

                                                    That looks like the kind of thing that I’d like to end up with. Is the ability to dynamically create targets the only thing missing to be able to do this in the script? xmake also has this limitation, which I work around by having a description-scope function called firmware that actually expands to the definition of three targets (the scheduler, the loader, and the firmware that depends on the first two). This is a bit of a hack but it does give the UI that I want.

                                                    My current idea is to represent threads as non-file-based targets (similar to PHONY targets in make) which will allow us to “connect” them (via the dependency relationships) to multiple things, namely the firmware and the scheduler/loader:

                                                    Yes, that’s the sort of shape that I’d like. I don’t really like the repetition. Why do I need to tell the firmware build rule which targets are compartments, which are threads, and which are libraries? Don’t the targets know this already? Or is this because the rules define separate namespaces (which, I guess, is how I can have a compartment and a thread both called example)?

                                                    1. 1

                                                      Is the ability to dynamically create targets the only thing missing to be able to do this in the script?

                                                      Yes, that plus the ability to back-propagate values. Generally, with a C++ rule you can access the underlying build model directly, and there are very few limitations on what kind of “synthesis” you can do (as long as you keep things race-free). With script rules we just expose the most generally applicable functionality.

                                                      If you think this looks promising, I can add the C++ rule prototype (we can start directly in cherimcu.build and see if we want to factor this to a build system module later).

                                                      Why do I need to tell the firmware build rule which targets are compartments, which are threads, and which are libraries? Don’t the targets know this already? Or is this because the rules define separate namespaces […]

                                                      Yes, in build2 the target identity is the directory, type, and name, where the type is an abstraction of the file extension. So the path /tmp/foo.1 in build2 becomes the target /tmp/man1{foo} (here man1 is the target type). Besides helping with the “what is the extension of an executable” type of problem, this also allows us to have non-file-based targets without having to resort to hacks like the .PHONY marker in make.

                                                      1. 1

                                                        If you think this looks promising, I can add the C++ rule prototype (we can start directly in cherimcu.build and see if we want to factor this to a build system module later).

                                                        That would be very interesting.

                                                        The thing that I’m currently looking at adding to the xmake build is separating out SoC descriptions. These need to be able to specify compile flags that are propagated to all of the targets in a dependency chain, and some things describing the memory map to go into the final linker script. I could make these JSON, but the lack of hex encoding for integers is a bit annoying there since they’re full of memory addresses. I could make them Lua literals for xmake - is there some nice alternative for build2?

                                                        1. 1

                                                          That would be very interesting.

                                                          Ok, I will try to find some time in the next couple of days.

                                                          I could make these JSON, but the lack of hex encoding for integers is a bit annoying there since they’re full of memory addresses. I could make them Lua literals for xmake - is there some nice alternative for build2?

                                                          We have int64 and uint64 types and while currently there is no way to specify their values in hex, that would be trivial to add.

                                                          But I am wondering why you need them to be recognized as integers at all if they are just being passed along. Unless you are doing some arithmetic on them?

                                                          1. 1

                                                            But I am wondering why you need them to be recognized as integers at all if they are just being passed along. Unless you are doing some arithmetic on them?

                                                            I’d like, at the very least, to do some sanity checking on them; I’d also like to be able to express MMIO regions as either (base, top) or (base, length) and translate between them.

                                                            Note that they don’t actually need to be 64-bit values for us, we have a 32-bit address space (which I really want to reduce to 28 bits).
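
                                                            For instance, something along these lines (a sketch, with hypothetical names):

                                                            def mmio_region(base, top=None, length=None):
                                                                # Accept either (base, top) or (base, length) and normalise to
                                                                # (base, length), sanity-checking against a 32-bit address space.
                                                                if (top is None) == (length is None):
                                                                    raise ValueError("specify exactly one of top/length")
                                                                if top is None:
                                                                    top = base + length
                                                                if not (0 <= base < top <= 1 << 32):
                                                                    raise ValueError("region out of range")
                                                                return (base, top - base)

                                                            # e.g. mmio_region(0x40000000, length=0x1000) == mmio_region(0x40000000, top=0x40001000)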

                                                            1. 1

                                                              Ok, I’ve done another pass over the prototype and it now includes thread information and loader/scheduler generation: https://github.com/build2/cherimcu/

                                                              Note that I haven’t gone with the C++ implementation yet. Instead, I’ve decided to see how far I can take it with script rules by adding some missing features to build2 (like the hex notation for integers). So if you want to try it, then you will need to use the staged version of build2 until 0.16.0 is out in a couple of months: https://build2.org/community.xhtml#stage

                                                              You’ve mentioned that you would prefer the loader/scheduler not to be separate targets. I’ve done it this way (they are generated on the fly during the firmware linking) but I think ideally you would want to make them separate synthesized targets so that you avoid unnecessarily recompiling them if the relevant information hasn’t changed. But that, again, would only be possible with the C++ implementation.

                                                              Another thing I would like to point out is that this implementation does fairly accurate change tracking (which is one of the main design goals of build2). For example, if you change the thread stack size, then the relevant (and only the relevant) parts will be updated.

                                                              Let me know what you think. I would still like to add the C++ version, time permitting.

                                                              1. 1

                                                                Thanks, that looks very nice. The xmake version is now at a state that I’m happy with. The maintainer just added support for dynamically creating targets for us, which lets us clone a default loader target and add defines from the board description file. Since my last message, we’ve added board descriptions, which are JSON files containing the memory map for the target (used to generate the linker script) and macro definitions that must be set from the board. Xmake has built-in support for parsing JSON, so the user can just specify either a path to their own board definition or the name of one that we ship and we can pull all of these things out. Dynamic target creation means that we can have a single build file that builds the same firmware for two boards, without conflicts. I think that isn’t necessary in your approach because we don’t share the loader target between different firmware images in your version.

                                                                For thread descriptions, we need to produce a pair of predefined macros containing initialiser lists that each contain a subset of the information (with some permutations: we transform the compartment name and thread entry point into the mangled name of our export table symbol). I think these would probably be easier in C++.

                                                                I hope that we will open source it in the next few weeks. I have a bit more confidence in Build2 as a long-term solution, and I bet I could extend the xmake build system to produce the build2 files for most cases (everything that you need is available for introspection in xmake), so there’s a nice migration path.

                                                                Recompiling the loader doesn’t bother me too much. It is one of our slowest files to build (it uses C++ templates to do a lot of compile time checks for correctness) but even then it’s only about a second and a complete clean build is well under 2 seconds for a fairly complex example (one where we start to worry about running out of SRAM for code). I’m much more worried about reproducible builds than speed, and I think Build2 has a strong story there.

                                                                1. 1

                                                                  Xmake has built-in support for parsing JSON […]

                                                                  I can see what you are doing here ;-).

                                                                  Seriously, though, while we don’t have the buildfile-level json type yet, we do have the built-in JSON parser/serializer available to C++-based implementations. So a rule written in C++ could load a JSON file.

                                                                  Another option would be to ditch JSON and just represent this data as a buildfile target, similar to thread, provided it’s not too deeply structured. This way you get the type-checked hex notation for integers. Also, if you are using xmake as the meta build system, I am sure you could generate these from JSON.

                                                                  For thread descriptions, we need to produce a pair of predefined macros containing initialiser lists that each contain a subset of the information (with some permutations: we transform the compartment name and thread entry point into the mangled name of our export table symbol). I think these would probably be easier in C++.

                                                                  I think you should be able to achieve this even with the script rules. Currently they just generate translation units with this information spliced in. Producing instead a list of macro definitions doesn’t feel like a major leap in complexity. Any reason you don’t just generate a header or source file instead of wrangling with escaping macro values on the command line?

                                                                  1. 1

                                                                    Seriously, though, while we don’t have the buildfile-level json type yet, we do have the built-in JSON parser/serializer available to C++-based implementations. So a rule written in C++ could load a JSON file.

                                                                    A big part of the reason that we wanted JSON is that xmake and CMake can both parse it, so these files can be completely independent of the build system. If xmake turns out to be a mistake, we can teach CMake to consume these files and do the right thing. I assumed that, even if build2 didn’t have native support, linking something like nlohmann/json into a C++ plugin would be pretty trivial (and not cost us much since we’d probably want one anyway).

                                                                    I think you should be able to achieve this even with the script rules. Currently they just generate translation units with this information spliced in. Producing instead a list of macro definitions doesn’t feel like a major leap in complexity. Any reason you don’t just generate a header or source file instead of wrangling with escaping macro values on the command line?

                                                                    We’d need a different header file per firmware image and so we’d have to pass a command-line parameter telling it where to find the file. We could do that, if it were significantly easier, but you don’t really need much escaping: -D"..." or "-D..." both work fine with clang (and gcc, though we only have LLVM support at the moment), and xmake handles the escaping for me anyway so all I need to do is create a string and pass it to xmake. I’m a bit surprised if this is hard with build2: correctly escaping command-line arguments seems like a pretty essential feature for a build system (and mostly doesn’t matter if the build system isn’t invoking tools via a shell, since execve takes an array of arguments, not an escaped string).

                                                                    To give you a concrete example, this is what one of our examples generates:

                                                                    -DCONFIG_THREADS={{1,1,1024,2},{2,2,1024,2},{3,31,1024,2},} "-DCONFIG_THREADS_ENTRYPOINTS={la_abs(__export_example__Z11entry_pointv),la_abs(__export_example2__Z11entry_pointv),la_abs(__export_uart__Z11entry_pointv),}" -DCONFIG_THREADS_NUM=3
                                                                    

                                                                    The first one is used by the loader when setting up the threads and contains the thread numbers (monotonic integers; we don’t actually need these, since they are the index), thread priorities (used by the scheduler to set up initial thread state), stack sizes (used by the loader to allocate the stack), and trusted stack depths (used by the loader to allocate the trusted stack). The second of these uses la_abs, the macro for emitting a relocation for an absolute address (not a capability) for a global; it is generated from the compartment names (example, example2, and uart) and the entry point names (all of them are called entry_point, because naming things is hard). These are used by the loader to set up the initial program counter and global capabilities for the threads. The last one is the number of threads. The loader and the scheduler both need different subsets of this information (we should probably split it a bit better at some point, but it’s not urgent).

                                                                    We use the first of these in a constexpr array, which lets us do fun things like add up the total space required for all stacks, trusted stacks, and register-save areas (using sizeof on some structures in C++) in a constexpr function that is then used to define a global that has this much space, in a special section so that our linker script puts it in the right place. Unfortunately, la_abs is not constexpr currently and so we split the entry points into a separate array.

                                                                    CONFIG_THREADS_NUM is mostly used in assembly, we just have a static_assert in C++ that it matches the sizes of the other two arrays.
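
                                                                    Generating these is mechanical; here’s a rough Python rendering of what our build does with the thread descriptions (field names are made up):

                                                                    threads = [
                                                                        {"compartment": "example",  "entry": "entry_point", "priority": 1,  "stack": 0x400, "frames": 2},
                                                                        {"compartment": "example2", "entry": "entry_point", "priority": 2,  "stack": 0x400, "frames": 2},
                                                                        {"compartment": "uart",     "entry": "entry_point", "priority": 31, "stack": 0x400, "frames": 2},
                                                                    ]

                                                                    # {thread number, priority, stack size, trusted stack frames} per thread.
                                                                    config = "{" + "".join(
                                                                        "{{{},{},{},{}}},".format(i + 1, t["priority"], t["stack"], t["frames"])
                                                                        for i, t in enumerate(threads)) + "}"

                                                                    # la_abs(...) around the mangled export-table symbol for each entry point.
                                                                    entries = "{" + "".join(
                                                                        "la_abs(__export_{}__Z{}{}v),".format(t["compartment"], len(t["entry"]), t["entry"])
                                                                        for t in threads) + "}"

                                                                    print("-DCONFIG_THREADS=" + config)
                                                                    print('"-DCONFIG_THREADS_ENTRYPOINTS=' + entries + '"')
                                                                    print("-DCONFIG_THREADS_NUM={}".format(len(threads)))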

                                                                    I’m not sure that putting these in a header file would save us any complexity and it would add some (we had a header file when we wrote these by hand, removing it simplified the code).

                                                                    1. 1

                                                                      I assumed that, even if build2 didn’t have native support, linking something like nlohmann/json into a C++ plugin would be pretty trivial (and not cost us much since we’d probably want one anyway).

                                                                      Theoretically, yes, though for now we recommend that build system modules don’t have any external dependencies for robustness reasons (imagine if two different build system modules require different versions of nlohmann-json and some poor user ended up using both in the same build).

                                                                      I’m a bit surprised if this is hard with build2: correctly escaping command-line arguments seems like a pretty essential feature for a build system […].

                                                                      Yes, escaping/quoting of arguments when calling exec or equivalent is of course handled automatically. I think I was remembering all the cases where we wanted to pass a string literal as a macro:

                                                                      cxx.poptions += -DBUILD2_INSTALL_LIB=\"$regex.replace($install.resolve($install.lib), '\\', '\\\\')\"
                                                                      

                                                                      But your macros look fairly benign. For comparison, this is what my prototype generates:

                                                                      #include <cstdint>
                                                                      
                                                                      struct thread
                                                                      {
                                                                        const char*    compartment;
                                                                        std::uint64_t  priority;
                                                                        const char*  (*entry_point) ();
                                                                        std::uint64_t  stack_size;
                                                                        std::uint64_t  trusted_stack_frames;
                                                                      };
                                                                      
                                                                      extern "C" const char* entry_point ();
                                                                      extern "C" const char* entry_point2 ();
                                                                      
                                                                      thread threads[2] = {
                                                                        {"example", 1, &entry_point, 0x00000400, 2},
                                                                        {"example2", 2, &entry_point2, 0x00004000, 3},
                                                                      };
                                                                      
                                          2. 1

                                            Can you expand on why you ended up with this design? It seems to be a circular dependency: a thread depends on the code executed in it, and through your “-D arguments” the code depends on the thread.

                                            I can imagine something like “the code needs to know if it gets called every 10ms or 50ms to measure time”. In our projects (proprietary automotive), we avoid such code and rather pay for the overhead of computing durations at runtime.

                                            1. 1

                                              There’s no circular dependency. Threads depend on the compartment that they start in; the scheduler depends on knowing the number of threads; the loader depends on knowing the number of threads, their entry points, and the sizes of their stacks. We could remove the loader’s dependency here and make it dynamic, but we get a bit better code density from compile-time specialisation (not very important: the loader erases itself after it runs and returns the memory to the heap allocator, so there’s little need to make the loader small), and we want to specialise the scheduler’s data structures with the number of threads and the number of priority levels used.

                                  2. 1

                                    I’ve been evaluating Bazel lately, because I’d like to untangle the mess that is the current build system in my team (which is a bunch of Jenkins jobs triggering other jobs and expecting them to write to a predefined path).

                                    My intuition is that I’d like something similar to nix flakes, defining the builds for artifacts for each repo, and then having a way to depend on other repos artifacts, with some lockfile-like version management.

                                    My intuition also tells me I don’t want to rewrite the entire build process to be inside of nix (which would also make my application require nix at runtime).

                                    I get the feeling from this that Bazel is not a great fit, because I’ll have the same problem of rewriting everything, and because it seems very oriented toward monorepos; using it in a multi-repo setup seems to go against the grain.

                                    1. 1

                                      which would also make my application require nix at runtime

                                      Would it? IME you can tarball a nix closure and unpack it into a chroot and the programs run just fine without any of the nix command line tools.

                                      1. 2

                                        Mmmm, true, what I said doesn’t hold in general.

                                          I’ll have to try it out to figure out exactly why (or whether) it doesn’t work. The current process creates a Python virtualenv and installs dependencies in it, and might also depend on some installed Ubuntu packages. The only way to make it produce an artifact without replacing half of this with Nix packages is to run Nix in impure mode (but that might be a solution).

                                        1. 1

                                          I was assuming you completely nix-ify the whole build, so your output is a nix derivation. Then taking the closure of that should work, assuming it refers to all its dependencies directly and never e.g. via $PATH.

                                    2. 1

                                        I think that Bazel is great, and I’m using it on a team of <10 developers to manage a Python monorepo. One thing that’s incredibly undervalued about Bazel is “one command to rule them all”. There’s no bespoke bunch of different scripts you need to run, no smattering of Makefiles and bash scripts. It’s just “bazel”. That’s what attracted me to Bazel in the first place, more than its hermetic and reproducible properties. With a single “bazel run :deploy” command you can deploy any of our services: it will grab the necessary deps, package them in a container, and deploy them via IaC.

                                        I evaluated quite a few other tools which could provide that “one command to rule them all” experience. Make is the tried-and-true option, but it very rapidly becomes difficult to manage for larger projects, especially in a monorepo. Buck and Bazel effectively share a lineage, but Bazel has more traction. Pants can’t seem to figure out what it wants to be, and it is a massive pain to work with in my experience.

                                      Overall, I wholeheartedly recommend Bazel to any team building in a monorepo.

                                      1. 1

                                        One issue is ossification. I have seen Bazel setups which are inflexible and cannot be forward-ported to updated architectures. Properly done hermetic Bazel setups are extremely tough to change; it’s not like Nix or Portage, where bumping hashes is enough to do updates.