1. 34
  1.  

  2. 5

    If you’ve read this far and are thinking “If Redo is so great, how come it hasn’t taken over?”

    I want a build system that can help me with the following:

    • build variants – e.g. don’t do “make clean” between debug / release / ASAN builds. GNU make has some tortured support for this, but it requires you to use it correctly, and I haven’t seen many people do it (I think it’s VPATH or something). AFAIK redo doesn’t help you with this any more than GNU make does – probably less.
    • Writing correct incremental builds, e.g. specifying all your dependencies. I don’t want a tradeoff between fast builds and correct builds. GNU make doesn’t help you with this enough, and neither does redo AFAIK.
    • Correct parallel builds. I think apenwarr’s redo does better here, but the original redo doesn’t. But I haven’t seen a real convincing demonstration of that in any case. I think the declarative graph model helps you analyze bottlenecks in your build (critical paths), but redo’s non-declarative model may thwart that to some extent.

    In other words the whole point of a build system is to let you write something that’s fast and correct (which is not easy), and neither GNU make nor redo helps enough with that.


    FWIW I sort of made fun of the Android build system circa 2015 here for using GNU Make (they no longer use it):

    https://lobste.rs/s/znrsap/tech_notes_success_failure_ninja#c_l04v6d

    But I didn’t mention that it was a very parallelizable build. It was much more parallelizable than Firefox’s build as I recall. Android was building much more, but it could saturate 32 cores easily. Other build systems do not do this – you have to work to get a really parallelizable build, and that’s what I want a build system to help me with.

    As I mentioned there, I’m probably going to switch the build to Ninja + a shell or Python generator. This recent post was very cool:

    https://lobste.rs/s/952gdv/merkle_trees_build_systems

    I think that architecture has a lot of potential for hard build problems in the future. Some context on Oil’s problem here: https://github.com/oilshell/oil/issues/756


    In other words I think redo is fine and cool but it doesn’t help enough with hard build problems. You could make an analogy to Forth. Sure it’s more elegant for certain problems, but if it falls down for harder problems, then it’s not a net win. It has its place, but it’s not surprising that it’s not more popular.

    1. 3

      build variants – e.g. don’t do “make clean” between debug / release / ASAN builds. GNU make has some tortured support for this, but it requires you to use it correctly, and I haven’t seen many people do it (I think it’s VPATH or something). AFAIK redo doesn’t help you with this any more than GNU make does – probably less.

      The easiest way to do this is to have something like

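      # FORCE makes the recipe run on every build; the file itself is
      # rewritten only when the value changes. (.PHONY ignores patterns.)
      .PRECIOUS: %.var
      %.var: FORCE
      	@echo "$($*)" | cmp -s - "$@" || echo "$($*)" > $@
      FORCE:
      
      %.o: %.c CC.var CFLAGS.var
      	$(CC) $(CFLAGS) -c -o $@ $<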
      

      This will 100% work, but it takes some discipline to specify all used variables as dependencies.
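The core of the trick above is "rewrite the stamp file only when the value changes", so that the file's mtime moves only on a real change. A minimal Python sketch of that one step, for anyone generating their build from a script instead; the function name `update_stamp` and the file name `CFLAGS.var` are made up for illustration:

```python
import os
import tempfile


def update_stamp(path: str, value: str) -> bool:
    """Write value to path only if the stored content differs.

    Returns True if the file was (re)written and False if it was left
    untouched, so the file's mtime changes only when the value changes.
    """
    content = value + "\n"
    try:
        with open(path, "r", encoding="utf-8") as f:
            if f.read() == content:
                return False  # unchanged: keep the old mtime
    except FileNotFoundError:
        pass  # first build: fall through and create the stamp
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        stamp = os.path.join(d, "CFLAGS.var")  # hypothetical stamp file
        print(update_stamp(stamp, "-O2 -g"))                  # first write
        print(update_stamp(stamp, "-O2 -g"))                  # same value
        print(update_stamp(stamp, "-O0 -fsanitize=address"))  # changed value
```

Anything that lists the stamp file as a dependency is then rebuilt only when the flags actually change.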

      Writing correct incremental builds, e.g. specifying all your dependencies. I don’t want a tradeoff between fast builds and correct builds. GNU make doesn’t help you with this enough, and neither does redo AFAIK.

      Yeah, to do something like this in make you have to do something like

      OBJS := # ...
      DEPS := $(OBJS:.o=.d)
      
      %.o: %.c
      	$(CC) $(CFLAGS) -MMD -c -o $@ $<
      
      -include $(DEPS)
      

      but of course this only works for C files. For a more comprehensive treatment of this problem, have a look at tup. It puts an overlay FUSE filesystem on your source directory to track what objects use what files. One downside may be that its limited syntax makes it less useful as an all-in-one config/build tool than make is. If you use tup, you may end up needing to use something like autotools/cmake/meson to generate your tupfiles.

      Correct parallel builds. I think apenwarr’s redo does better here, but the original redo doesn’t. But I haven’t seen a real convincing demonstration of that in any case. I think the declarative graph model helps you analyze bottlenecks in your build (critical paths), but redo’s non-declarative model may thwart that to some extent.

      I believe if you have proper dependency tracking, then this becomes trivial to do. Well-written makefiles tend to parallelize extremely well, but only if the author has included all the dependency info.
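To illustrate why a complete dependency graph makes parallelism mostly mechanical, here is a rough sketch using Python's standard `graphlib` (3.9+). The graph below is a made-up example, not taken from any real project, and `parallel_batches` is a hypothetical helper name:

```python
from graphlib import TopologicalSorter


def parallel_batches(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group targets into batches; everything in one batch can build concurrently."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on a dependency cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # nodes whose prerequisites are all done
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Hypothetical graph: target -> the set of things it depends on.
GRAPH = {
    "app": {"main.o", "util.o"},
    "main.o": {"main.c", "util.h"},
    "util.o": {"util.c", "util.h"},
}

if __name__ == "__main__":
    for i, batch in enumerate(parallel_batches(GRAPH)):
        print(i, batch)
```

The number of batches is the length of the longest dependency chain, which is a crude stand-in for the critical path. If an edge is missing from the graph, two steps can land in the same batch when they should not, which is exactly the kind of bug an incomplete Makefile produces under `-j`.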

      1. 2

        Yeah, the short way to say it is that the build should fail if the Makefile is broken. Bazel has that property, and the linked Merkle tree post describes a system that appears to have it as well.

        In other words, the build tool should help you write a correct build config. GNU make doesn’t help you at all – instead you are left with a bunch of obscure rules that you’re not sure if you followed.

        You can state the rules, and maybe follow them yourself, but if I give you 5,000 lines of someone else’s Make, then good luck.

        On top of that, make can’t be statically parsed, so you can’t write a lint-type tool for it. I don’t think redo can be either. You can statically parse shell (which is what Oil does) but redo has semantics on top of shell, that occur after runtime substitutions.

        1. 2

          I tried tup, and liked it initially, but was ultimately put off by it. I don’t like that it requires a fuse module (heavy dependency, doesn’t work well on platforms that aren’t linux, kind of hacky). It also broke its promise to never screw up a dirty build.

          I don’t know what the solution is for projects that require complicated builds. I try to keep mine simple—list of source files converted to objects—so plain make can handle it, but sometimes that’s just not practical. Meson is decent, but not great. Cmake and autotools are hell. Language-specific build systems (cargo, dub, …) tend to be good, but inflexible.

          1. 1

            I tried tup, and liked it initially, but was ultimately put off by it. I don’t like that it requires a fuse module (heavy dependency, doesn’t work well on platforms that aren’t linux, kind of hacky). It also broke its promise to never screw up a dirty build.

            Yeah, I primarily mention it because it’s sort of the natural end-point of something like redo. I wonder if one could do something like this with ptrace. Does windows have an api for that? I wonder what the performance is like when compared to fuse.

            Cmake and autotools are hell.

            I agree. CMake does make windows builds better, but at the cost of its awful configuration language. I think there’s an urge to reach for a tool like them whenever one starts to have significant configuration for a project. I think Kconfig does a pretty good job, now if only it could get recursive dependencies :)

        2. 2

          but redo’s non-declarative model may thwart that to some extent

          It is non-declarative, but after a full build, you get .dep.$name files that declare the build tree, which redo-dot or other implementations’ tools can make use of.

          Writing correct incremental builds, e.g. specifying all your dependencies. I don’t want a tradeoff between fast builds and correct builds. GNU make doesn’t help you with this enough, and neither does redo AFAIK.

          I fail to see how either fails at doing so: as soon as the full dependency graph is there, it is a problem that is solved by both?

          1. 2

            It’s easy to introduce bugs in Makefiles simply by leaving off a dependency. You add a dependency to the code but forget to update the Makefile. (the gcc -M stuff helps for C/C++ code only, though it’s not exactly a great mechanism)

            So the build still works in the sense that you’ll get some output file and it will exit with status 0, but the incremental build is subtly broken, and may remain that way for a long time.

            What I’m thinking of is more along the lines of what’s described here:

            https://lobste.rs/s/952gdv/merkle_trees_build_systems

            • Using something like bubblewrap to run build steps in a deterministic environment / lightweight container. If you leave off a dependency, you get a build error – file not found. Bazel does this and it’s very effective at enforcing the correctness of the build.
              • For speed, there could be an option to run without containers / chroots. But for changing the build description, the enforcement is very valuable.
            • Directory trees, not files, as dependencies. For many problems I suspect this coarse-grained solution may work better. Checking every file is tedious and causes a lot of stat() traffic on the file system.
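A rough sketch of the "directory tree as a dependency" idea: hash a whole tree into one digest, Merkle-style, so a build step can depend on that single value. This is only an illustration under my own assumptions (SHA-256, symlinks and special files ignored, hypothetical function name `tree_digest`), not how Bazel or the system in the linked post actually does it:

```python
import hashlib
import os


def tree_digest(root: str) -> str:
    """Return a SHA-256 digest covering the names and contents under root."""
    h = hashlib.sha256()
    # Sort entries so the digest does not depend on directory listing order.
    for entry in sorted(os.scandir(root), key=lambda e: e.name):
        if entry.is_dir(follow_symlinks=False):
            kind, child = b"dir", tree_digest(entry.path)
        elif entry.is_file(follow_symlinks=False):
            with open(entry.path, "rb") as f:
                kind, child = b"file", hashlib.sha256(f.read()).hexdigest()
        else:
            continue  # assumption: symlinks and special files are ignored
        # Each directory's digest is built from its children's digests.
        h.update(kind + b"\0" + os.fsencode(entry.name) + b"\0" + child.encode() + b"\n")
    return h.hexdigest()
```

Comparing `tree_digest("src")` before and after an edit tells you whether anything under the tree changed. This naive version still reads every file, so a real tool would cache per-file digests to keep the file system traffic down.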
            1. 1

              So it is not only about rebuilding related files according to logical rules, but about hashing the entire directory tree to reproduce a filesystem image (which might be a software project’s working directory), and then providing a hash of the whole build.

              Sure, then Make and redo might be used for doing this, but it is not their doing nor default behavior.

              On a side note, redo tracks changes in dependencies with hashes put in .dep.$filename, but not of a whole directory hierarchy and just of the individual files.

          2. 1

            I’ve been using CMake for the past year and a half exactly because it checks all the boxes. I recommend using the Ninja generator alongside it - it’s parallel by default and significantly faster than Make.

            The one drawback of CMake is that you essentially have to treat the build system as its own part of the software that needs maintenance. I believe this tradeoff is well worth it for larger projects.

          3. 3

            I have also used redo recently for a research project of mine. It essentially served as a high-level cache for me to save intermediate computations. What is usually done (and what I did before) is to do the computation and then write the result to disk, then either load it from disk or recompute if you changed the experiment. Using redo here essentially automates that, so I don’t have to track whether I need to recompute.

            Using this setup was quite pleasant, although not as frictionless as I had hoped. I’m unsure whether I’ll continue using it afterwards. It definitely was great to always have up-to-date figures generated from the data, and it allowed me to carry out somewhat involved experiments. But in the end this approach is definitely not standard, so it might become an issue down the road, either due to friction in cooperation or some other unforeseen problem that is not easily solvable with redo.

            1. 3

              I briefly saw a very small mention of writing build scripts in another language than shell (eg Python) but found no examples. Are there docs on this?

              1. 6

                You just need to invoke the same commands, like redo-ifchange etc., using the language’s subprocess facilities.
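A minimal sketch of what that could look like as a Python `.do` script. Assumptions, none of which come from this thread: the redo implementation honors the `#!` line, it passes the usual three arguments (target, target without extension, temporary output path), and this is a hypothetical `default.o.do` that compiles a matching `.c` file with `cc`:

```python
#!/usr/bin/env python3
import subprocess
import sys


def redo_ifchange(*deps, run=subprocess.run):
    """Declare deps as dependencies of the current target and build them if stale."""
    argv = ["redo-ifchange", *deps]
    run(argv, check=True)  # check=True: a failed dependency fails this target too
    return argv


def main(argv):
    # Assumed redo convention: $1 = target, $2 = target minus extension, $3 = temp output.
    target, base, tmp_out = argv[1:4]
    source = base + ".c"
    redo_ifchange(source)
    # Write to the temp path so the target is replaced only if the compile succeeds.
    subprocess.run(["cc", "-c", "-o", tmp_out, source], check=True)


if __name__ == "__main__" and len(sys.argv) >= 4:
    main(sys.argv)
```

The `run` parameter exists only so the helper can be exercised without redo installed; in a real script you would just call `subprocess.run` directly.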

                1. 2

                  Huh. For some reason that is simpler than I expected.

              2. 2

                What is the best redo implementation at this point? Is it still apenwarr’s redo?

                1. 2

                  Fareed, out of curiosity, you mention The Little Schemer but link to The Little Lisper. I’m curious which one it was since you mention such a positive boost in satisfaction and insight. Thanks!

                  1. 5

                    They are the ‘same’ book. The Little Lisper is the first edition, and it has code to follow along in Common Lisp as well. The following editions are under the name The Little Schemer and are Scheme only, although they are more of a pen-and-paper book tbh. So get The Little Schemer. FWIW I enjoyed The Reasoned Schemer a lot as well.

                    1. 1

                      You’re right, but The Little Lisper (3rd edition) has some “homework assignments” that are missing from the Little Schemer. This makes the former a slightly better book, IMO, even though the Little Schemer’s paper and print quality are better. For the life of me I can’t understand why they decided to drop the exercises from the newer book.

                      You mention the Reasoned Schemer which is great in its own way, but a very different book because it’s not about “standard” programming. On the other hand, the Seasoned Schemer basically picks up where the Little Schemer left off, and it goes into continuations if I’m not mistaken. Highly recommended if you enjoyed the first book!

                    2. 2

                      I’ve only read The Little Schemer, but it was a real joy to read. It hooked me in right from the start and was very easy to follow. You don’t even really need a computer to go through it, I actually wrote all my answers on paper which was fun (for me at least). As a bonus, it got me interested in Scheme!

                    3. 1

                      If you’ve read this far and are thinking “If Redo is so great, how come it hasn’t taken over?”

                      I wrote my own implementation, tested it by converting several large OSS codebases, and even made a leaderless distribution system atop it. I was a true believer. I am no longer.

                      IMHO redo hasn’t taken over for two reasons:

                      1. Multiple outputs do not work; the workarounds are hacks.
                      2. The build graph can’t be walked concurrently. redo uses a global lock.

                      If redo inspires, ejholmes/walk might too. There’s a big design space.

                      1. 1

                        Do you consider walk an improvement over redo, and if so: how?

                        1. 1

                          I consider it a different place in the design space; it’s near to make and redo. It’s far from Bazel. Ninja is probably in-between.

                          1. 1

                            The only real difference I found was the two stages for building dependencies (deps) and building (exec). I don’t know what kind of advantage comes from that or whether that is worthwhile.