1.
  1.  

    1.

      All the compiler outputs and targets are generated files; build systems have solved this kind of problem for a long time.

      Use a build system, and build your generated file when the inputs change.

      1.

        And that way, it’s possible to test your build system locally, without waiting minutes for each iteration.

      2.

        Now we just need a build system that doesn’t completely suck for one reason or another. 🙃 I actually agree that build systems are the way to go in theory, but they often suck so badly (difficult to use, inconsistent quality across languages, not actually hermetic, etc) as to be unjustifiable except in large organizations.

      3.

        You also need to rebuild when the generator itself or its dependencies change. Most build systems don’t seem to solve that problem very well.

    2.

      I endorse this general approach: not always, but often, cleanly generated files in the source root are easier to work with than unreadable garbage tucked in a corner of a build dir.

      Additionally, for libraries, this approach can often massively simplify life for the consumers, as they don’t need to depend on code used to generate stuff.

      However, I would recommend a different implementation: instead of writing a GitHub action, write this as a test (sketched in Go below). Specifically:

      • generate source file in a test
      • compare with what’s on disk
      • if the contents are the same, do nothing
      • otherwise, update the file on disk, and then fail the test.

      Benefits of this approach:

      • piggybacks on the standard go test flow instead of adding a project-specific concern
      • independent of a specific CI provider and works locally
      • doesn’t need git in PATH
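
      A minimal Go sketch of the test described above (the generate helper, the tables package, and the tables/tables.go path are placeholders, not taken from any real project):

      package tables_test

      import (
          "bytes"
          "os"
          "path/filepath"
          "testing"
      )

      // generate stands in for the real code generator; it must be
      // deterministic for the comparison below to mean anything.
      func generate() []byte {
          return []byte("// Code generated by tablegen; DO NOT EDIT.\n\npackage tables\n")
      }

      func TestGeneratedCodeIsFresh(t *testing.T) {
          path := filepath.Join("tables", "tables.go")
          want := generate()
          if got, err := os.ReadFile(path); err == nil && bytes.Equal(got, want) {
              return // contents on disk match: do nothing
          }
          // Otherwise update the file on disk, then fail, so a stale
          // copy never survives go test, locally or in CI.
          if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
              t.Fatal(err)
          }
          if err := os.WriteFile(path, want, 0o644); err != nil {
              t.Fatal(err)
          }
          t.Fatalf("%s was stale; regenerated it, commit the result", path)
      }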
      1.

        I second this. My default stance is to git ignore generated files, but in some cases it’s the pragmatic approach to commit them.

        Some rules of thumb I use:

        • The file changes infrequently, so it doesn’t pollute most commit diffs.
        • The file is small and readable enough to be reviewed. (no minified code, no binaries, etc)
        • The generation requires some extra tooling not all devs have (e.g. ragel).
        • The file is critical enough that you’d rather clearly see during review that it changed.
        1.

          The file changes infrequently, so it doesn’t pollute most commit diffs.

          You can use .gitattributes to disable local diffing for a file, and to tell git forges that a file is generated:

          src/gen/foo.c -diff linguist-generated

          With -diff, git diff will just remark something like “Binary file has changed”, and linguist-generated tells GitHub and Gitea to exclude a file from web diffs and from stats.

          1.

            Oh that’s a very useful tip, thank you.

          2.

            Didn’t know about linguist-generated, nice.

            Also, I often use binary instead of -diff to also have benefits during merge.
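
            For reference, binary is git’s built-in macro attribute equivalent to -diff -merge -text, so the example above would become:

            src/gen/foo.c binary linguist-generated

            With -merge, git won’t attempt a textual merge of two diverged copies of the file; you resolve the conflict by regenerating instead.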

      2.

        Couldn’t you easily end up with a “wedged” build? Say I changed the input to the code generator; for example, I’ve added a data member to a protobuf spec and adjusted my code to reference the new member. Now my project doesn’t build, because the generated code doesn’t yet have this new member. But I also cannot run the test to re-generate it, because the project doesn’t build. I need to remember to change the spec, run the tests, and only then change the code. This screams “hack”!

        Feel like the proper way to handle this is to use a proper build system. We do something similar to what you have described (keep pre-generated code up-to-date, fail the build if anything changes) for a tool that uses its own generated code: https://git.codesynthesis.com/cgit/cli/cli/tree/cli/cli/buildfile#n98

        1.

          You indeed could get a broken build, but it’s roughly equivalent in frequency and annoyance to, e.g., accidentally committing a file with merge conflict markers.

          If you add this to the default build task, which is run unconditionally, this has the problem that your consumers now have to run code generation as well just to build your stuff. It is a fairly frequent annoyance in Rust that the build.rs of a dependency has a boatload of dependencies, all of which could have been avoided if the author of the library had run the build.rs logic on their machine instead (which is not always possible, but occasionally is).

          If you add this as a dedicated make generate task, then you’ll need to adjust your CI as well, which is a smell pointing out that local workflows need subtle adjustment too.

          That’s why I suggest tacking this onto make test: it is an already existing build system entry point for checking the self-consistency of a particular commit.

          1.

            If you add this to the default build task, which is run unconditionally, this has the problem that your consumers now have to run code generation as well just to build your stuff.

            Not if you only enable code generation for the development builds. Our setup is:

            1. Consumer builds use pre-generated code.

            2. Development builds update the generated code if inputs change and compare the result with the pre-generated code; if there are differences, they copy the generated code from the output directory to the source directory and fail the build.
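
            Outside build2, one rough way to approximate this split in the go test approach from upthread (the DEVELOP environment variable here is purely illustrative, not a convention of build2 or Go):

            package tables_test

            import (
                "os"
                "testing"
            )

            func TestGeneratedCodeIsFresh(t *testing.T) {
                // Consumer builds skip the regeneration check entirely
                // and trust the committed, pre-generated code.
                if os.Getenv("DEVELOP") == "" {
                    t.Skip("consumer build: using pre-generated code")
                }
                // ... regenerate, compare, copy back, and fail on
                // differences, as in the sketch upthread ...
            }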

            1.

              Oh, that’s curious! I don’t think I’ve ever seen distinction between “development” and “consumer” builds before!

              Is this a first-class concept in build2, or is it just some custom config for this specific build file? Are there any docs about other use cases for this separation?

              1.

                I don’t think I’ve ever seen distinction between “development” and “consumer” builds before!

                Well there are Rust’s dev-dependencies. Though, IMHO, the semantics (or the name) is misguided – tests are not only to be run during development.

                Is this a first class concept in build2, or is it just some custom config for this specific build file?

                It is a first-class concept, though a pretty thin one: we simply reserved the config.<project>.develop variable to mean “development build”. In build2 we split the package manager into two tools: for consumption and for development. The latter by default configures projects with config.<project>.develop=true.

                Are there any docs about other uses-cases for this separation?

                There is the announcement of this feature: https://build2.org/release/0.14.0.xhtml#develop

                So far we have used it for pre-generated source code as well as to support additional documentation outputs that require additional/non-portable tools. For example, we can produce XHTML portably but to convert that to PDF requires html2ps and ps2pdf so we only enable this step (and require these tools) in the development builds.

                EDIT: Forgot to mention, you can also use config.<project>.develop (like any other configuration variable) for conditional dependencies:

                depends: html2ps ? ($config.cli.develop)
                
              2.

                Oh, that’s curious! I don’t think I’ve ever seen distinction between “development” and “consumer” builds before!

                Autotools has had this distinction basically forever, I guess; the result of a make dist can be thought of as a “consumer” build. You don’t need automake/autoconf to do the ./configure && make && make install dance, but you’ll likely need them at some point if you’re doing development on the project.

              3.

                That’s how Oils works too

                • the “dev build” always generates everything from source
                • The end user / “release” build is a tarball that builds with only a C++ compiler and a shell. You don’t need Python, or any tools written in Python, to build it.

                I think there are some downsides, in that I want people to be able to open up the tarball and contribute without necessarily getting git involved … but I think this is less common than it used to be, though not unheard of for one-off hacks.

                Thankfully all our generated code is readable and debuggable with normal tools

    3.

      This is why I use Bazel for my hobby projects. I can be completely brainless when using Python to generate C++ files, and it all works fine.