1. 12
  1.  

  2. 2

    This is pretty much exactly the kind of thing I care about at $JOB, so I’ll definitely be looking into this in the new year.

    1. 2

      Does Debian keep an Archive of all Packages and Versions ever created? I remember trying to install a “historic” version of Debian, and I had to spend some time finding an installation medium.

      1. 2

        Debian does not keep an archive of older packages, which makes this tool problematic when using the standard Debian repos.

        However, in embedded systems, it’s typical to use an apt proxy that caches all upstream versions as they become available.

        I should make this known in the post.

        1. 2

          snapshot.debian.org keeps all versions released. To do reproducible builds, you end up with something like: deb http://snapshot.debian.org/archive/debian/20190126T211315Z/ buster main in your sources.list.
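A minimal sketch of generating that kind of sources.list entry from a timestamp, in Python. The function name `snapshot_source_line` and its parameters are hypothetical, chosen for illustration; the URL layout is the one quoted in the comment above.

```python
from datetime import datetime, timezone


def snapshot_source_line(when, suite="buster", components=("main",)):
    """Build a sources.list entry pointing at one snapshot.debian.org timestamp."""
    # The archive path uses a compact UTC timestamp such as 20190126T211315Z.
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    url = f"http://snapshot.debian.org/archive/debian/{stamp}/"
    return f"deb {url} {suite} {' '.join(components)}"


line = snapshot_source_line(datetime(2019, 1, 26, 21, 13, 15, tzinfo=timezone.utc))
print(line)
```

Keeping the timestamp in source control next to the build scripts is what makes the package set repeatable here: the same line always resolves to the same archive state.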

        2. 1

          I’ve had this problem before, but don’t you think the .NET dependency re-introduces the predictability / stability problem? It’s a circular build dependency.

          Since your operating system is now considered your application, you’d need the same predictability in your outputs.

          It uses .NET, so install using the following command: sudo dotnet tool install AptTool --tool-path /usr/local/bin/

          It’s analogous to tools like Chef trying to solve versioning problems on your base image, but then creating the same problem by depending on specific versions of Ruby and specific versions of Ruby packages.

          Whether this is a problem in practice depends on your requirements, but it’s never gone as long as the circular dependency exists.


          I’m sort of “starting at the bottom” with Oil, and this problem is why distros use shell so much in the first place! Shell has no runtime dependencies except libc and no build dependencies except the C compiler.

          But it looks like Guix is doing good work to eliminate this circular dependency in another way. On the face of it, it seems like it’s going “overboard” a bit. But I think it’s good work and I don’t see any other solution.

          https://guix.gnu.org/blog/2019/guix-reduces-bootstrap-seed-by-50/

          https://lists.gnu.org/archive/html/info-gnu/2018-08/msg00006.html

          I’m still looking for a PR to Oil to kick off the use of Guix or Nix. I got a couple of PRs, but they trailed off and didn’t appear to solve the problem (I welcome follow-up on those).

          https://github.com/oilshell/oil/issues/513

          Under Help wanted I say I’ll accept any PR that makes build/dev.sh minimal work in Guix or Nix. All it does is build a Python extension module (under Python 2) and run several Python programs to generate code, so it’s dirt simple with zero weird dependencies.

          On Ubuntu it’s just the python-dev package and the C compiler. Once that works, I imagine it should be easy to make other dependencies reproducible.

          The concrete problem I want to solve is that I’m on Ubuntu and other devs are on say Arch Linux, so we have slightly different versions of tools and libraries. My understanding is that Guix or Nix can solve that problem without creating a lot of circularity too. That is, I assume we won’t run into the problem that you need version X.0 of Guix on Ubuntu and version X.1 of Guix on Arch, etc.

          1. 1

            It’s a circular build dependency.

            Can you elaborate on this? I’m not entirely sure what you are saying here.

            creating the same problem by depending on specific versions of Ruby and specific versions of Ruby packages.

            Are you saying .NET is unpredictable? That just isn’t true. Sure, dotnet tool install AptTool will install the latest version, introducing “unpredictability”, but you can solve this easily by specifying --version in your build CI/CD (which is what I would absolutely recommend). If you install the same version of a .NET global tool, you will have the same binaries/versions/dependencies at all times, giving you the same predictable outputs. This problem isn’t even solvable when using any of the tools out there to build Debian/Ubuntu rootfs (debootstrap, multistrap, etc).

            edit: Also, apt-tool is a pure .NET package with no dependencies. It simply shells out to dpkg and apt.

            Again, I’m not sure what you mean by “circular build dependency”, as I’m not using the generated rootfs to build another apt-tool.

            And yes, predictable rootfs file systems are achievable via other distros (Gentoo, NixOS, Guix, etc), but apt-tool is the only solution for apt-based systems.

            1. 2

              apt-tool is the only solution for apt-based systems.

              daguerreotype seems to solve a subset of what apt-tool does (mainly without the multistrap functionality). It produces reproducible images. The official Debian docker images are generated with it.

              1. 2

                Interesting, I didn’t see that one!

                Here is the repo.

                It looks like it still lacks some features I need, such as multi-repos, pinning versions, and it is intended to be used with only the official Debian repos.

                Thanks for sharing!

              2. 1

                for deterministic and predictable root filesystems

                This tool is helpful to create rootfs for debian-based distributions with reproducible outputs

                Well I’m interested in clarification on what exactly you mean by deterministic, predictable, reproducible. What’s the claim? In what sense is debootstrap / Debian NOT reproducible / deterministic?

                I can think of a few reasons, but I’m curious what problems you encountered. I have worked with it on the server side, not for embedded systems. When I worked on it, package builds were not deterministic, but I think they fixed that problem: https://wiki.debian.org/ReproducibleBuilds

                However, I think it’s probably true that debootstrap isn’t reproducible. But I haven’t used it in a while and I’m curious what state it’s in. BTW I ran debootstrap with Oil shell: Success with Aboriginal, Alpine, and Debian Linux


                So do you mean it in the mathematical sense? Like these binaries are a deterministic function of these sources and tools?
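One way to make the "deterministic function" question concrete is to build the same inputs twice and compare a digest of the two output trees. A minimal sketch in Python follows; `tree_digest` is a hypothetical helper, not part of debootstrap or apt-tool, and it only covers file paths, file contents, and symlink targets. It ignores ownership, permissions, and timestamps, which a stricter reproducibility check would also need to compare.

```python
import hashlib
import os


def tree_digest(root):
    """SHA-256 over sorted relative paths and file contents under root."""
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order independent of the filesystem
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            digest.update(rel.encode("utf-8") + b"\0")
            if os.path.islink(path):
                # Hash the link target instead of following it (it may be dangling).
                digest.update(os.readlink(path).encode("utf-8"))
            else:
                with open(path, "rb") as handle:
                    digest.update(handle.read())
            digest.update(b"\0")
    return digest.hexdigest()
```

If two builds from the same sources and tools give different digests, the build is not a deterministic function of those inputs, whatever the package versions say.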

                Now, I’m not saying you need to solve this problem for it to be useful. Because if you do you’ll probably be led down a long rathole like Guix has done.

                But I want to know what the claim is. I don’t know anything about the .NET ecosystem. Do they make any claims about reproducibility?

                Most language package ecosystems, like Python’s PyPI and Node’s npm, don’t make any claims about reproducibility. It takes a lot of “extra” work to make builds reproducible. So if they don’t say it, then it’s almost certainly not. The burden of proof is on them.

                For example, some package systems let you upload binaries you built on your desktop (like R’s, and I think Python’s). And nobody knows what tools are on there. And even if they don’t – even if they enforce a CI/CD system – you don’t know what’s in there either.

                The problem is “recursive”: specifying --version just punts it to another layer. Are .NET packages reproducible? In a practical sense they may be stable, but that’s a different claim than “reproducible” or “deterministic”. When you say something’s a deterministic function, you have to specify what it’s a function of. Some package ecosystems are “stable” practically speaking but not reproducible.

                Like I said, it’s a big rathole. If you haven’t heard of it already you can google for “trusting trust attack”. The Guix authors are explicitly addressing it.


                edit: To be a bit more specific: What’s the complete set of transitive build dependencies of apt-tool? sources and tools?

                1. 1

                  I think the idea of a deterministic system in this case is that you end up with all the same versions of software, much like an npm package-lock.json file would let you do. I don’t believe he’s referring to the reproducibility of the apt-tool software itself.

                  1. 1

                    I think we are talking past each other a bit.

                    I’m not going for “reproducible builds” in the sense you are speaking of, I’m going for “reproducible root file systems with specific versions installed”.

                    Consider that apt-get update && apt-get install network-manager will always install the latest version, introducing unpredictability. The apt-tool command does the same thing (installing a package), but pins the version and locks it down via source control.
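A rough sketch of the pinning idea in Python, assuming a lock file that is just a flat JSON object mapping package names to versions. That schema, the function name `pinned_install_args`, and the version strings are all hypothetical and are not apt-tool's actual image-lock.json format; the only real interface relied on is apt's `package=version` install syntax.

```python
import json


def pinned_install_args(lock_text):
    """Turn a {"package": "version"} JSON lock into apt-get 'pkg=version' arguments."""
    lock = json.loads(lock_text)
    # Sort so the argument order does not depend on the key order in the file.
    return [f"{name}={version}" for name, version in sorted(lock.items())]


# Hypothetical lock contents, for illustration only.
example_lock = '{"network-manager": "1.14.6-2", "curl": "7.64.0-4"}'
args = pinned_install_args(example_lock)
print("apt-get install -y " + " ".join(args))
```

The install only succeeds while the repository still serves those exact versions, which is why the thread keeps coming back to caching proxies and snapshot archives.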

                    1. 1

                      Well I’m saying the post doesn’t make it clear what problem this is solving. You’re making a bunch of claims with the words reproducible, deterministic, predictable, some of which already mean something specific in the Debian world.

                      There are limits to what you’re saying but it doesn’t acknowledge any of those limits.

                      You’re saying that the versions of the debs are locked? In the past I’ve included a file:// URL in /etc/apt/sources.list and that can lock the versions too. It won’t retrieve anything from the network, only from the file system.

                      In other words, you need to be able to recompile older versions years later and have some confidence that the output is the same.

                      If this is your goal, then I would want to know what claim you’re making about the reproducibility of apt-tool.

                      If I have one of your JSON files years later, I can’t build it again unless I have apt-tool. This isn’t a theoretical problem: many times I have tried to build software years later on a different OS, and the build doesn’t work. It’s actually a surprise when it does work.

                      You should link to any claims that .NET repo makes, and Debian makes. As you said elsewhere in the thread, Debian doesn’t necessarily keep old packages. In that case, I would consider the file:// solution to be better (and much simpler). What about the .NET repo?

                      1. 1

                        Well I’m saying the post doesn’t make it clear what problem this is solving.

                        The “Why?” section of my post is pretty clear. In short, apt is a rolling package manager, meaning apt-get update && apt-get install xxx will get you the latest version. Because of this, if I install a rootfs with debootstrap, I automatically get updates. I don’t want these updates automatically. I want to generate a rootfs with the same exact package versions for years on end, using the same image-lock.json file.

                        I understand that the Debian repos only have the latest version of the packages available, but for commercial Linux appliances, you would typically use a proxied apt repo that caches all upstream debs locally, forever. Therefore, my image-lock.json will be valid for as long as the proxied apt repo stays around.

                        You’re making a bunch of claims with the words reproducible, deterministic, predictable, some of which already mean something specific in the Debian world.

                        You are right, but these are the best words to describe what I’m trying to do, and the context of my post makes it clear. If you pick out one word from the post, of course it could mean something entirely different.

                        I would want to know what claim you’re making about the reproducibility of apt-tool.

                        apt-tool will use image-lock.json to generate the same exact rootfs, forever. That’s it. That means that apt-tool generate-rootfs will give you the same outputs, regardless of what the upstream apt repo is doing (provided the versions are still at least available).

                        You should link to any claims that .NET repo makes, and Debian makes.

                        This is getting pedantic. I think things are clear at this point.

                        I would consider the file:// solution to be better (and much simpler).

                        Not really. You moved the apt repo offline, but it’s still an apt repo that is rolling. If someone pushes to the folder, then your previous builds will all be different. Same problem.

                        Using apt-tool, you can sit on top of a normal rolling apt-repo (network or file), and let the tool manage fixing the versions.

                        What about the .NET repo?

                        What about it? Are you asking “what happens if the GitHub repository for apt-tool disappears?”