1. 20
  1.  

  2. 3

    This is similar to what NixOS, snap, flatpak, OSTree, and Guix(?) are currently doing. It’s an interesting concept, but I’m more curious how distri keeps track of the dependencies in the packages. How do you track the OpenSSL version that was used for curl at build time? How is this recorded, and what is the tooling around this?

    1. 2

      Yes, there are similarities, because hermeticity is a desirable property to have :)

      OpenSSL is available in distri under /ro/openssl-amd64-1.1.1g-5. When building curl, curl’s build system will find OpenSSL under that path:

      […]
      checking for openssl options with pkg-config... found
      configure: pkg-config: SSL_LIBS: "-lssl -lcrypto"
      configure: pkg-config: SSL_LDFLAGS: "-L/ro/openssl-amd64-1.1.1g-5/out/lib"
      configure: pkg-config: SSL_CPPFLAGS: ""
      […]
      

      As the article outlines, we compile with the rpath set to a lib directory, and then create symlinks with the full paths we want to resolve each library to:

      % ls -l /ro/curl-amd64-7.69.1-8/lib/
      lrwxrwxrwx 1 root root 51 2020-05-07 00:11 libcrypto.so.1.1 -> /ro/openssl-amd64-1.1.1g-5/out/lib/libcrypto.so.1.1
      lrwxrwxrwx 1 root root 43 2020-05-07 00:11 libc.so.6 -> /ro/glibc-amd64-2.31-4/out/lib/libc-2.31.so
      lrwxrwxrwx 1 root root 48 2020-05-07 00:11 libcurl.so.4 -> /ro/curl-amd64-7.69.1-8/out/lib/libcurl.so.4.6.0
      lrwxrwxrwx 1 root root 44 2020-05-07 00:11 libdl.so.2 -> /ro/glibc-amd64-2.31-4/out/lib/libdl-2.31.so
      lrwxrwxrwx 1 root root 49 2020-05-07 00:11 libpthread.so.0 -> /ro/glibc-amd64-2.31-4/out/lib/libpthread-2.31.so
      lrwxrwxrwx 1 root root 48 2020-05-07 00:11 libssl.so.1.1 -> /ro/openssl-amd64-1.1.1g-5/out/lib/libssl.so.1.1
      lrwxrwxrwx 1 root root 46 2020-05-07 00:11 libz.so.1 -> /ro/zlib-amd64-1.2.11-4/out/lib/libz.so.1.2.11
      

      At runtime, when starting curl:

      1. OpenSSL will be searched in /ro/curl-amd64-7.69.1-8/lib/libcrypto.so.1.1
      2. …which resolves to /ro/openssl-amd64-1.1.1g-5/out/lib/libcrypto.so.1.1.

      Since package contents never change, this is always the same version.
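
To make the pinning concrete, here is a minimal sketch (not distri's actual tooling) that reads the exact dependency versions back out of symlink targets like the ones listed above. The naming scheme `/ro/<name>-<arch>-<version>-<revision>/` is an assumption inferred from the paths in this thread, and `pinned_package` is a hypothetical helper name.

```python
import re

# Symlink targets copied from the `ls -l` listing in this thread.
LINKS = {
    "libcrypto.so.1.1": "/ro/openssl-amd64-1.1.1g-5/out/lib/libcrypto.so.1.1",
    "libc.so.6": "/ro/glibc-amd64-2.31-4/out/lib/libc-2.31.so",
    "libz.so.1": "/ro/zlib-amd64-1.2.11-4/out/lib/libz.so.1.2.11",
}

# Assumed layout: /ro/<name>-<arch>-<upstream version>-<distri revision>/...
# Only amd64 appears in the thread, so the architecture is hard-coded here.
PKG_RE = re.compile(
    r"^/ro/(?P<name>.+)-(?P<arch>amd64)-(?P<version>[^/]+)-(?P<revision>\d+)/"
)


def pinned_package(target):
    """Return the package a symlink target is pinned to, as a dict of strings."""
    match = PKG_RE.match(target)
    if match is None:
        raise ValueError(f"not a /ro package path: {target}")
    return match.groupdict()


if __name__ == "__main__":
    for soname, target in LINKS.items():
        pkg = pinned_package(target)
        print(soname, "->", pkg["name"], pkg["version"], "revision", pkg["revision"])
```

Because each target names one immutable package directory, the symlinks themselves act as the record of which OpenSSL build curl was linked against.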

      Hope that answers your question, let me know if anything is still unclear :)

      1. 2

        I think there is an important difference: do binaries share libraries when they’re at the same version? For example, say you have 3 scripts running Python 3.8, and 2 scripts running Python 3.9 in each of these systems.

        Then do you have:

        1. 5 Python interpreters – 3 copies of 3.8, and 2 copies of 3.9.
        2. 2 Python interpreters – 3.8 and 3.9 are shared among the respective apps.
        3. Error: the two interpreters are not co-installable. (I don’t think this is usually an issue for Python, since you can get it from a PPA (?). But it’s an issue for other similar software.)
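
The first two options in the list above can be sketched as a small counting exercise. This is only an illustration of the arithmetic, not a model of any real package manager; the script names and the `installed_interpreters` helper are hypothetical.

```python
# Five scripts and the interpreter version each one needs (hypothetical names).
SCRIPTS = {
    "a.py": "3.8",
    "b.py": "3.8",
    "c.py": "3.8",
    "d.py": "3.9",
    "e.py": "3.9",
}


def installed_interpreters(scripts, shared):
    """List the interpreter copies that end up on disk.

    shared=False: every script bundles its own interpreter (option 1).
    shared=True:  scripts needing the same version share one copy (option 2).
    """
    versions = list(scripts.values())
    if shared:
        return sorted(set(versions))
    return sorted(versions)


if __name__ == "__main__":
    print(len(installed_interpreters(SCRIPTS, shared=False)), "copies when bundled")
    print(len(installed_interpreters(SCRIPTS, shared=True)), "copies when shared")
```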

        The answer is #1 for Nix and Guix.

        But what about snap and flatpak? I think they might duplicate the interpreters (#2), which doesn’t really scale IMO. It’s more like Docker, which also doesn’t “scale” in terms of having many binaries.

        I thought OSTree was solving a lower level problem, but I haven’t kept up …

        1. 1

          The answer is #1 for Nix and Guix.

          I am not sure I follow?

          $ nix-store -qR $(nix-build '<nixpkgs>' -A magic-wormhole --no-out-link) | grep python3-
          /nix/store/xnfcmgfhssgvkqq4vsnc89hwvfyfwcla-python3-3.7.7
          $ nix-store -qR $(nix-build '<nixpkgs>' -A youtube-dl --no-out-link) | grep python3-
          /nix/store/xnfcmgfhssgvkqq4vsnc89hwvfyfwcla-python3-3.7.7
          
          1. 1

            Oops, I meant #2… typo. #1 is to duplicate the interpreters like snap / flatpak (I think), and #2 is to share them (like Nix and Guix).

            1. 1

              Thanks for the clarification! I guess in Flatpak the interpreter could be part of one of the (shared) runtimes, but I don’t know enough about Flatpak to know if there is actually a runtime with the Python interpreter.

        2. 1

          I think it uses metadata files for this: https://repo.distr1.org/distri/master/pkg/

        3. 2

          I wonder if it would be possible to have multiple “distributions” that use different policies, e.g. provide the choice whether packages are statically or dynamically linked.

          I think it would be really interesting to learn some lessons there regarding development velocity vs. stability:

          E.g. how much easier would it be to change things around and experiment with different designs if it were guaranteed that all downstream users are statically linked against the exact library version, so incompatible updates would never break anyone?

          1. 1

            Package maintainers certainly can reduce the blast radius if they want.

            For example, as a user, you can install a package from a third-party repository (new development version of curl), and you can be sure that nothing else breaks on your system, even if curl pulls in a new OpenSSL version that is buggy.

            1. 1

              I was looking at it from the opposite direction:

              Having the option to be sure that there is only one, single, dynamically-linked instance of each security-critical library installed.

              Such that updating that library is enough to secure all applications using it in case of a security issue.

              1. 1

                Yes, that is the status quo in many systems (e.g. Debian). distri makes different trade-offs :)

                1. 1

                  I wonder if it’s possible to get distri’s benefits without having to forgo dynamic linking. :-)

          2. 1

            It’s unbelievable to what extent people will go to avoid static linking.

            1. 1

              Interesting that shells were too slow for the wrapper. dash is the best shell for that, although maybe it’s still too slow?

              https://lobste.rs/s/dnfxpk/hello_world#c_zjxrhd


              Another option could be execline, which I’ve never used:

              https://skarnet.org/software/execline/grammar.html

              execline is the first script language to rely entirely on chain loading. An execline script is a single argv, made of a chain of programs designed to perform their action then exec() into the next one.

              I guess the C program is simple enough, but you could also write a single C program like execline and accomplish the same thing?


              Anyway, very informative post! I look forward to hearing more about distri.

              Could distri be used “on top” of another distro, like Nix? I think distros really should be split in half – all the stuff that depends on hardware, which is complex, and then all the portable stuff (shell, Python, Ruby, etc. and everything upward). I think there is too much coupling between these layers in most distros.

              Oil’s dev env (which should be in the portable upper half) was partly ported to Nix by a contributor, but it turns out to be hard to get the tests to pass. This is due to the versions of each package being different from Ubuntu, and also the surprising number of weird patches (probably in both Nix and Debian/Ubuntu, but more in Nix).

              https://github.com/oilshell/oil/blob/master/shell.nix

              So I would really like to find a hermetic “semi-distro” to put all the Oil dev tools in, so people can just run one command to download deps and build. Nix is pretty close to that, but there seem to be some problems in practice. (And yes, now I have some first-hand sympathy with the complaints about Nix’s expression language… )

              1. 1

                I guess the C program is simple enough, but you could also write a single C program like execline and accomplish the same thing?

                Probably, but then that single program still needs to be configured, which takes time away. The advantage of compiling the program at package build time is that it can be even quicker.

                Anyway, very informative post! I look forward to hearing more about distri.

                Nice! Find a list of posts at https://michael.stapelberg.ch/posts/tags/distri/, and subscribe to https://www.freelists.org/list/distri if you want to reach out and discuss :)

                Could distri be used “on top” of another distro, like Nix? I think distros really should be split in half – all the stuff that depends on hardware, which is complex, and then all the portable stuff (shell, Python, Ruby, etc. and everything upward). I think there is too much coupling between these layers in most distros.

                To an extent. There are a couple of paths which certain packages treat as special. For example, glibc’s NSS mechanism loads plugins from /usr/lib. GCC will consider /usr/include as the system include dir. Using not just distri’s packages, but also its file system layout, helps in these cases.

                This is just one caveat that comes to mind. I have indeed used distri packages on Debian and Arch before.

                Please also take a look at https://michael.stapelberg.ch/posts/2019-08-17-introducing-distri/#project-outlook — I’m not looking to use distri productively (only for research).

                1. 2

                  Ah OK the /usr/include is interesting. So are multiple versions of the same compiler/libc co-installable in distri? The FUSE indirection solves that problem?

                  I ran into a related issue compiling code with a “nightly” build of Clang recently. The nightly build appears to use the system C++ standard library, which is GCC’s (libstdc++), but there is an extra flag to compile with the libc++ that comes with Clang itself. I remember having a hard time figuring that out.

                  I think the issue was some C++17 features like <optional>, which made it hard to compile a lot of software on my system, even with a non-system compiler that supported C++17.


                  A long time ago, I tried to make hermetic packages with chroots, which sort of works. But I came to the conclusion that having the FUSE layer would probably make things more efficient. But at that time I didn’t want to depend on FUSE.

                  I think distri overlaps with what I want, but it sounds like your goals are also a bit different. Have you looked into other hermetic distros, i.e. ones where library dependency versions are fixed? I don’t think there are that many. Off the top of my head, there’s only:

                  1. Nix and Guix (as mentioned I have tried Nix, and it’s OK, but I’m still looking for something else)
                  2. distri which as you say is a research project for now (and the issues you are uncovering and documenting are interesting)

                  I can’t think of any others that don’t duplicate the entire dependency chain, which I don’t want …

                  Basically I want a binary-centric distro and not a library-centric one. Debian seems to treat them as on “equal footing”.

                  IMO binary stability (e.g. Firefox, Clang compiler, VLC, Inkscape, Python interpreter, Python apps like hg) is more important than having exactly one version of a library on every machine.

                  This is more of a shell-centric point of view, e.g. for Oil. The shell cares about binaries and not libraries. I think that is a more scalable and reliable way of composing software. That is, once you reach a certain point, you start using binaries and versionless protocols (either shell or IPC/RPC), not libraries with incompatible upgrades.

                  1. 1

                    So are multiple versions of the same compiler/libc co-installable in distri?

                    Yes, and for building distri packages, there is no ambiguity because only one version will be visible in the build environment. For interactive builds (done by humans, outside of the distri build environment), FUSE will serve symlinks to the most recent version of each file in /usr/include, so if you need something more specific, it’s up to you to arrange that.
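
As a rough illustration of "symlinks to the most recent version of each file", here is a minimal sketch of how such a view could be computed. It is not distri's FUSE implementation: the `out/include` directory, the purely numeric version comparison, the glibc 2.30-1 entry, and the function names are all assumptions made for the example.

```python
import re


def version_key(version):
    """Turn '2.31-4' into (2, 31, 4) so versions compare numerically.

    Simplification: only handles purely numeric components, so a version
    such as '1.1.1g' would need a real version-comparison routine.
    """
    return tuple(int(part) for part in re.split(r"[.-]", version))


def include_view(packages):
    """Map each header name to a symlink target in the newest package providing it.

    packages: iterable of (name, version, [header, ...]) tuples.
    """
    newest = {}
    for name, version, headers in packages:
        for header in headers:
            current = newest.get(header)
            if current is None or version_key(version) > version_key(current[1]):
                newest[header] = (name, version)
    # Assumed layout: headers live under /ro/<name>-amd64-<version>/out/include/.
    return {
        header: f"/ro/{name}-amd64-{version}/out/include/{header}"
        for header, (name, version) in newest.items()
    }


if __name__ == "__main__":
    installed = [
        ("glibc", "2.30-1", ["stdio.h"]),  # hypothetical older package
        ("glibc", "2.31-4", ["stdio.h"]),  # version taken from this thread
    ]
    for header, target in include_view(installed).items():
        print(f"/usr/include/{header} -> {target}")
```

Anyone who needs an older header than the one this view picks would have to point their compiler at the specific `/ro/...` directory themselves, which matches the "it's up to you to arrange that" caveat.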

                    Basically I want a binary-centric distro and not a library-centric one. Debian seems to treat them as on “equal footing”.

                    binary-centric is a good term! I agree with your desired goal here :)