1. 30

This is in response to: https://old.reddit.com/r/archlinux/comments/uqsy8v/are_rust_binaries_a_security_concern_because_of/

  1. 28

    Yup, this is a solvable problem, but it needs a different approach than C’s. However, distros have firmly settled on a C-specific solution, and prefer not to adapt.

    The status quo is untenable. I think CD-ROM-era distro package managers are dying: they struggle to handle Chrome-style evergreen updates. They struggle with JS, Python, Golang, and Rust. People would rather ship multiple copies of an entire operating system (Docker) than try to use the package manager as intended for more than one application at a time.

    1. 4

      Yes. Agreed. A lot of good things died along with it, though, which is most unfortunate. You don’t have package maintainers to independently validate a new version (a check against supply-chain attacks). There are no package maintainers to backport a zero-day fix across all affected packages that depend on OpenSSL. There is no compilation diversification (that probably died a while ago, though) to guard against “Reflections on Trusting Trust”.

      I hope some of these ideas can survive into a new form. I just don’t know how.

    2. 11

      One thing that’s missing from this analysis of distro packaging tools is that the distro model generally wants to avoid upgrades whenever possible. So this step:

      crossbeam-utils just got an update! Distro™’s build automation queues all binaries which include crossbeam-utils in their dependency tree to be repackaged.

      would not apply, or would apply only to “crossbeam-utils just got a security update, or a fix for some other bug of sufficient severity to be covered by the distro’s policies”.
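
      When a rebuild is warranted, finding the affected packages is already routine tooling on the distro side. As a sketch, Debian’s devscripts ships build-rdeps for exactly this query (the package name here is illustrative, and the output is abbreviated):

      ```
      $ build-rdeps librust-crossbeam-utils-dev
      Reverse Build-depends in main:
      ------------------------------
      ...
      ```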

      So, suppose package bar depends on libfoo, and at the time the distro rolled its current edition, that was bar 2.1 and libfoo 1.3.5. If libfoo 1.4 comes out fixing a critical security issue (or other bug of sufficient severity, etc.), the distro does not switch to packaging libfoo 1.4. They backport the fix into a forked tree of 1.3.5, and package it as libfoo 1.3.5-1 or whatever.
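
      For concreteness, here is roughly what that backport looks like in Debian’s conventions. This is a sketch only: the package, versions, CVE number, and maintainer are all made up.

      ```
      libfoo (1.3.5-1+deb11u1) bullseye-security; urgency=high

        * Backport upstream fix for CVE-2022-XXXXX from the 1.4 release,
          applied as debian/patches/fix-cve-2022-xxxxx.patch.

       -- Distro Maintainer <maintainer@example.org>  Mon, 16 May 2022 12:00:00 +0000
      ```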

      And that leads into the real mismatch between the distro’s packaging and any language-specific package manager – cargo or npm or pip or gem or literally any of them – which is that the distro wants stability, while the language-specific ecosystem wants to grow and evolve.

      I suspect that the static versus dynamic linking arguments are really a red herring here: the deep issue is the fact that distro package managers simply are not set up for the pace of change in the typical language package ecosystem.

      1. 3

        On the other hand, that backporting for stability is mostly required because it’s a shared library whose interface is exposed to the user and other applications, so bumping to 1.4 could cause unrelated breakage. If you allowed library dependency versions to float as needed per binary then stability wouldn’t be affected, provided this didn’t substantially change the user-facing behaviour of the binary that incorporated it. (This could still require a fork to address.)

        It’s worth pointing out that having multiple versions of a library simultaneously is nearly unavoidable: it already happens easily within a single Rust binary, let alone across multiple Rust binaries.
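
        For instance, Cargo will happily compile two semver-incompatible versions of the same crate into one binary, and cargo tree --duplicates surfaces them. A sketch, where the two intermediate crates are hypothetical (rand is real, and a common offender):

        ```
        $ cargo tree --duplicates
        rand v0.7.3
        └── legacy-dep v1.0.0
            └── app v0.1.0
        rand v0.8.5
        └── modern-dep v2.0.0
            └── app v0.1.0
        ```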

        1. 2

          So, to extend the example: if Distro™ packaged bar 2.1 using libfoo 1.3.5, they’re generally going to stay on bar 2.1 for the entire life of that distro release. Which means they’re going to stay on the dependencies specified by that version, and maintain those as-is.

          Which then does end up being a lot more work for the distro since they need one distro-local fork per library per depended-on version of the library, because again: they’re not going to do in-place upgrades. And that gets back to the fact that the language package ecosystem moves more quickly than the distro package ecosystem, and in ways that make the distro packager’s life hard.

          Personally I think distros should more or less give up on trying to ship most things that are accessible from language-specific ecosystems, because hardly anybody even uses the distro versions these days anyway. But I also think it’s important to point out that this is the real disconnect, not the type of linkage.

          1. 4

            because hardly anybody even uses the distro versions these days anyway

            [citation needed]? I suspect there’s a large, silent swath of people who prefer distros’ stability guarantees.

            You’ve hit the nail on the head with this disconnect, though. I think upstreams just generally don’t see the value in doing an LTS-type release, which distros would prefer to use if possible. The result is an awkward situation where distros, which lack upstream’s expertise, end up maintaining an unofficial LTS release instead.

            1. 5

              I suspect there’s a large, silent swath of people who prefer distros’ stability guarantees.

              Hi!

              As a person who used to do a lot of programming and now does some programming and a lot of ops, my perception is that software developers tend to want to develop with the latest version of everything, and for everyone to use the latest version of everything, so they don’t have to keep doing boring backporting.

              Mere users would generally like the ability to upgrade for features if they want, but don’t want to have to keep upgrading everything all the time just to get security fixes and so on. Backward compatibility is, as far as I’m concerned with my end-user hat on, a binary property. It doesn’t matter how well-telegraphed a breaking change was.

              The backward compatibility model I need for software that’s not my specific focus is simple: security bugs are fixed, nothing else ever changes. Some (usually big) projects do provide this, but for the most part it’s not what software developers like to do. I can’t complain about that—I’m not paying them to care about my requirements—but I am grateful for stable distros’ turning software developers’ output into something I can rely on.

              You tend not to hear from, e.g., Python people who feel this way about Python software. Python people will get hold of the Python version they want to use. But any successful language has a lot of transitive users who don’t care about that language, or even know what it is. Distro packaging is for them.

              1. 2

                As a person who used to do a lot of programming and now does some programming and a lot of ops, my perception is that software developers tend to want to develop with the latest version of everything, and for everyone to use the latest version of everything, so they don’t have to keep doing boring backporting.

                [citation needed]

                It depends on what ecosystem you’re in, of course, but I actually hate developing with javascript/css/html/etc because of how fucking fragile the versioning is. Every single day there’s a new version; who knows what it will break and what I will have to redo. And every time there’s a major update, something fucking breaks because of how fragile and incoherently-built the technology is.

                All of that is wasted time.

              2. 3

                In my own world (Python), I don’t know of anyone who installs distro-packaged versions of their Python dependencies. Everyone uses a language-specific package manager: either pip for most networked services/apps, or conda for data-science/ML stuff. And all the tutorials people copy/paste from are setting you up that way, too, so it’s how people get onboarded.
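
                The flow those tutorials teach is some variant of the following (file names are illustrative):

                ```
                $ python -m venv .venv
                $ source .venv/bin/activate
                $ pip install -r requirements.txt
                ```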

                And generally that’s how you have to do it in Python, because the distros can’t and don’t package enough of PyPI to really handle all those use cases. They package only a subset of the most popular things, but many projects will have at least one or two dependencies that didn’t quite make the cut, and since you really don’t want to mix package management systems, you end up all-in on the language’s package manager.

                I will mention, though, that some projects do put out LTS releases. Django, for example, does an LTS every third release and uses them as an explicit part of the compatibility policy (which, summarized, is that if you’re running on an LTS and raising no deprecation warnings, you can upgrade to the next LTS with no further code changes).
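
                Django’s upgrade documentation suggests surfacing those deprecation warnings ahead of an LTS-to-LTS jump by running the test suite with Python warnings enabled, along these lines:

                ```
                $ python -Wa manage.py test
                ```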

                1. 2

                  In my own world (Python), I don’t know of anyone who installs distro-packaged versions of their Python dependencies.

                  Sure, this is because distro packages of Python libraries aren’t really for you. They’re only there as support for installable applications.

                  Your world doesn’t have the same constraints as everyone else’s. An ops person is much more likely to prefer distro packaging when possible because it reduces hassle, both at install time and at maintenance/upgrade time. This is where the stability guarantees of distributions, which upstreams often refuse to provide, add value.

                  1. 1

                    Sure, this is because distro packages of Python libraries aren’t really for you. They’re only there as support for installable applications.

                    Except my point is that for extremely common Python use cases like deploying a web application, the distro packages are inherently insufficient. Distros simply aren’t able to package all of PyPI, and getting mostly there for a given project doesn’t count, because you absolutely never want to mix distro packages and language-package-manager packages.

                    An ops person is much more likely to prefer distro packaging when possible

                    I understand that the traditional sysadmin position has always been to favor the system package manager and discourage language-specific package managers, but again: the distros can’t and don’t package enough stuff to make that feasible. If someone wants to deploy something that has 20 dependencies and 19 of them are in the distro package repository, the 20th will force the project back to the language-specific package manager.

                    1. 1

                      Ah, I think I was unclear - my bad. By “installable applications” I really meant installable from the distro repository. I’m totally with you that you probably don’t want to mix, e.g., pip packages coming from PyPI and from the distro in the same Python application process. Distros can’t package every Python application out there, but if it’s available in the repository, I’m installing from there instead of from pip.

                      To be clear, I’m not talking about in-house developed webapps. I’m talking about existing projects that I might want to use for whatever reason, which have been open source for a while and are being packaged by distros. For deploying a proprietary webapp, I agree that using Python libraries from the distro repository is probably a bad idea. That’s what I meant by this:

                      distro packages of Python libraries aren’t really for you. They’re only there as support for [applications that the distro is packaging].

                      Hopefully that clears things up.

                      1. 1

                        I think the number of deployments of things that fit your definition is much lower, relative to the number of language-package-manager-backed deployments, than you think it is.

                        1. 1

                          And I obviously disagree :P but really I think we’re both biased by our worlds. Of course someone who spends all day in a language ecosystem would think that’s how most people deploy software. Likewise, of course someone sysadmin-y who really leans into distros’ advantages would think there’s a lot of silent people out there who feel the same. We’re both just guessing, and neither of us are making particularly educated guesses.

          2. 1

            but the frustrations in OP exist with archlinux just as much as with debian. what you describe, a general reluctance to upgrade, is also typical in distros (though there’s a pretty significant long tail to how out of date various distros are), but IMO it’s not inherently connected to the build system and packaging model this post is talking about.

            1. 2

              I’m confused. Arch Linux and Debian do not share the same model of packaging Rust/Go/Javascript.

              We have mostly given up and leave library updates and security issues up to upstream, as that is what the tooling currently encourages.

              1. 1

                you’re right, i must have misremembered how e.g. python vs python-venv is packaged there. i do remember some early attempts at packaging rust applications in AUR that looked much more like Debian’s attempt.

          3. 6

            Using old.reddit.com because I personally prefer it. Also had to make up a title.

            As with my latest submissions, I keep missing tags. In this case something like package-management.

            1. 5

              As someone who’s been using dpkg since the mid ’90s, I’m quite attached to that model of software distribution. An expectation that I can install basically any software on my computer, and that the computer will keep it up to date without much intervention from me, is what makes using Mac and Windows machines frustrating and painful for me. I run Linux because I don’t want to have to sysadmin my personal computers. I’m not willing to give that up.

              I think the modern deb/dpkg/apt model carries enough information to solve the problem being described here.

              There has long been the concept of build-time dependencies: packages that need to be installed to build, but not to run, a particular package. These are either tools like compilers, or libraries that are linked at compile time (C++ header-only libraries, .a static libraries, Rust rlibs).
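
              As a hedged sketch, here is how that split shows up in a Debian source package’s debian/control; package names and versions are illustrative:

              ```
              Source: bar
              Build-Depends: debhelper-compat (= 13),
                             cargo,
                             librust-crossbeam-utils-dev (>= 0.8)

              Package: bar
              Architecture: any
              Depends: ${shlibs:Depends}, ${misc:Depends}
              ```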

              The reproducible builds effort has defined the .buildinfo file that describes the build environment, including which versions of which packages were used to build a particular binary package.
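
              An abbreviated, illustrative excerpt (the field names are real; the package versions are made up):

              ```
              Format: 1.0
              Source: bar
              Binary: bar
              Version: 2.1-1
              Build-Architecture: amd64
              Installed-Build-Depends:
               cargo (= 0.57.0-7),
               librust-crossbeam-utils-dev (= 0.8.8-1),
               rustc (= 1.59.0+dfsg1-1),
               ...
              ```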

              It probably doesn’t make sense, especially given the left-pad-oriented packaging culture of Rust, for every micro-version bump of a Rust package to trigger the rebuilding and reinstallation of every one of its transitive dependents, but a policy of what to rebuild, and when, can be built on the existing infrastructure and semantics.

              1. 2

                You can reasonably package only object libraries. If the API (the “header”) changed, all dependent objects have to be rebuilt. For Rust, this also means invalidation on compiler updates. With that out of the way, a dependency update should only rebuild the dependency once and then relink all depending libraries/applications. And today, linking a binary is freaking cheap.
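
                Roughly, with bare rustc (crate and file names made up), the library is compiled once into an rlib and the application links against it; under the policy described above, an implementation-only change to foo.rs would redo the first step and then only the final build of bar, rather than recompiling the whole dependency tree:

                ```
                # compile the library crate once, producing a reusable rlib
                $ rustc --crate-type=rlib --edition=2021 foo.rs -o libfoo.rlib

                # build the application, linking the prebuilt rlib into it
                $ rustc --edition=2021 main.rs --extern foo=libfoo.rlib -o bar
                ```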