1. 34

  2. 10

    I don’t have a great deal of sympathy with this.

    FreeBSD’s run-time linker has supported GNU-style hashes since 2013. I think glibc has supported it for at least 2-3 years longer. Almost anything linked since then will have used --hash-style=both and so not notice. Anyone who has created a binary since then that this doesn’t work with has explicitly chosen to opt out of faster load times.

    If you want a compat version for such programs, then you can always build glibc with --hash-style=both. You could easily create a container base layer for running such programs.

    1. 12

      To clarify: when you say you don’t have a great deal of sympathy “with this”, do you mean the situation on Linux? Or with this article? Because if the latter, the entire point here is that WINE has provided an unintentionally stable ABI despite Linux distros not reliably doing or supporting those things, and if the former, his entire point is that Linux isn’t doing things it could do to avoid this in the way that it sounds like FreeBSD did.

      1. 8

        To clarify: when you say you don’t have a great deal of sympathy “with this”, do you mean the situation on Linux?

        Specifically with the complaints about removing DT_HASH. DT_HASH is required by ELF, but the ELF specification is from 1988. To put that in Windows perspective, that’s back when Windows 2.1 was the latest release.

        The shift from DT_HASH to DT_GNU_HASH has been very gradual. They were introduced over 10 years ago, with system libraries since then being linked with --hash-style=both, which puts both DT_HASH and DT_GNU_HASH in the resulting ELF files. This is not ideal because:

        • DT_HASH is quite a bit larger than DT_GNU_HASH and so you’re carrying around a fairly large amount of metadata for legacy compat.
        • DT_HASH is a lot slower than DT_GNU_HASH and so anything that actually tries to use it will launch more slowly.
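
        For concreteness, here’s a sketch of the two string hashes involved (reconstructed from the well-known implementations; most of DT_GNU_HASH’s speed advantage actually comes from its table layout and Bloom filter rather than the hash function itself):

        ```c
        #include <stdint.h>
        #include <stdio.h>

        /* Historic SysV ELF hash used for DT_HASH (from the 1988-era gABI). */
        uint32_t elf_hash(const char *name)
        {
            uint32_t h = 0, g;
            while (*name) {
                h = (h << 4) + (unsigned char)*name++;
                g = h & 0xf0000000;
                if (g)
                    h ^= g >> 24;
                h &= ~g;
            }
            return h;
        }

        /* djb2-style hash used for DT_GNU_HASH: h = h*33 + c, seeded with 5381. */
        uint32_t gnu_hash(const char *name)
        {
            uint32_t h = 5381;
            while (*name)
                h = h * 33 + (unsigned char)*name++;
            return h;
        }

        int main(void)
        {
            printf("elf_hash(\"memcpy\") = %#x\n", elf_hash("memcpy"));
            printf("gnu_hash(\"memcpy\") = %#x\n", gnu_hash("memcpy"));
            return 0;
        }
        ```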

        Anything that has been linked with the default settings in the last 10 years will quite happily work with new versions of a shared library linked with --hash-style=gnu.

        For anything else, *NIX systems generally use chroot or similar for compat layers. On FreeBSD, the recommended way of doing this is with a jail (though there are also some special hooks in the kernel for old FreeBSD versions and foreign binaries that will make them search a different path in the FS before looking in the one that they ask for, so that you can provide old versions of shared libraries but still use newer data files). On Linux, it’s generally an OCI container (I wish FreeBSD would embrace this ecosystem more).

        You have a choice of either keeping the old versions of the libraries in here, or building new ones with --hash-style=both in LDFLAGS to populate the compat ecosystem.

        This article is focused entirely on glibc, which is a GNU project. I personally loathe, hate, and detest having to work with glibc, but it has an incredibly strong track record on backwards binary compatibility. Glibc was the project that first pushed symbol versioning, allowing it to introduce new (incompatible) versions of functions and still provide the same ABI to older things.

        On top of that, I find the framing of ‘on Linux’ annoying. Linux is a kernel. The Linux kernel has very strong ABI guarantees (though the worst KBI guarantees of any competing system). Binaries from Linux 1.0 will run on the Linux kernel today. The complaint is that specific distributions do not have a stable ABI. A distribution is a combination of bits of software that provides an entire environment. Android is the most popular Linux distribution and it does have a fairly stable ABI. Individual distributions provide different guarantees. Most of the software that sits on top is not specific to Linux and will happily run on other kernels.
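
        A small illustration of that distinction: the stable contract is the kernel’s system-call table, which a program can hit directly without any distro-supplied library in the way (a sketch; syscall numbers are per-architecture):

        ```c
        #include <stdio.h>
        #include <sys/syscall.h>
        #include <unistd.h>

        int main(void)
        {
            /* Bypass libc's wrapper and ask the kernel directly. The
               syscall numbers are the ABI the kernel keeps stable;
               getpid, for example, has been number 39 on x86-64 since
               that port was added. */
            long pid = syscall(SYS_getpid);
            printf("pid via raw syscall: %ld\n", pid);
            return pid > 0 ? 0 : 1;
        }
        ```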

        The specific complaint here is that certain distributions have chosen to optimise for size at the expense of binary compatibility with software that is >10 years old in their default linker flags.

        1. 3

          Specifically with the complaints about removing DT_HASH. DT_HASH is required by ELF, but the ELF specification is from 1988. To put that in Windows perspective, that’s back when Windows 2.1 was the latest release.

          As you mentioned in your previous post, DT_HASH was the only option until circa-2011. The Windows perspective would be dropping support for 32 bit, since by 2011 64 bit binaries were being created. That’s not going to be on the cards for a long time - Windows cares about 2008 binaries and (frankly) has the largest catalog and user base on Steam as a result.

          The specific complaint here is that certain distributions have chosen to optimise for size at the expense of binary compatibility with software that is >10 years old in their default linker flags.

          I don’t think that’s the issue. The issue is that a stock 2.36 glibc cannot load binaries that are a) more than 10 years old, b) linked with mold or another linker that only supports DT_HASH, or c) compiled on a distribution that specifies --with-linker-hash-style=sysv. The distributions forcing DT_GNU_HASH outside of glibc are a bit of a red herring - those binaries work before and after the glibc change.

          I suspect that final bucket is much larger than you’re suggesting, because anyone distributing binaries tends to intentionally use an older toolchain in order to support the install base of distributions. It seems logical in 2016 to use a 2012 distribution as the basis of binary distribution. Continuing the Windows analogy, seeing new 32 bit binaries in the mid to late 2010s wasn’t that strange.

          The strange thing here is this is obviously an ABI break, and people are lining up to deny that fact or suggest it doesn’t matter. Does glibc also have a policy on removing versioned exports from before 2012? Would it be functionally any different? I’m sure it’d reduce code size slightly, so if that’s the measurement of success, the tradeoff looks similar.

          1. 2

            Correction:

            • DT_GNU_HASH was added to glibc in 2006. My https://groups.google.com/g/generic-abi/c/9L03yrxXPBc/m/WKuUjZshAQAJ summarizes the status for many OSes.
            • The glibc 2.36 change was to drop DT_HASH for glibc’s own shared objects (e.g. libc.so.6, libpthread.so), not the support for loading user shared objects. The ability to interpret DT_HASH for user shared objects will likely never be dropped.

            If you don’t mind reading glibc threads, this reply from Carlos has a nice summary: https://sourceware.org/pipermail/libc-alpha/2022-August/141304.html (“Should we make DT_HASH dynamic section for glibc?”)

            Dropping DT_HASH for glibc’s own shared objects is certainly fine. “Easy Anti-Cheat” software is reading too much into what glibc’s own shared objects provide. It’s just another demonstration that all observable behaviors will be depended on by someone.
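
            As a sketch of the kind of check such software performs (my own illustration, not EAC’s code), here’s a small program that walks every loaded object and reports whether it advertises DT_HASH and/or DT_GNU_HASH; on glibc ≥ 2.36, libc.so.6 will show only the latter:

            ```c
            /* Illustration only: enumerate loaded objects and report which
               symbol-hash tables their dynamic sections advertise. */
            #define _GNU_SOURCE
            #include <elf.h>
            #include <link.h>
            #include <stdio.h>

            #ifndef DT_GNU_HASH
            #define DT_GNU_HASH 0x6ffffef5
            #endif

            static int report(struct dl_phdr_info *info, size_t size, void *data)
            {
                (void)size;
                int *count = data;
                for (int i = 0; i < info->dlpi_phnum; i++) {
                    if (info->dlpi_phdr[i].p_type != PT_DYNAMIC)
                        continue;
                    const ElfW(Dyn) *dyn = (const ElfW(Dyn) *)
                        (info->dlpi_addr + info->dlpi_phdr[i].p_vaddr);
                    int has_hash = 0, has_gnu = 0;
                    for (; dyn->d_tag != DT_NULL; dyn++) {
                        if (dyn->d_tag == DT_HASH)     has_hash = 1;
                        if (dyn->d_tag == DT_GNU_HASH) has_gnu = 1;
                    }
                    printf("%s: DT_HASH=%d DT_GNU_HASH=%d\n",
                           info->dlpi_name[0] ? info->dlpi_name : "(main)",
                           has_hash, has_gnu);
                    ++*count;
                }
                return 0; /* keep iterating */
            }

            int main(void)
            {
                int count = 0;
                dl_iterate_phdr(report, &count);
                return count > 0 ? 0 : 1;
            }
            ```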

            FWIW, a deeper issue is that there is no good forum for Linux/*BSD collaboration on certain ELF features. It’s clear that Solaris won’t like certain features. For now, when a sufficiently valuable feature arises but has no chance of entering the generic ABI, I’ll keep notifying generic-abi but will probably also loop in binutils / some BSD toolchain folks.

        2. 4

          That, and it sounds like DT_GNU_HASH is obscure due to being a small, not very well documented piece of ELF.

        3. 3

          Anyone who has created a binary since then that this doesn’t work with has explicitly chosen to opt out of faster load times.

          And anyone who has a binary created before then is out of luck. Backwards compatibility is a great property for a platform to have.

          Raymond Chen has written extensively on this approach at Microsoft. E.g. https://devblogs.microsoft.com/oldnewthing/20031224-00/?p=41363.

          To put it mildly, I’m not a fan of Microsoft or most of their products. But this is one area they get right (or, at least, did back in the 2000s when I was a Windows dev).

          1. 4

            My objection is that this is a very niche use case. It’s really only a problem for things that dynamically link libc, and don’t depend on any other libraries that don’t also have multi-decade backwards compatibility guarantees. This is an incredibly tiny subset of all programs and not worth increasing disk space and download sizes for everyone else to handle. It’s also a case that can be trivially worked around by using a container: the Linux kernel has very strong ABI guarantees and if you build a container image from a snapshot of a mid-90s Linux distribution then it will still happily work on a modern Linux system.

            I work at Microsoft and I’ve had a lot of conversations with the Windows team in recent years. Their backwards-compatibility guarantees come with a lot of downsides. For example, we can’t deploy kernel security features such as SMAP because third-party drivers (especially antivirus kernel-mode drivers) go and poke at userspace memory without any kind of copy-in / copy-out discipline.

          2. 2

            I’m not sure you understand the problem. It’s no problem to have a DT_GNU_HASH section in your ELF; the missing DT_HASH section is the problem. So if you write a program that is, for any reason[0], interested in the symbols of an ELF, you might expect this section to be present, because the spec says you have one. So yes, I could build my whole distro with --hash-style=both again. But then what is the point of the gABI defining requirements?

            What makes this even better is that currently most compilers decide to set --hash-style=gnu by default. I don’t think the compiler driver is the right place for that change; a better place would be the linker’s own default, or the LDFLAGS of the package-building process.

            Last thing: DT_GNU_HASH has (according to the article) no spec, so currently I’m required to look at the implementation to make use of it[1][2]. Is this the way we want to specify an ABI?
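
            For reference, the table layout you end up reconstructing from those implementations looks roughly like this (a sketch; field meanings taken from the binutils/glibc sources, since there is no formal spec):

            ```c
            #include <stdint.h>
            #include <stdio.h>

            /* DT_GNU_HASH points at this header; everything after it is
               implied by the header fields rather than described by any
               specification. */
            typedef struct {
                uint32_t nbuckets;    /* number of hash buckets */
                uint32_t symoffset;   /* dynsym index of the first hashed symbol */
                uint32_t bloom_size;  /* number of Bloom-filter words (a power of two) */
                uint32_t bloom_shift; /* shift for the second Bloom hash: h >> shift */
                /* Followed in the file by:
                     ElfW(Addr) bloom[bloom_size];  Bloom filter words
                     uint32_t   buckets[nbuckets];  first symbol index per bucket
                     uint32_t   chain[];            one word per hashed symbol;
                                                    the low bit marks the end
                                                    of a bucket's chain */
            } gnu_hash_header;

            int main(void)
            {
                printf("header size: %zu bytes\n", sizeof(gnu_hash_header));
                return 0;
            }
            ```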

            So do I have sympathy for the users of the games that don’t work anymore because the vendor requires some bullshit software that doesn’t work anymore? Only to some extent: yes, they just want to use their software, but they also normalize the use of this kind of software. But just because I don’t like the messenger doesn’t mean it’s not a problem.

            [0] In this case an anti-cheat program, which I believe doesn’t understand that its check for correct system libraries doesn’t work from a dependent binary.

            [1] You might want to ask the PE implementations how well this works ;-)

            [2] And what does this imply about the license of the code?

          3. 4

            I think this whole situation shows why creating native games for Linux is challenging.

            Just release the code and only lock down the assets. Release the assets a couple of years later. Congratulations, your game will most probably live forever.

            1. 5

              Okay but how does that solve anything? Your players aren’t going to download the source and build it themselves; they’re going to get it through Steam, which is a binary distribution platform.

              1. 3

                Steam also takes on the compatibility problem. If you build for Steam, they promise their own compatibility guarantees. I have no idea what they are in relation to DT_HASH, but I’m sure they have one.

                1. 4

                  I doubt they need one. For DT_HASH to be a problem, you need:

                  • To have not relinked your program in the last 10 years, or to have relinked it explicitly with non-default linker flags, and
                  • To be dynamically linking to a library that has switched from both to gnu.

                  In general, I’d expect a platform like Steam to ship things as something that looks a bit like container images, even if they’re not actually containers: they depend on the system call interface and nothing else, and any shared libraries are shipped along with them. If you ship a binary that either statically links libc or uses -rpath and bundles its own dynamically linked libc, then you don’t have these problems.

                  It’s worth noting that Xbox games are distributed as VM images with specific versions of Windows libraries to avoid cases where a Windows library uses 1 MiB more RAM and causes the game to start swapping. A Linux system could very easily ship games as OCI containers and get similar guarantees.

                  1. 1

                    Agreed. I know Steam ships its own libc libraries, but I don’t know any more than that; I’ve never looked into it. Awesome how Xbox does it! That’s really interesting, thanks for the info!

                2. 3

                  Also, the main example in the article is an anti-cheat system, which almost by definition has to be distributed as an opaque binary and not source code.

                  1. 2

                    Steam has done a great job at getting games onto Linux, but the solutions it provides do not work for games bought outside their store.

                    If I want to play a game from itch.io on windows it’s easy. On Linux it’s almost always a faff.

                    This is an issue because steam is not a good marketplace for all games and because monopolies are bad.

                  2. 1

                    Releasing EAC source code is literally (!!!) the worst possible solution, because the entire point of the anti-cheat system is to obfuscate the source code to prevent players from secretly modifying their client to cheat.