1. 19

  2. 3

    My idea is that if distributions can switch to the saner defaults: -fno-semantic-interposition -Wl,-Bsymbolic-functions, the performance of our shared object world will be no worse than mostly statically linked PIE (mostly statically => some libraries like libc/libm are likely dynamic). This requires some evangelism, because the concept can be tricky to understand even for senior packagers.
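
    To make the effect concrete, here is a minimal sketch (the library and function names are invented for illustration) of the kind of intra-library call these flags change:

      /* libgreet.c -- hypothetical example library.
       *
       * Default build:
       *   gcc -O2 -fPIC -shared libgreet.c -o libgreet.so
       * With the proposed defaults:
       *   gcc -O2 -fPIC -fno-semantic-interposition -shared \
       *       -Wl,-Bsymbolic-functions libgreet.c -o libgreet.so
       */

      int greet_len(const char *s)      /* exported helper */
      {
          int n = 0;
          while (s[n])
              n++;
          return n;
      }

      int greet_cost(const char *s)     /* exported, calls the helper */
      {
          /* With the default build this call goes through the PLT, because
           * another definition of greet_len may be interposed at run time.
           * With -fno-semantic-interposition (plus -Bsymbolic-functions at
           * link time) the compiler and linker may turn it into a direct
           * call or inline greet_len entirely. */
          return 2 * greet_len(s);
      }

    Nothing changes for code outside the library: greet_len and greet_cost stay exported, and external callers still reach them through the PLT as before.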

    1. 2

      saner defaults: -fno-semantic-interposition -Wl,-Bsymbolic-functions

      What would be the trade-offs? What would break if those were the new defaults?

      1. 4

        It breaks LD_PRELOAD and it breaks having a program provide symbols that replace the library ones. Both of these are useful.
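
        A rough sketch of what that second point means today (everything here is hypothetical: assume a libgreet.so that exports greet_len() and greet_cost(), where greet_cost() calls greet_len()):

          /* main.c -- program that supplies its own copy of a library function.
           * Build:  gcc -O2 main.c -o main -L. -lgreet -Wl,-rpath,.
           */
          #include <stdio.h>

          int greet_cost(const char *s);   /* defined in libgreet.so */

          /* Because libgreet.so also has a greet_len symbol, the linker
           * exports this definition from the executable, and ELF
           * interposition makes it win: with the traditional defaults the
           * library's greet_cost() calls *this* function.  If the library is
           * built with -Bsymbolic-functions / -fno-semantic-interposition,
           * it keeps calling its own greet_len, and this override is only
           * seen by calls made from the executable itself. */
          int greet_len(const char *s)
          {
              (void)s;
              return 100;
          }

          int main(void)
          {
              /* Prints 200 with the traditional defaults, 4 (2 * strlen("hi"))
               * with the proposed ones. */
              printf("%d\n", greet_cost("hi"));
              return 0;
          }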

    2. 3

      If I am reading this correctly, it sounds like the difference is this:

      • By default, if you compile a shared object with GCC, then every function call to a function in the shared object goes through the PLT (essentially a vtable). So if you have functions foo() and bar() in the shared object, and foo() calls bar(), the call gets turned into PLT->bar(). This means slower calls and no inlining, but it means that a program can hook a new function into bar() using LD_PRELOAD and it gets replaced for every call of bar().
      • With this change, if you have foo() and bar() in the same shared object, then foo() calling bar() may be a direct call, or may get inlined away. However, if bar() is a public symbol then it still exists in the binary, so programs outside the shared object can still call PLT->bar() the same way they always did. The difference is that if you replace bar() using LD_PRELOAD, then foo() still calls the old version of bar() internally (a sketch of such a preload hook follows this list).
      • Then there’s a whole bunch of complicated interactions with other ELF and linker options which change some cases from one style to the other.
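
      As a sketch of that LD_PRELOAD side (bar(), the app, and the file names are hypothetical):

        /* hook.c -- LD_PRELOAD shim for the bar() described above.
         * Build:  gcc -O2 -fPIC -shared hook.c -o hook.so -ldl
         * Run:    LD_PRELOAD=./hook.so ./app
         */
        #define _GNU_SOURCE
        #include <dlfcn.h>
        #include <stdio.h>

        int bar(void)
        {
            /* Forward to the next bar() in search order, i.e. the original
             * definition in the library being hooked. */
            int (*real_bar)(void) = (int (*)(void))dlsym(RTLD_NEXT, "bar");
            fprintf(stderr, "bar() interposed\n");
            return real_bar ? real_bar() : -1;
        }

      Calls that reach bar() through the PLT (from the executable or from other libraries) land in this shim either way; whether foo()’s internal call also lands here is exactly what the build flags above decide.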

      So it doesn’t make LD_PRELOAD stop existing or stop working, but does make it less powerful. The question is, if you hook your own function into bar(), is foo() calling it going to operate correctly anyway? If it has the same function type signature, I would hope the answer is “yes”, but who knows for real? Only someone with the source code to both.

      Replacing a shared object with a different one is pretty common and useful; that’s part of what shared objects are for. Replacing part of one, on the other hand, is a much less common and more fraught operation. Functions like malloc() manipulate global shared state; if you use LD_PRELOAD to replace malloc() in libc with your own implementation then you have to replace it everywhere. Other functions in libc are free to call it, and if they have a direct call rather than going through the PLT they will call the wrong version and corrupt your memory.

      This actually isn’t a dealbreaker: the solution is to replace the entire libc shared object with your version which has nothing changed except your custom malloc(). If you’re writing a custom malloc() anyway I don’t personally consider this too much of an inconvenience, though it is certainly an extra speedbump. It does mean you may or may not be able to patch a single function in a shared object you don’t have the source code for, though I might argue that if you’re in a situation where you have to do that you’ve already lost.

      All in all this seems like the kind of change that makes no difference to 99.9% of use cases and makes 0.1% of use cases horribly difficult. The question is, is breaking those 0.1% of use cases worth it? Having used LD_PRELOAD to solve real problems before, I think my call would still be “probably yes”.

      1. 1

        Functions like malloc() manipulate global shared state; if you use LD_PRELOAD to replace malloc() in libc with your own implementation then you have to replace it everywhere.

        malloc preloading is not affected by -fno-semantic-interposition or even -Wl,-Bsymbolic.

        -fno-semantic-interposition and -Wl,-Bsymbolic do not affect undefined symbols. -fno-semantic-interposition is per-TU (a compile-time option, applied to each translation unit), while -Wl,-Bsymbolic is per-DSO (a link-time option, applied to the whole shared object).
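
        A minimal sketch of why (illustrative only): the preloaded object supplies the definition, and every other object's reference to malloc is an undefined symbol that the dynamic linker resolves in search order.

          /* mymalloc.c -- minimal malloc preload sketch.
           * Build:  gcc -O2 -fPIC -shared mymalloc.c -o mymalloc.so -ldl
           * Run:    LD_PRELOAD=./mymalloc.so ./app
           */
          #define _GNU_SOURCE
          #include <dlfcn.h>
          #include <stddef.h>

          void *malloc(size_t size)
          {
              /* The app and its libraries import malloc as an undefined
               * symbol, so -fno-semantic-interposition / -Bsymbolic in their
               * builds do not change how it resolves; the preloaded
               * definition still wins at load time.  (A production
               * interposer also has to guard against recursion through
               * dlsym/dlerror, omitted here.) */
              static void *(*real_malloc)(size_t);
              if (!real_malloc)
                  real_malloc = (void *(*)(size_t))dlsym(RTLD_NEXT, "malloc");
              return real_malloc ? real_malloc(size) : NULL;
          }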

      2. 3

        I understand the basic premise: virtual method calls for all non-static functions in all shared libraries hurt performance. But wow, I do not have enough background to understand the other 95% of this article explaining the trade-offs.

        This is why I like that Rust and Go use static linking. I appreciate the plight of distro maintainers needing to patch libraries and wanting to avoid rebuilding the world. But I honestly would rather download and reinstall my entire base system every week than care about dynamic linking. Distro maintainers already build packaged software themselves anyway, so it’s hardly a question of upstream developers being responsible enough to update their builds.

        This ship has already sailed for the cloud native crowd with docker and AMIs and immutable infrastructure. Same for MacOS—multi-gigabyte system updates are the norm. Maybe I’m just privileged with my gigabit internet, but the dynamic library dance seems like such a waste for such a small benefit.