1. 2
  1.  

  2. 4

    The C++ standards committee regards shared libraries as out of scope for the language abstract machine. It’s probably too late to fix that, but it’s led to some very interesting things. For example, what happens if you have a thread_local variable with a non-trivial constructor and destructor in a shared library that is dynamically loaded and unloaded? The only answer for construction that is possible to implement is that the thread-local variables must be lazily initialised on first use. It would be nicer if all threads could be interrupted and their TLS initialised, but that would restrict constructors to using signal-safe functions, which is probably a bad idea.
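
    As a minimal sketch of the lazy-initialisation behaviour described above: the names `Tracked`, `per_thread` and `run_lazy_init_demo` are made up for illustration. It uses a block-scope thread_local, where the standard guarantees construction the first time a thread reaches the declaration. For namespace-scope thread_local variables the timing is implementation-defined, so this shows the observable effect rather than what any particular ABI does inside a shared library.

    ```cpp
    #include <atomic>
    #include <cassert>
    #include <thread>

    // Hypothetical counters used only to observe when objects are created
    // and destroyed.
    std::atomic<int> constructed{0};
    std::atomic<int> destroyed{0};

    // Stand-in for a TLS object with a non-trivial constructor/destructor.
    struct Tracked {
        int value = 0;
        Tracked()  { ++constructed; }
        ~Tracked() { ++destroyed; }
    };

    // Block-scope thread_local: each thread constructs its own copy the
    // first time it reaches this declaration and destroys it at thread exit.
    Tracked& per_thread() {
        thread_local Tracked instance;
        return instance;
    }

    struct Counts {
        int after_idle;       // constructions after a thread that never used it
        int after_use;        // constructions after a thread that did use it
        int destroyed_total;  // destructions once both threads have exited
    };

    Counts run_lazy_init_demo() {
        Counts c{};

        // This thread never touches the variable, so nothing is constructed.
        std::thread idle([] {});
        idle.join();
        c.after_idle = constructed.load();

        // This thread uses the variable twice; it is constructed only once.
        std::thread user([] {
            per_thread().value = 42;
            per_thread().value += 1;
        });
        user.join();
        c.after_use = constructed.load();
        c.destroyed_total = destroyed.load();
        return c;
    }
    ```

    Called from a thread that has not itself used `per_thread()`, the idle thread should leave the construction count untouched, and the second thread should account for exactly one construction and one destruction.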

    Destruction is more complicated. If you unload a shared library, what happens to all of its TLS variables and when do their destructors get run? One answer is that the library is locked and isn’t allowed to be unloaded until all threads that have accessed thread-local variables from that library have gone away. This is problematic because there’s no way in C++ of explicitly indicating that you won’t use a thread_local variable again, so this really means that you can’t unload a library until every thread that has ever called into it has exited, which is roughly equivalent to saying that you can’t unload libraries[1]. The other is that you track which library owns each destructor function and silently don’t run the destructors that belong to an unloaded library. This means that unloading a library can be a resource leak (imagine, for example, a TLS variable that’s a smart pointer holding a refcount for something big in the main program).

    Part of this problem is that both Windows and UNIX conflate the idea of a shared library that exists for separate compilation but which is always loaded at process creation time (the separate-compilation and code-sharing use case) with a shared library that is dynamically loadable (the ‘plugin’ use case). If these were fully separate then we at least wouldn’t pay the penalties of an ABI designed to allow the second case when we only need the first. We could just have a .tinit_array / .tfini_array section for thread-local constructors and destructors in non-plugin code and run them all on thread creation / destruction. This would let you do things like count executing threads with an atomic counter and a thread-local object that increments the counter in its constructor and decrements it in its destructor.
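
    To make the thread-counting idea concrete, here is a rough sketch of what it looks like with today’s lazy initialisation. The names `ThreadPresence`, `register_current_thread` and `run_thread_count_demo` are hypothetical. Because nothing runs the constructor at thread creation, each thread has to opt in by touching the object, which is exactly the step that an eager .tinit_array-style mechanism would make unnecessary.

    ```cpp
    #include <atomic>
    #include <cassert>
    #include <thread>

    // Number of threads that currently hold a live ThreadPresence object.
    std::atomic<int> live_threads{0};

    // Increments the counter on construction, decrements on destruction.
    struct ThreadPresence {
        ThreadPresence()  { live_threads.fetch_add(1); }
        ~ThreadPresence() { live_threads.fetch_sub(1); }
    };

    // With lazy initialisation a thread is only counted once it calls this;
    // the object is destroyed (and the count dropped) when the thread exits.
    void register_current_thread() {
        thread_local ThreadPresence presence;
        (void)presence;  // the declaration alone triggers construction
    }

    struct Observed {
        int inside_uncounted;  // count seen by a thread that never registered
        int inside_counted;    // count seen by a thread after registering
        int after_join;        // count after both threads have exited
    };

    Observed run_thread_count_demo() {
        Observed o{};

        // Never registers, so it is invisible to the counter.
        std::thread uncounted([&] { o.inside_uncounted = live_threads.load(); });
        uncounted.join();

        // Registers first, so it sees itself in the count.
        std::thread counted([&] {
            register_current_thread();
            o.inside_counted = live_threads.load();
        });
        counted.join();

        o.after_join = live_threads.load();
        return o;
    }
    ```

    The first thread is missed entirely, which is the weakness of the opt-in approach: the count is only as accurate as the discipline of calling `register_current_thread()` in every thread.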

    [1] I’m actually okay with this. Library unloading is a common source of security vulnerabilities and I’ve never seen a production use of it that couldn’t have been better accomplished by some other mechanism.

    1. 1

      Windows and UNIX conflate the idea of a shared library that exists for separate compilation but which is always loaded at process creation time (the separate-compilation and code sharing use case) with a shared library that is dynamically loadable (the ‘plugin’ use).

      Isn’t part of the issue that a ‘plugin’ shared library can only depend on other ‘plugin’ shared libraries? If a plugin depended on a process-creation library, then unloading the plugin could not unload its dependency. But if all common/base libraries were compiled to support plugins, the ABI benefit goes away; or we could have two different versions of the C runtime, and give up the benefit of code sharing between them, etc.

      1. 2

        We already use different ABIs on *NIX platforms for accessing C thread-local storage, depending on whether the code is in the main program, in a library that is linked to the main program, or in a library that can be dynamically loaded. Unfortunately, because almost any library can be dynamically loaded, we end up compiling every library with the slow mode. The C++ ABI doesn’t differentiate between these cases and uses the same construction and destruction mechanism for everything, independently of what kind of object file it ends up in.
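
        For reference, GCC and Clang let you pick the access model per variable with the `tls_model` attribute (or globally with `-ftls-model=`). The sketch below is a compiler extension, not standard C++, and the variable and function names are made up. It only shows how the choice is expressed; which model is safe depends on how the object ends up being loaded, and forcing `initial-exec` in a library that is later loaded with dlopen may fail if the loader has no static TLS space left.

        ```cpp
        #include <cassert>
        #include <thread>

        // Forced to the initial-exec model (GCC/Clang extension): cheaper
        // access, intended for code that is present at program startup.
        thread_local int fast_counter
            __attribute__((tls_model("initial-exec"))) = 0;

        // No attribute: the compiler picks a model from the build flags,
        // which for -fPIC library code is typically the slowest, most
        // general one.
        thread_local int default_counter = 0;

        // Both variables behave identically at the language level; only
        // the generated access sequence differs.
        int bump_both() {
            ++fast_counter;
            ++default_counter;
            return fast_counter + default_counter;
        }

        // Each thread gets its own zero-initialised copies, so a fresh
        // thread always sees 2 after a single call.
        int bump_in_new_thread() {
            int result = 0;
            std::thread t([&] { result = bump_both(); });
            t.join();
            return result;
        }
        ```

        The attribute changes nothing about construction or destruction, which is the point being made above: the C-level access path can be specialised, but the C++ constructor and destructor machinery is the same in every case.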