1. 54
    1. 6

      The SystemV ABI for i386 makes this clear. If you’re going to return a struct, you have to use the address of the struct and place it on %eax, no way around it. Even if it fits into registers. This is apparently called an sret.

      Three cheers for LLVM’s “C calling convention” (not really C calling convention) calls, that need an absurd amount of fixup from every language frontend to implement things LLVM should damn well know to do on its own. sret/inalloca is only the start; wait till you learn how you have to pass structs under 16 bytes containing numeric fields on x86_64. So when the manual says “[The default calling convention] matches the target C calling convention”, you should be aware that this is an outright lie.

      (Remember: when in doubt, clang -c -save-temps -emit-llvm test.c && llvm-dis test.bc && less test.ll, no this is not a joke)

      1. 9

        This specific example is really fun because, as far as I know, Linux is the only surviving platform that uses that ABI. *BSD (including macOS) use the small-struct optimisation, where structures that fit in two registers are passed and returned in registers. This is particularly noticeable because the SysV ABI says that unions should be treated as structs for the ABI, so a union of a long and a void* on i386 Linux is returned by having the caller allocate 32 bits of stack space and pass its address as a hidden argument, then having the callee write the value there, then having the caller read the value back from the stack. On *BSD, the same code compiles to the callee putting the value in a register.
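
        A minimal sketch of the union case, written in C++ for illustration (LongOrPtr, make_long and make_ptr are made-up names). Feeding something like this to the clang command from the parent comment, with a 32-bit x86 target selected, should show whether the front end chose a pointer-to-result parameter or a register return. The exact IR depends on the target and compiler version, so treat this as something to experiment with rather than a statement of what you will see.

        ```cpp
        #include <cassert>

        // Hypothetical example type: a union of a long and a pointer.
        // On i386 both members are 32 bits wide.
        union LongOrPtr {
            long as_long;
            void* as_ptr;
        };

        // Returns the union by value. Whether the value travels back in a
        // register or through caller-allocated memory is decided by the
        // platform ABI, not by anything written in this source.
        LongOrPtr make_long(long v) {
            LongOrPtr u;
            u.as_long = v;
            return u;
        }

        LongOrPtr make_ptr(void* p) {
            LongOrPtr u;
            u.as_ptr = p;
            return u;
        }
        ```

        The source is identical on every platform; only the lowering differs, which is why the front end has to know the rule for each target.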

        How do you represent a 2-pointer struct return in registers in LLVM for i386? You return an i64. This then causes the mid-level optimisers to do a bad job because the callee has to extract two pointers from an integer and so any provenance-based alias analysis has to assume that (because they went through an integer) they can alias anything.
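
        A source-level analogy for that lowering, as a rough sketch. PtrPair, PackedPair, pack and unpack are hypothetical names, and two uintptr_t fields stand in for the two halves of the i64 so that the example also works on 64-bit hosts. It only illustrates where the pointer-to-integer and integer-to-pointer conversions appear; it is not what any front end literally emits.

        ```cpp
        #include <cassert>
        #include <cstdint>

        // What the programmer wrote: two pointers returned together.
        struct PtrPair {
            int* first;
            int* second;
        };

        // Stand-in for the lowered form: the same two values carried as plain
        // integers (on i386 these two 32-bit halves would be a single i64).
        struct PackedPair {
            std::uintptr_t lo;
            std::uintptr_t hi;
        };

        PackedPair pack(PtrPair p) {
            // Pointer-to-integer conversion: the link to the original object
            // is no longer visible in the type.
            return { reinterpret_cast<std::uintptr_t>(p.first),
                     reinterpret_cast<std::uintptr_t>(p.second) };
        }

        PtrPair unpack(PackedPair p) {
            // Integer-to-pointer conversion: the values round-trip, but an
            // optimiser looking only at this function sees pointers that were
            // made from arbitrary integers.
            return { reinterpret_cast<int*>(p.lo),
                     reinterpret_cast<int*>(p.hi) };
        }
        ```

        The round trip preserves the addresses, so the program is correct; the cost is only in what the optimiser is allowed to assume about the results.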

        This has been my least-favourite part of LLVM for over 10 years. There is an implicit (and, often, undocumented) contract between front ends and back ends for the ABI. The best proposal that I’ve seen for fixing it was to use function attributes to explicitly describe whether things live in registers or on the stack, so the IR for any 32-bit or 64-bit platform is exactly the same aside from the layout. This is easier for new front ends because the psABI is always documented in terms of C types and registers / stack slots. If you want to interop with C code, then the thing that matters to you is that you put your values in the places where C expects values of a specific C type to live, not values of a specific LLVM IR type. LLVM IR types are strictly less expressive than C types (no signed vs unsigned, no _Complex, and so on), so you need to communicate the additional information somehow.

        Once that machinery exists, LLVM could add a builder that encodes each psABI, so that you can provide C types and get a matching LLVM IR function. Most of this exists in some form in clang already, but could be made a lot more generic. It would also mean that things like Erlang and Haskell that have language-specific calling conventions wouldn’t need back-end modifications to support a new target, they could just add something to their own register choice.

        If a function is inlined, then all of the register / stack info is silently discarded and no code is needed for mangling things into the back end’s IR representation.

        1. 3

          Hey, could be worse. Could split structs across register and memory…

          I think I’m running into issues with sum types because LLVM needs an alloca for every union conversion, and that gets really annoying when almost every function call incurs a union access. (Error propagation.) By what you’re saying, that’d also hurt me in optimization, for the same reason. Maybe it’d help there to have a way to say “yes, I’m loading a T pointer from a <byte>, but trust me that it was created by storing through a pointer with the same type and never read with any other type”?

          1. 4

            You can do this with TBAA, but that doesn’t help with provenance-based AA. Anything coming from an inttoptr needs to assume that it can alias any other pointer. The user might have done some arbitrary pointer arithmetic to materialise the value. You can use an inbounds GEP for arithmetic and keep things pointer-typed to tell the compiler that it’s UB if the pointer points to a different object at the end. LLVM doesn’t use pointee type in alias analysis (and pointee types are now finally gone from the IR, hurray!), so you will get TBAA only if you explicitly provide TBAA metadata.

            Passing pointers through memory hurts AA, but SROA should promote most of those allocas to SSA registers and then the problems go away, unless you’re casting (bitcast or inttoptr) via an integer type.

            1. 2

              Ah, I understand. Thanks.

    2. 1

      Yeah… Serenity is fundamentally a C++ project and it shows. The C standard library itself uses C++’s runtime type information in order to link properly. It’s unfortunate, but it works, so it’s fine.

      I would love to hear more details about this. I’ve never seen RTTI used in linking before!

      1. 2

        I probably misspoke there and meant “needs C++ RTTI symbols in order to link”.

        1. 1

          I’m still not sure what this means. What type_info structures does it need to exist in linked code?

          1. 2

            I had to dig it up, but here’s the exact original error:

            ld.lld: error: undefined symbol: vtable for __cxxabiv1::__class_type_info
            >>> referenced by spawn.cpp
            >>>               spawn.cpp.o:(typeinfo for AK::Function<int ()>::CallableWrapperBase) in archive /bitplane/Serenity/Build/i686/Root/usr/lib/libc.a
            >>> the vtable symbol may be undefined because the class is missing its key function (see https://lld.llvm.org/missingkeyfunction)
            
            ld.lld: error: undefined symbol: vtable for __cxxabiv1::__si_class_type_info
            >>> referenced by spawn.cpp
            >>>               spawn.cpp.o:(typeinfo for AK::Function<int ()>::CallableWrapper<posix_spawn_file_actions_addchdir::'lambda'()>) in archive /bitplane/Serenity/Build/i686/Root/usr/lib/libc.a
            >>> referenced by spawn.cpp
            >>>               spawn.cpp.o:(typeinfo for AK::Function<int ()>::CallableWrapper<posix_spawn_file_actions_addfchdir::'lambda'()>) in archive /bitplane/Serenity/Build/i686/Root/usr/lib/libc.a
            >>> referenced by spawn.cpp
            >>>               spawn.cpp.o:(typeinfo for AK::Function<int ()>::CallableWrapper<posix_spawn_file_actions_addclose::'lambda'()>) in archive /bitplane/Serenity/Build/i686/Root/usr/lib/libc.a
            >>> referenced 2 more times
            >>> the vtable symbol may be undefined because the class is missing its key function (see https://lld.llvm.org/missingkeyfunction)
            
            1. 2

              Okay, so it looks like their libc needs to be linked to a C++ runtime library (libsupc++, libcxxrt, libc++abi)? If you’re static linking, you need to add this explicitly because *NIX static libraries aren’t really libraries, they’re just archives of .o files. That doesn’t mean that it requires RTTI for linking, it just means that it depends on a C++ runtime. I’m a bit surprised that they enable RTTI in libc, I would generally expect libc code to be compiled with -fno-rtti -fno-exceptions, but it is useful to have C++ thread-safe statics in libc, so you do want at least the __cxa_guard_* functions from the C++ runtime.
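
              For anyone wondering where the guard functions come from, here is a small sketch of a thread-safe static. The names init_calls, expensive_init and cached_value are invented for the example. On targets that follow the Itanium C++ ABI, compilers typically wrap this kind of initialisation in calls to __cxa_guard_acquire and __cxa_guard_release, which is why the runtime has to be on the link line.

              ```cpp
              #include <cassert>

              // Hypothetical counter to observe how often the initialiser runs.
              int init_calls = 0;

              int expensive_init() {
                  ++init_calls;
                  return 42;
              }

              // Function-local static: since C++11 the initialiser is guaranteed
              // to run exactly once, even if several threads call this at the
              // same time. The initialiser is not a constant expression, so the
              // compiler has to guard it at run time.
              int& cached_value() {
                  static int value = expensive_init();
                  return value;
              }
              ```

              Calling cached_value() repeatedly runs expensive_init() only on the first call.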

              1. 1

                It’s pretty all in on C++ afaict, lambdas and everything. Not that those require RTTI (I don’t think), but I wouldn’t be surprised by an internal use of exceptions or RTTI.

                1. 2

                  SerenityOS does not use exceptions, but it does make use of RTTI via its use of AK::Function (similar to std::function) within LibC.

                  1. 2

                    What does that use RTTI for? Most implementations of std::function (all of the ones that I’ve read, but I haven’t read all of them) work fine without RTTI. They use a templated constructor that wraps the statically typed lambda in a class with a virtual invoke function that calls the lambda’s call operator, and they either embed the lambda (via a move or copy constructor) in the object or put it in a separate heap allocation.
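
                    A stripped-down sketch of that pattern, assuming a fixed int() signature and always using a heap allocation. IntFunction, CallableBase, CallableWrapper and call_twice are hypothetical names; this is not AK::Function or any real std::function implementation.

                    ```cpp
                    #include <cassert>
                    #include <memory>
                    #include <utility>

                    // Hypothetical minimal wrapper for callables of type int().
                    class IntFunction {
                        // Type-erased interface: one virtual call, no typeid and
                        // no dynamic_cast anywhere.
                        struct CallableBase {
                            virtual ~CallableBase() = default;
                            virtual int invoke() = 0;
                        };

                        // Wraps the statically typed callable F.
                        template <typename F>
                        struct CallableWrapper final : CallableBase {
                            explicit CallableWrapper(F f) : callable(std::move(f)) {}
                            int invoke() override { return callable(); }
                            F callable;
                        };

                        // Always heap-allocated in this sketch; real
                        // implementations often add a small inline buffer.
                        std::unique_ptr<CallableBase> impl;

                    public:
                        // Templated constructor: captures the concrete type at
                        // the call site and erases it behind CallableBase.
                        template <typename F>
                        IntFunction(F f)
                            : impl(std::make_unique<CallableWrapper<F>>(std::move(f))) {}

                        int operator()() { return impl->invoke(); }
                    };

                    // Small helper so the wrapper can be exercised.
                    int call_twice(IntFunction& fn) {
                        return fn() + fn();
                    }
                    ```

                    Nothing here queries type information at run time. With RTTI enabled, compilers following the Itanium ABI still emit type_info objects for polymorphic classes like these, which would be consistent with the typeinfo references in the linker error quoted earlier; building with -fno-rtti omits them.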

                    The only things that use RTTI in C++ are exceptions (which dynamically map the thrown type to one of the caught types), dynamic_cast and a dynamic typeid expression. If you don’t use exceptions, then that just leaves dynamic_cast and typeid.

                    Most modern C++ codebases avoid dynamic_cast because it’s very slow and you can get better and faster code with an explicit virtual cast method for the classes that actually need it. The only place where this is difficult is diamond inheritance (moving from one branch to another) and dynamic cast is very slow there (and it’s usually a bad idea).
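
                    A minimal sketch of the explicit virtual cast method, using an invented Shape / Circle / Square hierarchy. The as_circle hook and radius_or_negative helper are hypothetical names, and this covers only simple single inheritance, not the diamond case.

                    ```cpp
                    #include <cassert>

                    struct Circle;  // forward declaration for the cast hook

                    // Hypothetical base class.
                    struct Shape {
                        virtual ~Shape() = default;
                        // Explicit cast hook: returns nullptr unless overridden.
                        virtual Circle* as_circle() { return nullptr; }
                    };

                    struct Circle : Shape {
                        explicit Circle(double r) : radius(r) {}
                        Circle* as_circle() override { return this; }
                        double radius;
                    };

                    struct Square : Shape {};

                    // Returns the radius if the shape is a circle, otherwise -1.0.
                    // One virtual call replaces dynamic_cast<Circle*>(&s).
                    double radius_or_negative(Shape& s) {
                        if (Circle* c = s.as_circle()) {
                            return c->radius;
                        }
                        return -1.0;
                    }
                    ```

                    Only the classes that actually need downcasting pay for a hook, and the code compiles with RTTI disabled.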

                    There are also problems with typeid. It returns a std::type_info object, which has a name method that returns a const char*. The contents of this string are implementation defined (though it must be unique), but the Itanium ABI specifies that it is the mangled type encoding. This means that you end up with some very large strings embedded in binaries. You often see 20% of the total binary size of a C++ library made up of type info strings, which is the main reason that you’d want to disable them. Personally, I’d love to see an ABI that replaced them with 64-bit integers formed from a cryptographic hash of the mangled name and emitted a map from integer value to string in a separate section that could be stripped in release builds.
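
                    A library-level sketch of the 64-bit identifier idea, under loud assumptions: it uses FNV-1a, which is NOT a cryptographic hash, purely as a short stand-in, and it still reads the name through typeid, so it needs RTTI and does nothing about binary size. fnv1a64 and type_id64 are hypothetical names. The real proposal would need the compiler and ABI to emit the integers directly.

                    ```cpp
                    #include <cassert>
                    #include <cstdint>
                    #include <typeinfo>

                    // FNV-1a, 64-bit. Not cryptographic; a placeholder for the
                    // cryptographic hash suggested above.
                    std::uint64_t fnv1a64(const char* s) {
                        std::uint64_t h = 14695981039346656037ull;  // FNV offset basis
                        for (; *s != '\0'; ++s) {
                            h ^= static_cast<unsigned char>(*s);
                            h *= 1099511628211ull;                  // FNV prime
                        }
                        return h;
                    }

                    // Hypothetical helper: a 64-bit identifier derived from the
                    // implementation-defined name string of T.
                    template <typename T>
                    std::uint64_t type_id64() {
                        return fnv1a64(typeid(T).name());
                    }
                    ```

                    Comparing two such integers is a single instruction, whereas comparing names may need a string comparison; the name strings could then live in a strippable side table.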

              2. 1

                You’re right, I couldn’t explain myself very well.