1. 8

  2. 5

    Define your destructors outside the class declaration so they don’t get inlined by the compiler in both functions — the caller and the callee that returns the optional — to avoid binary size increase.

    As we recently found out, that’s a problem for Rust as well: https://github.com/rust-lang/rust/issues/88438

    1. 3

      Interesting. Both langauages have similar challenges.

      One huge advantage of Rust is that the compiler can statically determine a destructor is not necessary after moving a value. That helps with the problem I’ve described in the article.

      1. 1

        AFAIK the code to statically determine whether a destructor needs to be run after a move in Rust is:

        return false;
    2. 1

      Define your destructors outside the class declaration so they don’t get inlined by the compiler in both functions

      I don’t believe that trick will work when building with LTO, since then any function is eligible for inlining, whether or not it’s in a header.

      I generally build releases with both LTO and -Os; the latter tames excessive inlining.

      1. 2

        When performing the LTO, the compiler (or linker) has a global view of the code and can decide not inlining a destructor that’s called in many places. As you said, -Os helps with that.

      2. 1

        Like you said, this all leads back to the problem that move constructor implementations can vary in how expensive they are to call. Google protobuf performing a full copy if other arena id != this arena id is surprising to me, although it does comply with the C++ standard. It would be better if they reassigned ownership in their Arena allocator. Does anyone know of an easy way to instrument C++ move constructors to detect how expensive they are? If there is a solution with static analysis that would be even better. It would also be helpful if we could manually decorate a move constructor that is expensive in such a way that std::optional warns us if it calls the expensive move.

        A crude way to do this would be to define an interface that has a method bool moveIsExpensive() and log when std::optional is used with an expensive move constructor. And by making a wrapper myproj::optional:

        namespace myproj {
        // interface to describe whether move constructor is expensive
        class MoveCost {
        bool moveIsExpensive()=0;
        class Object : public MoveCost {
        Object(Object&& other) { ExpensiveCopy(other) }
        bool moveIsExpensive() { return true; }
        template <class T>
        class optional {
        std::optional<T> m_optional;
        optional(T o) : m_optional(o) {
          if (DEBUG) {
            auto expense = dynamic_cast<MoveCost>(&o);
            if (expense) {
              std::cerr << "warning at " << __file__ << ':': << __line__
                << ": myproj::optional calling an expensive move constructor." << std::endl;
        // delegate the rest to m_optional

        Then you could make a myproj::optional<Object> and get a warning at runtime.

        EDIT: As an aside - would it be better to prefer std::unique_ptr over std::optional for code that returns Google protobuf messages? That would only ever call a single std::move. Or are the ownerships semantics too restrictive for this use case?

        1. 2

          About your EDIT paragraph:

          I think that’s better for binary size, but I still think output parameters is the way to go because you should leave the decision of where to allocate the protobuf object to the caller.

          If the caller wants to use an Arena (like Google backends do), you make it impossible to use your function by allocating the object in the unique_ptr. Impossible without a copy which is undesirable and avoidable.

          What if the caller has that object from a previous iteration in a loop? Reusing protobuf objects reduces the number of memory allocations as google:: protobuf::ParseFromString reuses the allocated buffers when populating the message again.

          1. 2

            You could allow the allocator and/or deleter to be overloaded via template parameters of the function that returns the unique_ptr. But at that point the function signature would be pretty noisy. And the caller would have to provide said allocators and deleters. I agree that the output parameters is the way to go.

            1. 2

              Noisy and you would lose the ability to compile them separately — templates have to be declared and defined inline.