1. 20
  1. 4

    Weak refs are indeed weird, and harder to implement than you’d naively expect.

    The Obj-C approach is especially weird, but the people who work on that part of the codebase at Apple have a history of writing tight code, so I have no doubt they profiled various approaches. They have generally tried hard to avoid bloating the size of the root class, since this is bad for performance. The weak-ref table may be using some efficient lock-free algorithm.

    We implemented generational references, and found them to be at least 2.3x faster than reference counting!

    This confuses me, since generational refs are not a replacement for ref-counting or other GC; they don’t tell you when it’s OK to free an object. I think what this sentence means is that generational refs were faster than ref-count-based implementations of weak-refs like Swift’s. I can’t tell for sure because the link in that sentence is broken. :(

    1. 3

      Weak references are a relatively late addition to obj-c, so any implementation needed to be binary compatible with existing code. The side table approach avoids breaking existing layout, and has no cost in the common case of there not being any weak references.

    2. 3

      In CHICKEN, weak references are relatively simple and elegant. There are two types of objects which can refer to other objects weakly: Locatives (which are essentially safe pointers, registered in a central table) and “weak” pairs, which are specially marked pairs whose car is taken as a weak reference and can only be created internally from the symbol table (also registered in a central place).

      (For reference, CHICKEN’s GC is a copying collector) When we do a minor GC, we treat weak references as strong references for performance reasons, the target objects are kept alive even if there’s nothing else referencing them. When doing a major GC, initially skip over weak references so that if nothing else points to these objects we can erase the references to them. Then, we walk the symbol table(s) and the locative tables and do one of two things:

      • If the object pointed to by the weak pair’s car or the locative is a “forwarding object”, the object got moved (because something else points to it), and we simply chase the forwarding pointer and update the reference to the new location.
      • If it’s not a forwarding pointer, we simply clear the reference. For weak pairs, we take the pair out of the symbol table’s linked list and for locatives we use a placeholder “undefined” value that indicates the object has vanished.
      1. 1

        I get the impression this is one of the problems that’s generally easier to solve with tracing GC than with active reference counting - the Swift, ObjC, C++ shared_ptr/weak_ptr, and Rust Rc/Weak cases explained in the article are all examples of the latter.

        I should probably look into whether anyone has managed to build a Rust library/extension/feature/🤷🏻‍♂️ for non-universal tracing GC. Basically so you can opt into tracing GC for an arbitrary enclave within a Rust program. For some types of problem, straightforward borrowing semantics don’t map well, and refcounting certainly has its downsides - while also being fairly self-contained, so it doesn’t “infect” the rest of the program in the way that most runtimes with tracing GC do.

        1. 1

          It’s kind of funny that reference counting is considered to be the “easier” way to do GC, even though it’s incorrect and needlessly complicates stuff like this.

          1. 3

            Well, something like copying GC only really works in programming languages where object identity is not tied to memory address. Tracing GC is generally only “easy” in managed runtimes where the GC can make certain assumptions. Reference counting makes no such demands.

            1. 1

              Fair point!