Good article! I remember reading that paper a while ago.
My intuition is that this approach doesn’t have enough benefits over full ref-counting to be worth it. You still have to inc/dec the ref-count when passing non-owning references around. You don’t when passing the owning ref, but that’s a move operation, and full ref-counting doesn’t modify ref-counts on move either.
The paper says you can elide a bunch of retain/release calls, but the Lobster language showed how to do that with full ref-counting, and that technique was later adopted by languages like Nim.
I think the interesting thing about this approach is a correct program remains correct if you indiscriminately elide the reference counting (and the storage for the count) altogether, since changes to the ref count don’t affect the lifetime of the object. In that sense it is more like ASan/LSan than GC/borrow checking. I can imagine a “release-overconfident” build configuration that boils the two pointer types down to T* and unique_ptr<T>.
On Mastodon, a Chrome dev pointed out to me that their dangling pointer detection uses a similar(ish) scheme. They also pointed me to two design docs that hadn't been released publicly before, which I need to take the time to read and digest.
Because changes to the refcount don't affect the object's lifetime, the code needed to maintain the count also shrinks: releasing a reference is just a decrement, rather than a decrement plus a branch on zero. Beyond the benefit of reduced code size, though, it's not clear that's a meaningful improvement on a high-performance core, given that the branch is probably predictable.
Thanks for posting, I’m interested in this history to inform my own experiments using ref counting.
On reflection, I don’t see myself using this system. The two kinds of references (owning and non-owning) remind me of strong and weak references in other ref-counting and garbage-collecting schemes. I wouldn’t want to use “non-owning” or “weak reference” semantics except in those use cases where you have a semantic requirement for an object to go away while someone is still referring to it. If you are forced to use weak references outside of those legitimate use cases, which are relatively rare, I would expect this to be “actually quite a pain to work with” due to the reference failing, or even your program being aborted, for reasons that don’t arise organically from the program semantics. Other ref-counting and GC systems with strong and weak references allow multiple strong references to an object.
To put it another way, in my language I’m interested in using multiple-ownership semantics for objects that represent immutable, eternal values, and single-ownership for objects that represent resources, which are mutable entities that exist in time and have a beginning and end.
Interesting. How does this compare to ARC in Objective-C/Swift?
Regular refcounting will keep the object alive until it has no references. This scheme performs a very similar operation, but it is only a check: instead of extending the lifetime, it will crash the program.
I can think of a few scenarios where this is useful:
This can protect objects that aren't heap allocated: for example, wrap a stack pointer in the checker.
Where the ending of an object's lifetime is important: extending the lifetime would cause bugs, so it is better to crash.
For use as a form of sanitizer, since code that is correct under this scheme (and doesn't crash) remains correct even with the checking disabled.
I think the implementation is basically identical to regular reference counting such as the following Rust:
let owning = std::sync::Arc::new(some_value);
// Some users which can share as needed.
do_foo(owning.clone());
do_bar(owning.clone());
do_baz(owning.clone());
assert_eq!(std::sync::Arc::strong_count(&owning), 1); // Check we are the only remaining reference.
std::mem::drop(owning); // Destruct object.
But again, the intent is different.
ARC is always correct, but this leaves prevention of use-after-free to the programmer. However, UAF with this is safer than with plain pointers, because it aborts instead of creating an exploitable gadget.
ARC elides refcount operations opportunistically. This scheme doesn't do that optimization, but it does make each refcount operation a little cheaper. In theory it could also elide some refcount changes, similarly to ARC.