1. 22

  2. 4

    I have yet to see a GC’d language written in rust that doesn’t cop out and use Rc .

    1. 3

      What’s wrong with using Rust’s Rc standard library feature to implement garbage collection? You gotta write the garbage collector in some non-garbage-collected language after all, and while I’m not an expert in GC implementation, I would think that any GC is going to need some kind of reference counting semantics. So why not use the one that’s built into the Rust standard library?

      1. 2

        What’s wrong with using Rust’s Rc standard library feature to implement garbage collection?

        Most reference counting implementations are slower than a moving, tracing GC alternative for a couple of reasons: they have to increment/decrement the count on every allocation/deallocation; and, with Rc in particular, they require two extra machine words (for the strong and weak count), which results in much poorer cache behaviour.

        The other traditional problem with reference counting is that it can’t deal with cycles without the user annotating weak pointers. It’s easy to get this wrong, so memory leaks can happen if you’re not careful.

        I would think that any GC is going to need some kind of reference counting semantics

        A tracing GC (like mark-sweep; or stop-and-copy) doesn’t need a reference count. They trace from a root-set (usually on the stack or in registers) during collection looking for live objects. Unreachable objects can then be considered dead and thus their memory is freed. (See https://en.wikipedia.org/wiki/Tracing_garbage_collection)

      2. 1

        To be fair, CPython uses reference counting (with a tracing GC just to collect cycles).

        1. 2

          I think the point is, that it’s really hard to write proper GC in Rust.

          1. 2

            No harder than in C, if your willing to use raw pointers. There are a couple of Rust GCs that you could probably use off the shelf, though.

            1. 1

              Why is that?

              1. 2

                Existing approaches to GC in Rust which don’t have language support suffer from at least one of two problems: i) their API makes them too impractical to use; ii) their performance is worse than Rc. In some cases, it’s a bit of both :).

                At this stage, I think fully tracing GC in Rust with acceptable performance for writing language interpreters would require some form of Rust language support. This has its own interesting challenges. You’ll encounter fairly non-language-specific problems when trying to retro-fit GC support: how to identify roots and interior pointers; whether to use a separate heap for GC objects; how the API will look; etc.

                But what makes Rust difficult in particular are its additional borrow rules and safety guarantees. How safe should the GC be? I’m guessing the answer is “well if I use safe Rust code, I don’t expect to segfault”. Then the GC needs to be deal with things like this:

                let s = S{}; // Assume S is some GC object
                let s_obfuscated_ptr = &s as *const S as *const usize;

                Casting a raw pointer (to a potential GC item) to some other type can be written in safe Rust code. Dereferencing it cannot. There’s a problem here though, we’ve potentially hidden a GC root from our collector, and this could introduce unsoundness.

                Another problem is the borrow rules. Multiple mutable references in Rust is UB. No exceptions. So you would still need something like RefCell inside your GC managed container to enforce those rules at runtime. This does some ref counting under the hood so you’d still, in some sense, have to deal with the reference count penalty. It’s important that this invariant in Rust isn’t broken - at present, I’m not aware of any optimisations rustc performs based on the mutability of references - but in the future, who knows? Breaking this invariant could lead to unsound behaviour based on assumptions made by the optimiser later on down the line.

                The Deref trait also adds complexity because a GC needs to maintain the frozen reference property: any active reference &'a T to an object, must have a constant address for the whole of its lifetime a. If a collection takes place while an object is being dereferenced, the collector can’t move the object without breaking the FRP. From what I can remember, a lot of rustc’s safety relies on this invariant, and a GC would need to ensure that it pins objects across a deref to ensure soundness.

                There are lots of other creases that would need ironing out for a sound Rust GC API. There’s a few resources on this from back when some of the core team were actively working on it but I think the work has now stagnated. This blog is a good start if you wanted to find out more: https://manishearth.github.io/blog/2016/08/18/gc-support-in-rust-api-design/

            2. 1

              Sure, that is fair, but from what I have seen is that rust ownership semantics make implementing mutability + sweeping part very hard. I have seen a rust GC ref type before, just never seen it used in a rust interpreter.

              edit: jake hughes comment above is an excellent summary.

          2. 0

            It sounds like an interesting article but basically none of the code snippets work without JS turned on, something I’m not willing to do.

            1. 3

              The source is all available here: https://gist.github.com/stopachka

              1. 1