I generally like the runtime-level decisions Go makes: instead of object headers there are “fat pointers” (interface values) created only when you want to do something dynamic, and it uses interfaces rather than class hierarchies. Despite the throughput cost relative to other GCs, I like that Go’s collector allows interior pointers, avoids stopping the world for long, and doesn’t require a read barrier.
One cool thing Erlang/BEAM did that Go didn’t is separate shared and per-process local/private heaps at the runtime and language level.
In a language like Go that would probably involve a type qualifier like shared, and when something needed to be shared that wasn’t yet (or something shared needed to become private to a thread) you could copy it, with the compiler sometimes able to move where an allocation is done as an optimization. (There’s a loose analogy to creating pointers vs. values and stack/heap.)
It would help with a couple of things. Most important, it’s a path towards safer concurrency. There are various approaches: Rust-like rules that accesses to the shared heap must be explicitly guarded; something more implicit (with some risk of locking not working how you meant); or at least patching up the ways race conditions cause type and memory unsafety today, so that racy code stays memory-safe as it does on the JVM (e.g. by writing type/pointer and length/pointer pairs to the shared heap with double-width atomics like x86-64’s cmpxchg16b). If you don’t build static concurrency safety into the language, a shared qualifier could at least help static analysis and let dynamic checkers slow down fewer accesses. (If this sort of stuff sounds interesting you might like the “smaller Rust” blog posts (1, 2), though that’s definitely its own idea, only loosely related.)
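To make the "explicitly guarded" option a bit more concrete, here is a minimal library-level sketch in today's Go. Everything in it is hypothetical: Go has no `shared` qualifier, and `Shared`, `Share`, `With`, `Privatize`, and `cloneInts` are names invented for illustration. It copies a private value in when sharing, only exposes the shared value under a lock, and copies it back out to make it private again.

```go
package main

import (
	"fmt"
	"sync"
)

// Shared is a hypothetical stand-in for a `shared` type qualifier.
// It only emulates the idea as a library; the compiler enforces nothing.
type Shared[T any] struct {
	mu  sync.Mutex
	val T
}

// Share copies a private value into the "shared heap".
// The caller supplies clone because Go has no generic deep copy.
func Share[T any](private T, clone func(T) T) *Shared[T] {
	return &Shared[T]{val: clone(private)}
}

// With runs f while holding the lock, so every access is guarded.
func (s *Shared[T]) With(f func(*T)) {
	s.mu.Lock()
	defer s.mu.Unlock()
	f(&s.val)
}

// Privatize copies the shared value back out for one goroutine's private use.
func (s *Shared[T]) Privatize(clone func(T) T) T {
	s.mu.Lock()
	defer s.mu.Unlock()
	return clone(s.val)
}

// cloneInts copies a slice so the result shares no backing array with xs.
func cloneInts(xs []int) []int { return append([]int(nil), xs...) }

func main() {
	local := []int{1, 2, 3}
	sh := Share(local, cloneInts)

	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			sh.With(func(xs *[]int) { *xs = append(*xs, n) })
		}(i)
	}
	wg.Wait()

	local[0] = 99 // the private original is independent of the shared copy
	out := sh.Privatize(cloneInts)
	fmt.Println(len(out), out[0]) // prints "7 1"
}
```

The gap between this and a real language feature is the interesting part: nothing here stops `f` from leaking the `*T` it is handed, or stops a shared value from holding a pointer into some goroutine's private data. Those are the rules a compiler-checked qualifier would have to enforce.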
It would also open up some options for GC. With a rule that shared data can’t point to any thread’s private data, you could revive Go’s ‘request-oriented collector’ idea (quick collection of data private to one thread when it quits), do per-thread collections that don’t have to worry about concurrent accesses, or even do moving or generational collection for local data. All that works because stopping one thread isn’t stopping the world; threads pause all the time. You could keep Go’s existing non-generational approach on the shared heap, but its performance could benefit when local allocations no longer factor into global GC rate. (Or you could go for a design like Java’s ZGC with a read barrier, but man, seems even harder than what Go does!)
I realize these things get vastly more complicated once you get into details. (How do you not get your lunch eaten by private<->shared copies and synchronization accessing the shared heap? What on earth is that “something more implicit” to semi-safely access shared stuff? etc.) And Go is Go and BEAM is BEAM and you can’t just order up a mix of the two. But the bulk of the Go runtime model with an explicit shared/private distinction tacked on seems like a neat spot in the design space that I haven’t seen explored. If there are existing examples I don’t know of it would be neat to hear about ’em!
I haven’t read about it in a while, but what you’re describing sounds similar to the way references work in Pony. They have lots of great papers about how their GC works, but here’s a start to the different kinds of references: http://jtfmumm.com/blog/2016/03/06/safely-sharing-data-pony-reference-capabilities/
(Pony contributor here) Like @Pentlander says, Pony checks several of these points:
It enforces write-uniqueness (among other things) across actors using reference capabilities, which allows you to “move” data across actors without any copying; most things are pass-by-reference.
Each actor has its own heap, so GC can happen independently.
There’s a talk comparing Pony to Erlang here: https://www.youtube.com/watch?v=_0m0_qtfzLs if you want to take a look. If you have any questions, you can post them here, or take a look at the community page, join, and ask questions there!