In practice you would probably just compile a version that was passed a pointer to the type information, because the type information gives you size, alignment, and pointer information all in one place with only a single argument.
But, just as a curiosity, I think you could do a copy with only a size. The only member besides size that the typedmemmove source accesses is ptrdata, which, though the name sounds super general, only says how far into the object you need to look to be sure you’ve found all the pointers. Using that instead of the object size here seems to be an optimization: if ptrdata covers only the first word, for instance, the runtime can quit worrying about possible pointers in the object after that word, and if it’s zero it needn’t scan at all. You could write memmove code that conservatively acts as if any word of the object might be a pointer; you’d just potentially waste some effort.
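For concreteness, here is a toy sketch of that split. The typeInfo struct, typedCopy function, and field names are invented for illustration; the real runtime._type and typedmemmove look different. It only shows how ptrdata lets the copier treat just a prefix of the object as possibly containing pointers:

```go
package main

import "fmt"

// typeInfo is a hypothetical, simplified type descriptor, invented for this
// sketch. The point is just that size, alignment, and pointer extent travel
// together in one place.
type typeInfo struct {
	size    uintptr // total size of a value, in bytes
	ptrdata uintptr // bytes from offset 0 that can contain pointers; 0 = none
	align   uintptr // required alignment of an allocation
}

// typedCopy sketches the split described above: only the first ptrdata bytes
// need the pointer-aware path (the real runtime uses write barriers there);
// the rest is a plain byte copy. Both halves are modeled with byte slices so
// the sketch runs without unsafe.
func typedCopy(dst, src []byte, t typeInfo) {
	if t.ptrdata > 0 {
		// Pointer-bearing prefix (real code: copy with write barriers).
		copy(dst[:t.ptrdata], src[:t.ptrdata])
	}
	// Pointer-free tail: an ordinary memmove is enough.
	copy(dst[t.ptrdata:t.size], src[t.ptrdata:t.size])
}

func main() {
	// e.g. struct{ p *int; n [3]uint64 }: 32 bytes total, pointers only in
	// the first word, so ptrdata is 8.
	t := typeInfo{size: 32, ptrdata: 8, align: 8}
	src, dst := make([]byte, t.size), make([]byte, t.size)
	for i := range src {
		src[i] = byte(i)
	}
	typedCopy(dst, src, t)
	fmt.Println(dst[0], dst[31]) // both halves arrived
}
```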
The detailed data about which words of the allocation have pointers/need scanning comes from a GC bitmap that’s set up at allocation time. (You can just use an address to look a word up in this bitmap.) But that means that to allocate you need pointer/(no)scan information to set the bits. If allocating just to copy data you could in theory copy the GC bitmap from source to dest before you copy the data, but you’d still need the type’s alignment to get a properly aligned slot in memory and…yeah, maybe at that point we just pass a type pointer around instead.
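As a toy model of that “copy the bits, then the data” idea (the names and the per-allocation []bool layout here are made up; the real heap bitmap is packed and looked up by address):

```go
package main

import "fmt"

// alloc models an allocation plus its pointer bits: one entry per word,
// true if that word may hold a pointer. This is only a stand-in for the
// real shared heap bitmap.
type alloc struct {
	data    []byte
	ptrBits []bool
}

// newAlloc models "allocation needs pointer info up front": the bits are
// filled in from the type's pointer mask at allocation time.
func newAlloc(size uintptr, ptrMask []bool) *alloc {
	return &alloc{
		data:    make([]byte, size),
		ptrBits: append([]bool(nil), ptrMask...),
	}
}

// copyAlloc copies the pointer bits before the bytes, mirroring the
// "copy the GC bitmap from source to dest before you copy the data" idea.
func copyAlloc(dst, src *alloc) {
	copy(dst.ptrBits, src.ptrBits)
	copy(dst.data, src.data)
}

func main() {
	// A 3-word object whose first word is a pointer.
	mask := []bool{true, false, false}
	src := newAlloc(24, mask)
	dst := newAlloc(24, mask)
	copyAlloc(dst, src)
	fmt.Println(dst.ptrBits) // [true false false]
}
```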
This all makes me wonder what choices the team will make about compilation of generics: max speed of compiled code (by compiling as many optimized versions of the code as needed) vs. a dynamic implementation to avoid hurting compile time or binary size (so the resulting machine code looks like what you’d get if you’d used interfaces). I can see the case for either: maybe these are a specialized tool for max performance for sorts, collections, etc., or maybe they’re mostly to make source better-checked and clearer. Or maybe we start with the dynamic approach (possibly quicker to implement?) and then tune the generated output over future releases. Haven’t followed discussions super closely; if someone knows what has been said about this, I’m interested.
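Roughly the two shapes being contrasted, as a sketch only (MaxGeneric, MaxDynamic, and the OrderedValue interface are invented here, and this isn’t how the compiler actually lowers anything): per-type specialization on one side, a single body driven by interface dispatch on the other.

```go
package main

import "fmt"

// Monomorphized flavor: with per-type compilation, a generic function like
// this could be specialized into a separate machine-code copy for each
// element type it is used with.
func MaxGeneric[T int | float64](a, b T) T {
	if a > b {
		return a
	}
	return b
}

// Dynamic flavor: roughly what "looks like what you'd get if you'd used
// interfaces" means; one compiled body, values carried as interface values,
// comparisons going through dynamic dispatch.
type OrderedValue interface {
	LessThan(OrderedValue) bool
}

func MaxDynamic(a, b OrderedValue) OrderedValue {
	if a.LessThan(b) {
		return b
	}
	return a
}

type MyInt int

func (m MyInt) LessThan(o OrderedValue) bool { return m < o.(MyInt) }

func main() {
	fmt.Println(MaxGeneric(3, 5))               // one specialized copy per type
	fmt.Println(MaxDynamic(MyInt(3), MyInt(5))) // single body, interface dispatch
}
```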
Yeah I wonder if there will be any implementation problems due to the combination of monomorphized generics and a potential explosion of GC bitmaps per type.
I think most of the languages with monomorphized generics, like C++ and Rust, don’t have GC. Although I guess D is an exception. Not sure what they do exactly, but it probably helps that they have their own back end and not LLVM.
Not sure what C# does either. I think it has more VM support.
Besides reducing the code bloat and avoiding the need for a special intermediate representation of compiled but unspecialized generics, the dynamic approach has the added benefit (at least from the POV of Go’s goals) that it discourages excessively fine-grained abstractions (e.g., how Arc and Mutex have to be separately applied to get Arc<Mutex<T>> in Rust), because that kind of layering would have too much runtime overhead.
I think where it ends up is right:
.NET generics use monomorphization for value types and a shared instantiation for reference types.
The new() constraint is handled with reflection.
This might provide useful background on the topic.
I believe they were intentionally careful not to specify so that they could experiment & potentially offer multiple compile-time options.
Yes, the design document’s implementation section explicitly leaves the question open (and is worth reading).
Curious what they do!