I love this technique and have used it a few times in some smallish interpreter-like code I’ve written in the past few years.
However, it seems that OSs and CPUs are getting more picky about those unused bits in addresses (e.g. CHERI), so I’m assuming that NaN-boxing code is semi-fragile and may break in the future on some CPU architectures. I didn’t stop using it, but I made sure the few lines that do the pointer/NaN conversion are isolated in one place and carefully marked with comments so they can be updated when necessary.
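For anyone wondering what "isolated in one place" might look like, here is a minimal sketch in C. The tag layout, the 48-bit payload assumption, and every name in it (`value_t`, `PTR_TAG`, `box_pointer`, and so on) are my own illustration, not taken from the article or from any particular interpreter.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: sign bit + quiet-NaN bits mark a boxed pointer.
 * ASSUMPTION: user-space addresses fit in the low 48 bits. This is the
 * fragile part, so it lives in these few definitions only. */
#define QNAN_MASK    UINT64_C(0x7ffc000000000000)
#define PTR_TAG      (UINT64_C(0x8000000000000000) | QNAN_MASK)
#define PAYLOAD_MASK UINT64_C(0x0000ffffffffffff)

typedef uint64_t value_t;

/* Doubles are stored as their raw bit pattern (memcpy avoids aliasing UB). */
static value_t box_double(double d) {
    value_t v;
    memcpy(&v, &d, sizeof v);
    return v;
}

static double unbox_double(value_t v) {
    double d;
    memcpy(&d, &v, sizeof d);
    return d;
}

static int is_pointer(value_t v) {
    return (v & PTR_TAG) == PTR_TAG;
}

/* The only two functions that convert between pointers and NaN payloads. */
static value_t box_pointer(void *p) {
    uintptr_t bits = (uintptr_t)p;
    assert((bits & ~PAYLOAD_MASK) == 0); /* fails loudly if the assumption breaks */
    return PTR_TAG | (value_t)bits;
}

static void *unbox_pointer(value_t v) {
    assert(is_pointer(v));
    return (void *)(uintptr_t)(v & PAYLOAD_MASK);
}
```

The assert in `box_pointer` is the useful bit: if a platform ever hands out addresses with bits set above the payload, it trips right there instead of silently corrupting a value.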
Pointer tagging should still work just fine with CHERI IIRC, at least as long as you don’t use more bits for tagging than your struct’s alignment leaves free (otherwise you couldn’t read its properties anyway). But yeah, boxing pointers inside NaNs isn’t going to work with CHERI. One workaround is to box an offset into a contiguous array and use that to reach the object. You would need the array fully allocated from the start, though, which might not work for all use cases.
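A rough sketch of the offset idea, in C. The fixed-size `heap` array, the `object_t` struct, the 32-bit index width, and all the names are assumptions for illustration only. The point is that the NaN payload holds an index rather than an address, so no pointer is ever rebuilt from raw bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tag: sign bit + quiet-NaN bits mark a boxed object index. */
#define OBJ_TAG       UINT64_C(0xfffc000000000000)
#define INDEX_MASK    UINT64_C(0x00000000ffffffff)
#define HEAP_CAPACITY 1024

typedef uint64_t value_t;

/* Hypothetical object type, just enough to show the lookup. */
typedef struct {
    int kind;
    double payload;
} object_t;

/* The whole array is reserved up front, which is the limitation noted above. */
static object_t heap[HEAP_CAPACITY];
static uint32_t heap_used = 0;

static int is_object(value_t v) {
    return (v & OBJ_TAG) == OBJ_TAG;
}

/* Bump-allocate a slot and return its index boxed inside a NaN. */
static value_t alloc_object(int kind, double payload) {
    assert(heap_used < HEAP_CAPACITY);
    uint32_t index = heap_used++;
    heap[index].kind = kind;
    heap[index].payload = payload;
    return OBJ_TAG | (value_t)index;
}

/* Turn a boxed index back into a real pointer derived from the array. */
static object_t *deref_object(value_t v) {
    assert(is_object(v));
    uint32_t index = (uint32_t)(v & INDEX_MASK);
    assert(index < heap_used);
    return &heap[index];
}
```

Because `deref_object` derives its result from `heap` itself, the pointer keeps whatever provenance the platform attaches to the array. The cost is an extra bounds check and add on each access, plus a heap that cannot grow without extra machinery.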
Author here, glad to see the article still getting some attention almost two years later. May I ask where you came across it?
Conversation with some friends the other night, jogged my memory.
This is so great. The technique is wonderfully clever, and the explanation is so deftly done.