For those who, like me, didn’t get it by just reading the abstract: basically, the assumption is that the last three bits of the mantissa are almost always 000. This covers a lot of values, including all integer-valued floats up to 2^50. If we assign the tag 000 to unboxed floats, then all these numbers already carry the correct tag within their own bit representation, without any additional bits or allocations. What do we do for all the other floats? Well, those have to be boxed, i.e. allocated on the heap and accessed via pointer.
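A quick way to check the property this explanation relies on, peeking at the raw IEEE-754 bits with Python’s `struct` module (the helper name is mine):

```python
import struct

def mantissa_low3(x: float) -> int:
    """Return the low 3 bits of the IEEE-754 binary64 mantissa of x."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    return bits & 0b111

# Integer-valued doubles up to 2^50 have 000 in the low mantissa bits,
# so under a 000 float tag they are already correctly tagged as-is.
assert mantissa_low3(float(2**50)) == 0
assert mantissa_low3(1.5) == 0       # short fractions also end in 000
assert mantissa_low3(0.1) != 0       # 0.1 is not exactly representable
```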
From the paper:
This section describes self-tagging, a new tagging technique that exploits the fact that some values naturally contain the appropriate tag corresponding to their type, at the correct location, in their bit arrangement.
Self-tagging exploits such occurrences where the tag bits of a pointer appear in its value. Such objects can be unboxed, making them tagged values instead of heap allocated value. However, since only 1/8 of all floats can be unboxed in such a way, a second tag must be reserved for the remaining floats, which still need to be represented as heap allocated values with either tagged or generic pointers.
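A hedged sketch of the fast path this describes, in Python for illustration (the 3-bit tag width and the 000 assignment are assumptions from the summary above; a real runtime does this on raw machine words):

```python
import struct

TAG_MASK = 0b111
TAG_FLOAT = 0b000    # assumed tag for unboxed floats

def try_self_tag(x: float):
    """Return x's bits as a tagged word, or None if x must be boxed."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    if bits & TAG_MASK == TAG_FLOAT:
        return bits   # the double already carries its own tag
    return None       # the other ~7/8 of floats: allocate on the heap

def untag_float(word: int) -> float:
    """Decoding is a no-op reinterpretation: the tag bits ARE mantissa bits."""
    (x,) = struct.unpack("<d", struct.pack("<Q", word))
    return x
```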
I’m wondering if this could be made dynamic. I’m dealing with integers representing time in nanoseconds from the epoch, and I want those ints to be fast. I don’t care about floats at all. Could the language adapt to the domain quickly?
I think this is maybe feasible with a JIT, but even then likely not worth it.
This would be very costly for the interpreter: instead of an instruction or two, you need to read config out of memory and make a decision based on both the “pointer” and the config.
The config itself (which tags are in use) needs to be computed.
Changing the config would likely be necessary (it takes some time to figure out what the best tag patterns are) and would be difficult: basically equivalent to a precise moving GC, I think, since you need to rewrite all existing pointers.
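A rough sketch of why the configurable scheme hurts the interpreter’s fast path (tag values and names are hypothetical): a fixed scheme compiles a type test down to one mask-and-compare, while a dynamic scheme pays an extra load from mutable config on every check, and changing that config invalidates every tagged word already in the heap.

```python
# Fixed scheme: the tag is baked into the code, so a type test is a
# single mask-and-compare the compiler can fold into the dispatch.
FIXED_INT_TAG = 0b001                 # hypothetical fixed assignment

def is_int_fixed(word: int) -> bool:
    return word & 0b111 == FIXED_INT_TAG

# Dynamic scheme: the assignment lives in mutable config, so every test
# pays an extra memory load, and reassigning a tag would require
# rewriting all existing tagged words (akin to a precise moving GC).
tag_config = {"int": 0b001, "float": 0b000}   # hypothetical, mutable

def is_int_dynamic(word: int) -> bool:
    return word & 0b111 == tag_config["int"]
```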
This is a really cool approach. Seems to almost always outperform NaN tagging. I wonder if you could modify this tagging scheme to use a special value for ±0.0 so you didn’t have to use an entire tag just for zero floats.
funny. but manual pointer representation research (one representation for all!) is a dead end. (not a diss, just saying.) the future is smarter unboxing and (manually or automatically) application-tuned representations. it seems extremely implausible that any point in the representation space that’s ‘optimal on average across a wide variety of js benchmarks’ would be ideal for any particular application
also deemphasis of floats. no one should be allowed to use floating point unless they can prove they could pass a first-year numerical analysis course. (but again, even if you are using floats, it’s actually very uncommon to have a scenario where you have a single value that you’re statically unsure is a float or not but you care about the difference)
also not a particularly novel representation—i discussed it a bit with gilbert baumann a few years ago and mentioned it in this footnote here. (again not a diss—trying it out, running the benchmarks, and publishing the paper is worthwhile.) we didn’t go anywhere with it in part because, per above, floats meh
This comment isn’t helpful, is weirdly negative, and has weird claims on top.
Last time I pointed this out, you said you “hadn’t eaten” and rewrote your comment. You might want to do the same here, and deal with personal problems before writing on lobste.rs …
I flagged the comment as “Unkind”, so I’ll explain my reasoning for that:
You start by dismissing the research based on “it’s a dead end”, yet don’t provide any solid explanation as to why that is. Instead, you provide some really vague opinions.
When people write “I don’t mean to be X, just saying” or something to that effect, it’s commonly a thinly veiled excuse to get away with being an ass (“I’m not mean, I’m just saying!” for example). Even if it’s not meant that way, it adds no value.
You make a remark about how one shouldn’t be allowed to use floats until meeting some criteria. This is completely irrelevant to the topic at hand, and just unnecessary gatekeeping
You then dismiss the work based on “it’s not novel, I discussed it in the past”, overlooking the fact that research is more than just coming up with a clever idea. It’s like saying the iPhone isn’t novel because you thought about making a similar phone several years before the first iPhone came out
To summarize it in a somewhat blunt manner: your comment is dismissive and adds zero value to the discussion.
obvious hyperbole, just gesturing in a particular direction. that said, it is worth noting on the second point that (a) grade-school math as it is currently taught often seems to leave people with strange misconceptions, and (b) the problems people are assigned are generally designed to be numerically well behaved when worked on a calculator (and there is no reason to assume a priori this holds in the general case). and on the first point: strange memes and superstitions (e.g. 0.1 + 0.2, various tolerant comparisons) proliferate without being understood, and this seems very likely to cause problems
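The 0.1 + 0.2 example mentioned above, made concrete: the result is correctly rounded binary arithmetic, not a bug, and `math.isclose` is one of the “tolerant comparisons” in question.

```python
import math

# 0.1, 0.2 and 0.3 are all binary approximations; the sum of the first
# two rounds to a slightly different double than the literal 0.3.
assert 0.1 + 0.2 != 0.3
assert abs((0.1 + 0.2) - 0.3) < 1e-15   # but the error is tiny
assert math.isclose(0.1 + 0.2, 0.3)     # a "tolerant comparison"
```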
And the big one, no one should be allowed to use a dynamically typed language (like Bigloo) unless it is optimized using a JIT compiler fine tuned using thousands of engineer-years worth of work, funded by a huge tech giant like Google.
Boxed value representations for dynamically typed languages are easy to implement and widely used. The alternatives you suggest sound very complex and difficult to implement. The performance of boxed values is just fine for a wide range of use cases. I use boxed values in my Curv language, and their performance is fine; the actual problems with Curv that need fixing are elsewhere.

I use NaN boxing because Curv is used for 3D modelling and 3D printing, and all numbers are 64-bit floats, as is appropriate for this domain. Yes, 0.1 + 0.2 != 0.3, but it doesn’t matter; trigonometry and linear algebra have to be done with floating point for performance reasons, and the approximate results are good enough for this domain. Nobody needs to be a numerical analyst to use Curv for 3D modelling.

The issues you raise seem more relevant to academic research, and to the kind of intellectually difficult and challenging work you need to do in order to get published, than to the actual pragmatic issues I faced in making a usable tool for doing practical work. Simplicity of implementation is an important goal for this kind of project.
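For reference, a minimal sketch of the NaN boxing mentioned here, in Python for illustration (the tag layout is an assumption for the sketch, not Curv’s actual encoding): doubles are stored verbatim as 64-bit words, and non-float values ride in the payload bits of a quiet NaN.

```python
import struct

QNAN    = 0x7FF8_0000_0000_0000   # canonical quiet-NaN exponent/bit pattern
TAG_PTR = 0x0000_8000_0000_0000   # hypothetical tag bit inside the payload

def box_double(x: float) -> int:
    """Ordinary doubles are stored verbatim."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    return bits

def box_ptr(addr: int) -> int:
    """Pointers go in the low 47 bits of a tagged quiet-NaN payload."""
    assert addr < (1 << 47)
    return QNAN | TAG_PTR | addr

def is_double(word: int) -> bool:
    # Anything outside our tagged quiet-NaN space decodes as a double;
    # note that arithmetic NaNs (TAG_PTR clear) still count as doubles.
    return (word & (QNAN | TAG_PTR)) != (QNAN | TAG_PTR)

def unbox_double(word: int) -> float:
    (x,) = struct.unpack("<d", struct.pack("<Q", word))
    return x
```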
Hm this is cool, I have been wondering how the space of doubles is used in practice
i.e. it’s obvious that most integers are small, but it’s not obvious with doubles
@tekknolagi pointed me to this OpenSmalltalk post from 2018 a few months ago, with a 100 tag for immediate floats and then 61 bits of data: https://clementbera.wordpress.com/2018/11/09/64-bits-immediate-floats/
Not sure exactly how it relates, but it may have some similar properties
Some other related posts:
Clasp - https://drmeister.wordpress.com/2015/05/16/tagged-pointers-and-immediate-fixnums-characters-and-single-floats-in-clasp/ - I think he didn’t come to a good conclusion about floats, at least in 2015
https://abchatra.github.io/TaggedFloat/
how is it negative? what claims do you find weird (and what is wrong with making ‘weird’ claims)?
Damn, time to go delete all of my perfectly functioning float code, I guess.
[Comment removed by author]
what do you mean?