This is neat. Thank you for the post. After reading, a few questions came to mind:
Is it possible to set a floating point number’s bits positionally? e.g. set bits 11-63 to random values?
Given the constraint 0.0 <= n < 1.0, what is the utility of a 53-bit random floating point number?
Why not use a random 64-bit string?
How is a random 0.0 <= n < 1.0 float useful?
Given the constraint 0.0 <= n < 1.0, how could the statement from the article be true? “Note that this solution returns 53 bits of randomness”. Isn’t the max float value for the 0.0 <= n < 1.0 constraint 52 bits?
And, following on from that (which I assume would be answered by the previous two questions): why waste 11 bits?
One reason is that many distributions can be sampled based on their quantile function. If you can generate a uniformly random number between 0 and 1, you can put that into another distribution’s quantile function to get a sample from the other distribution.
Yep, here are some example algorithms for generating numbers according to various distributions which I should have given some more comments and citations, oh well.
For the curious, this is called the “inverse transform sampling” method: https://en.wikipedia.org/wiki/Inverse_transform_sampling
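As a concrete illustration of the quantile idea (my own sketch, not code from the post; the exponential distribution is just a convenient example because its quantile function has a closed form):

```c
#include <math.h>

/* Inverse transform sampling: for the exponential distribution the quantile
 * function is F^-1(u) = -ln(1 - u) / lambda, so feeding it a uniform
 * u in [0.0, 1.0) yields an Exp(lambda)-distributed sample. */
double exponential_from_uniform(double u, double lambda) {
    return -log1p(-u) / lambda;   /* log1p(-u) == ln(1 - u), accurate for small u */
}
```

The same pattern works for any distribution whose quantile function you can compute or approximate; the uniform [0.0, 1.0) value is the only randomness you need.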
The utility of 0.0 <= n < 1.0 is that it’s a building block for creating other floats. If you want to pick a random element from a 26 element array, floor(rnd() * 26) gives you an index into that array, in a way that is pretty much “fair” (uniform).
Contrast with all 64 bits being random. That produces a float that goes from -Infinity to +Infinity: this is non-uniform (numbers close to zero are more common) and harder to work with (you can’t reduce the range by dividing by Infinity). And some of your floats will be NaNs, which are kind of useless for picking a random index.
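A quick way to see this (my own sketch, not from the post): reinterpret 64 uniformly random bits as a double and look at what comes out.

```c
#include <stdint.h>
#include <string.h>

/* 64 random bits viewed as a double: the magnitude is spread logarithmically
 * across the whole exponent range, and any draw whose 11 exponent bits are
 * all ones (about 1 in 2048) is an infinity or a NaN. */
double bits_as_double(uint64_t random_bits) {
    double d;
    memcpy(&d, &random_bits, sizeof d);
    return d;   /* callers would need isnan()/isinf() checks before using this */
}
```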
Isn’t floor(rnd() * 26) just taking the random 64-bit integer % 26, with more steps? The primitive we use here is “get random 64 bits”; round-tripping that through a float seems unnecessary (and I would maybe expect the % to be less biased, because with the float we throw away 53 bits of entropy).
I think you’re right, and there are better examples for the utility of random floats. I also think I misunderstood “64-bit string” as “f64 whose bits are uniformly random”.
No, the point of this is to have an unbiased random value, and using modulus is biased when there are an unequal number of inputs mapping to each remainder, i.e. when the range of inputs isn’t divisible by the modulus: 2^64 mod 26 = 16, so 16 of the 26 possible remainders occur slightly more often than the other 10. https://stackoverflow.com/questions/11758809/what-is-the-optimal-algorithm-for-generating-an-unbiased-random-integer-within-a has a worked example.
floor(rnd() * 26) is also biased: it’s impossible to go from uniformly random bits to a uniformly random number less than 26 in a bounded number of steps.
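Both points show up in the usual fix, rejection sampling. Here is a sketch with my own names (next_u64 is a placeholder splitmix64 generator, not anything from the post): the loop has no fixed bound, but the expected number of retries is below one and the result is exactly uniform.

```c
#include <stdint.h>

/* Placeholder "get random 64 bits" primitive (splitmix64); substitute your
 * preferred generator. */
static uint64_t rng_state = 0x9E3779B97F4A7C15ULL;
static uint64_t next_u64(void) {
    uint64_t z = (rng_state += 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

/* Uniform integer in [0, n) for n >= 1, with no modulo bias: reject the
 * 2^64 mod n lowest raw values so every remainder is equally likely. */
uint64_t random_below(uint64_t n) {
    uint64_t limit = -n % n;   /* (2^64 - n) mod n == 2^64 mod n */
    uint64_t x;
    do {
        x = next_u64();        /* rejected with probability (2^64 mod n) / 2^64 */
    } while (x < limit);
    return x % n;
}
```

For n = 26 the rejection probability is about 9e-19, so in practice the loop body runs exactly once.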
Yes, if you have a discrete function, it is probably simpler just to sample the index as an integer.
But if you have a real-valued random variable X and can compute or estimate its CDF F, then feeding a uniform sample from [0.0, 1.0) through the inverse CDF F^-1 (which is defined between 0.0 and 1.0) is equivalent to drawing a sample from the distribution of X itself.
The answers to 2 and 3 are a bit (heh) subtle, and I should add some notes about them to my blog post.
The 52 vs 53 thing comes from how integers are converted to fp. The algorithm implemented by the FPU (for nonzero unsigned numbers) is:
count leading zeroes, to get z
the exponent is 63 - z and the biased exponent is 1086 - z
shift the number left by z - 11 so the top bit is in position 52 (very large numbers get shifted right)
clear bit 52 (which at this point is always 1) and insert the biased exponent into bits 52…62 (inclusive)
So the 53rd bit of randomness ends up encoded in the exponent of the result.
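Here is a C sketch of those steps (my own rendering, not code from the post), using the GCC/Clang builtin __builtin_clzll and skipping the round-to-nearest handling a real FPU applies to inputs of 2^53 and above:

```c
#include <stdint.h>
#include <string.h>

/* Convert a nonzero uint64_t to a double by building the bit pattern
 * directly, following the steps described above (truncating, not rounding). */
double u64_to_double(uint64_t x) {
    int z = __builtin_clzll(x);               /* count leading zeroes        */
    uint64_t biased_exp = 1086 - (uint64_t)z; /* 1023 + (63 - z)             */

    uint64_t mantissa;
    if (z >= 11)
        mantissa = x << (z - 11);             /* top set bit lands in bit 52 */
    else
        mantissa = x >> (11 - z);             /* very large: drop low bits   */

    uint64_t bits = (mantissa & ~(1ULL << 52))   /* clear the implicit leading 1 */
                  | (biased_exp << 52);          /* exponent into bits 52..62    */

    double d;
    memcpy(&d, &bits, sizeof d);              /* reinterpret the bit pattern */
    return d;
}
```

For inputs below 2^53 this reproduces the FPU’s result exactly; above that, the truncation here can differ from the FPU’s rounding by one unit in the last place.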
Now, there are in fact 2^62 double precision floats between 0.0 and 1.0, but they are not distributed uniformly: the smaller ones are much denser. Because of this, there are two ways to model a uniform distribution. My post uses a discrete model, where the functions return one of 2^52 or 2^53 evenly spaced numbers.
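In code, the 53-bit flavour of the discrete model is usually written like this (a sketch of the standard construction; the post’s own functions may differ in detail):

```c
#include <stdint.h>

/* Discrete model: keep the top 53 of 64 random bits and scale by 2^-53,
 * giving one of 2^53 evenly spaced doubles in [0.0, 1.0). The product is
 * exact, because a 53-bit integer times a power of two needs no rounding. */
double uniform53(uint64_t random_bits) {
    return (random_bits >> 11) * 0x1.0p-53;
}
```

The 2^52 variant instead fixes the exponent and randomizes only the 52 mantissa bits; a sketch of that appears below, next to the reply about bit hacking.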
You can also use a continuous model, where you imagine a uniformly random real number and return the closest floating point result. This can be done with a special arbitrary precision version of the integer conversion algorithm I wrote above.
if your random integer is 0, add 64 to z and get another random number. If z reaches 1024, return 0. (Or do something more clever to handle denormal numbers)
if z mod 64 is greater than 11, get another random integer and use its bits to fill the low order bits of the number after it is shifted left
There is an implementation of this algorithm at https://mumble.net/~campbell/2014/04/28/random_real.c
In reality, the loop is vanishingly unlikely to ever be entered, so you can leave it out, resulting in code like https://gist.github.com/corsix/c21d715238e32ce6d44a7fc1b44f427d
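Here is a self-contained C sketch of the continuous model as described above (my own simplification, not the linked random_real.c: it truncates instead of rounding to nearest and punts on denormals by returning 0.0; next_u64 stands for any “get random 64 bits” primitive, e.g. the splitmix64 placeholder in the earlier sketch):

```c
#include <stdint.h>
#include <string.h>

extern uint64_t next_u64(void);      /* any "get random 64 bits" primitive */

/* Continuous model: imagine an infinite random bit string after the binary
 * point; the position of the first 1 bit fixes the exponent, and the next
 * 52 bits fill the mantissa. */
double uniform01_continuous(void) {
    int z = 0;                           /* leading zero bits seen so far */
    uint64_t x = next_u64();
    while (x == 0) {                     /* "the loop": almost never taken */
        z += 64;
        if (z >= 1024)
            return 0.0;
        x = next_u64();
    }
    int lz = __builtin_clzll(x);         /* 64 coin flips per iteration */
    z += lz;
    if (z >= 1022)
        return 0.0;                      /* crude: punt on denormal results */

    /* The first 1 bit sits z places past the binary point, so the value lies
     * in [2^-(z+1), 2^-z) and the biased exponent is 1022 - z. */
    uint64_t biased_exp = (uint64_t)(1022 - z);

    int avail = 63 - lz;                 /* bits left in x after its leading 1 */
    uint64_t mantissa;
    if (avail >= 52) {
        mantissa = (x >> (avail - 52)) & ((1ULL << 52) - 1);
    } else {                             /* z mod 64 > 11: fetch more bits */
        int need = 52 - avail;
        mantissa = ((x & ((1ULL << avail) - 1)) << need)
                 | (next_u64() >> (64 - need));
    }

    uint64_t bits = (biased_exp << 52) | mantissa;
    double d;
    memcpy(&d, &bits, sizeof d);         /* sign 0, exponent, mantissa */
    return d;
}
```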
Yes, but only via the bit-hacking or union stuff that sets the whole mantissa at once, so all you’re doing is making things more convoluted.
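For reference, this is presumably the kind of union trick being alluded to (a sketch, not anyone’s actual code): set the 52 mantissa bits in place under a fixed exponent, then subtract 1.0 to land in [0.0, 1.0).

```c
#include <stdint.h>

/* Overlay an integer on a double, drop 52 random bits into the mantissa
 * under a fixed biased exponent of 1023 (so the value lies in [1.0, 2.0)),
 * then subtract 1.0 to get one of 2^52 evenly spaced values in [0.0, 1.0). */
double uniform52_union(uint64_t random_bits) {
    union { uint64_t u; double d; } pun;
    pun.u = 0x3FF0000000000000ULL     /* sign = 0, biased exponent = 1023 */
          | (random_bits >> 12);      /* top 52 random bits -> mantissa   */
    return pun.d - 1.0;
}
```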
Large numbers of problems in graphics and modeling make use of random numbers in the [0,1) range because they’re modeling probabilities. You could work over a larger range, but that wouldn’t give you any more precision in your randomness, and you’d introduce the need for division later on to compensate. Given the number of operations being performed in these problems, adding a bunch of divisions for no actual gain seems unhelpful at best.
no? You have 53 bits of precision regardless of exponent
What are the bits you think are being wasted? You can’t randomize the sign bit for obvious reasons, and you can’t randomize the exponent either: obviously you can’t make it too high, and if you randomize how negative it is you are creating a logarithmic rather than uniform distribution.
Thank you for the response! There’s a link in the Twitter thread that goes into more detail and points out some other pitfalls when using floats for random numbers, specifically around 0 and 1: https://mumble.net/~campbell/2014/04/28/uniform-random-float
To fit 64 bits of entropy in a 64-bit float you would necessarily have to cover every single float value (from negative infinity to infinity plus all of the trillions of different NaNs). That’s distinctly un-useful when compared to a finite uniform distribution. 53 bits is exactly how many you can fit while uniformly covering a finite range using this technique (and [0,1) is as good a finite range as any other, it lends itself easily to rescaling and inverse-transform sampling).
It really is 53 because [0,1) includes the denormals; they do add value because if you’re scaling up to some other range, some or all of them will be converted to “ordinary” floats.
For higher precision on the smaller numbers (below [0.5, 1.0)): you can use __builtin_clzll() to make that loop run 64 coin flips per iteration :-)
Yep. [Handwaving:] Instead of dividing by two each iteration, just decrement the biased exponent by that amount. Then pair that with the mantissa.
See also:
Allen Downey, Generating Pseudo-random Floating-Point Values
Marc Reynolds, Higher density uniform floats
Marc Reynolds, Basic uniform random floating-point values