I've been auditing a lot of JavaScript source code lately, and a common problem I'm seeing when generating random numbers is using the naive "multiply-and-floor" method. Because the "Math.random()" function call returns a number between 0 and 1, not including 1 itself, then developers think that the "best practice" for generating a random number is as follows:

1 2 3 | function randNumber(range) { return Math.floor(Math.random() * range); // number in the interval [0, range). } |

The problem with this approach is that it's biased. There are numbers returned that are more likely to occur than others. To understand this, you need to understand that Math.random() is a 32-bit RNG in Chrome and Safari, and a 53-bit RNG in Edge and Firefox. First, let's pretend every browser RNG is a 32-bit generator, then we'll extend it.

A 32-bit Math.random() means that there are only 2^{32} = 4,294,967,296 possible decimal values in the range of [0, 1). This means that the interval [0, 1) is divided up every "1/2^{32} = 0.00000000023283064365" decimal values. But that doesn't matter though, because if I wanted a random number between 1 and 100, 100 does not divide 4,294,967,296 evenly. I get 42,949,672 with 96 left over. What does this mean? It means that ...

1 | randNumber(100); |

... will return 0 through 95 more often than 96 through 99. That's our bias.

It doesn't matter if it's a 53-bit RNG either. "2^{53} = 9,007,199,254,740,992" is not a multiple of 100. Instead, dividing by 100, I get 90,071,992,547,409 with 92 left over. So, with a 53-bit RNG, 0 through 91 will be generated more often than 92 through 99 with the same function.

The only time this bias would not exhibit itself in the naive "multiply-and-floor" approach above, is if the random number requested is in the interval [0, 2^{N}), where "N" is any positive integer. 2^{32}, 2^{53}, and 2^{X}, where "X" is a positive integer, is always a multiple of 2^{N} (2^{N} divides 2^{X} evenly, when N ≤ X, N > 0).

So, what do we do? How do we improve the naive multiply-and-floor approach? Thankfully, it's not too difficult. All we need to do is essentially the following:

- Force the RNG into 32-bits (common denominator for all browsers).
- Create a range of values that is a multiple of our desired range (E.G.: 1-100).
- Loop over the range picking values until a value inside the range is generated.
- Output the generated value modulo our desired range.

Let's see this in practice. First the unbiased code, then the explanation:

1 2 3 4 5 6 7 | function uniformRandNumber(range) { var max = Math.floor(2**32/range) * range; // make "max" a multiple of "range" do { var x = Math.floor(Math.random() * 2**32); // pick a number of [0, 2^32). } while(x >= max); // try again if x is too big return(x % range); // uniformly picked in [0, range) } |

I know what you're thinking: **WAIT! YOU JUST DID THE "MULTIPLY AND FLOOR" METHOD!! HYPOCRITE!!!** Hold on though. There are two subtle differences. See what they are?

The "max" variable is a multiple of "range" (step 2 above). So, if our range is [0, 100), then "max = 4294967200", which is a multiple of 100. This means that so long as "0 < = x < 4294967200", we can return "x % 100", and know that our number was uniformly chosen. However, if "x >= 4294967200", then we need to choose a new "x", and check if it falls within our range again (step 3 above). So long as "x" falls in [0, 4294967200), then we're good.

This extends to cryptographically secure random numbers too. In action, it's just:

1 2 3 4 5 6 7 8 | function uniformSecureRandNumber(range) { const crypto = window.crypto || window.msCrypto; // Microsoft vs everyone else var max = Math.floor(2**32/range) * range; // make "max" a multiple of "range" do { var x = crypto.getRandomValues(new Uint32Array(1))[0]; // pick a number of [0, 2^32). } while(x >= max); // try again if x is too big return(x % range); // uniformly picked in [0, range) } |

So it's not that "multiply and floor" is *wrong* so long as you *use it correctly*.

One small caveat- these examples are not checking if "range" is larger than 32-bits. I deliberately ignored this to draw your attention on how to correctly generate uniform random numbers. You may or may not need to do various checks on the "range" argument. Is it an integer type? Is it a positive integer? Is it 32-bits or less? Etc.

As an exercise for the reader, how could you extend this uniform generator to pick a random number in the range of [100, 200)? Going further, how could you pick only a random *even number* in the range of [250, 500)?

## Post a Comment