Introduction
The very first software algorithm to generate random numbers, was supposedly written in 1946 by John von Neumann, and is called the Middle Square Method, and it's crazy simple. Enough so, you could execute it with a pencil, paper, and basic calculator. In this post, I'm going to cover the method, it's drawbacks, and an approach called the Weyl Sequence
Middle Square Method
The algorithm is to start with an n-digit seed. The seed is squared, producing a 2n-digit result, zero-padded as necessary. The middle n-digits are then extracted from the result for the next seed. See? Simple. Let's look at an example.
Suppose my seed is 81 (2 digits). 81-squared is 6561 (4 digits). We then take the middle 2 digits out of the result, which is 56. We continue the process:
812 = 6561 562 = 3136 132 = 0169 162 = 0256 252 = 0625 622 = 3844 842 = 7056 52 = 0025 22 = 0004 02 = 0000
And we've reached the core problem with the middle square method- it has a tendency to converge, most likely to zero, but other numbers are possible, and in some cases a short loop. Of course, John von Neumann was aware of this problem, but he also preferred it that way. When the middle square method fails, it's immediately noticable. But, it's also horribly biased and fails most statistical tests for randomness.
Middle Square Weyl Sequence
A modern approach to an old problem is known as the Middle Square Weyl Sequence, from Hermann Weyl. Basically, a number is added to the square, then the middle bits are extracted from the result for the next seed. Let's first look at the C code, then I'll explain it in detail.
#includeuint64_t x = 0, w = 0 // Must be odd (least significant bit is "1"), and upper 64-bits non-zero uint64_t s = 0xb5ad4eceda1ce2a9; // qualifying seed // return 32-bit number inline static uint32_t msws() { x *= x; // square the number w += s; // the weyl sequence x += w; // apply to x return x = (x>>32) | (x<<32); // return the middle 32-bits }
Explanation
Okay. Let's dive into the code. This is a 32-bit PRNG using John von Neumann's Middle Square Method, starting with a 64-bit seed "s". As the notes say, it must be an odd number, and the upper 64-bits must be non-zero. It must be odd, to ensure that "x" can be both odd and even. Recall- an odd plus an odd equals an even, and an odd plus an even equals an odd.
Note that at the start, "x" is zero, so squaring it is also zero. But that's not a problem, because we are adding a non-zero number. During that time, the "w" variable is assigned. It's dynamically changed on every iteration, although "s" remains static.
Finally, our return is a 32-bit number (because of the "inline static uint32_t" function width), but we're doing some bit-shifting. Supposedly, this is returning the middle 32-bits of our 64-bit "x", but that's not immediatly clear. Let's look at it more closely.
Example
Suppose "x = 0xace983fe671dbd09". Then "x" is a 64-bit number with the following bits:
1010110011101001100000111111111001100111000111011011110100001001
When that number is squared, it becomes the 128-bit number 0x74ca9e5f63b6047f6a65456d9da04a51, or in binary:
01110100110010101001111001011111011000111011011000000100011111110110101001100101010001010110110110011101101000000100101001010001
But remember, "x" is a 64-bit number, so in our C code, only the bottom 64-bits are returned from that 128-bit number. So "x" is really 0x6a65456d9da04a51, or in binary:
0110101001100101010001010110110110011101101000000100101001010001
But the bits "01101010011001010100010101101101" are the 3rd 32-bits of the 128-bit number that was the result of squaring "x" (see above). They are the "middle" 32-bits that we're after. So, we're going to do something rather clever. We're going to swap the upper 32-bits with the lower, then return the lower 32-bits.
Effectively, what we're doing is "ABCD" -> "CDAB", then returning "AB". We do this via bit-shifting. So, starting with:
0110101001100101010001010110110110011101101000000100101001010001
First, we bitshift the 64-bit number right 32-bits:
0110101001100101010001010110110110011101101000000100101001010001 >> 32 = 0000000000000000000000000000000001101010011001010100010101101101
Then we bitshift "x" left 32-bits:
0110101001100101010001010110110110011101101000000100101001010001 << 32 = 1001110110100000010010100101000100000000000000000000000000000000
Now we logically "or" them together:
0000000000000000000000000000000001101010011001010100010101101101 | 1001110110100000010010100101000100000000000000000000000000000000 |================================================================ 1001110110100000010010100101000101101010011001010100010101101101
See the swap? Now, due to the function return width, we return the lower 32-bits as our random number, which is 01101010011001010100010101101101, or 1785021805 in decimal. We've arrived at our goal.
Conclusion
At the main website, the C source code is provided, along with 25,000 seeds, as well as C source code for the Big Crush randomness tests from TestU01. This approach passes Big Crush with flying colors on all 25,000 seeds. Something a simple as adding an odd 64-bit number to the square changes John von Neumann's approach so much, it becomes a notable PRNG.
Who said you can't teach an old dog new tricks?
Post a Comment