Lately, I have been studying pseudorandom number generators (PRNGs, also called "deterministic random bit generators", or DRBGs). I've been developing cryptographically secure PRNGs (CSPRNGs), and you can see my progress on Github at https://github.com/atoponce/csprng. This project is for nothing more than for me to somewhat get a feeling for new languages, while also learning a thing or two about applied cryptograhpy. However, for the subject of this post, I want to address one PRNG that is not cryptographically secure- the Lagged Fibonacci Generator.
What drew me to this generator was thinking about a way to have a PRNG to do by hand. I started thinking about different ways to construct a PRNG mathematically. But, before creating an algorithm, I needed to identify all the points that make a good PRNG. A good PRNG should have:
- An easy implementation.
- High efficiency in calculating the pseudorandom values.
- Long (practically un-observable) periods for most, if not all initial seeds.
- A uniform distribution over the finite space.
- No correlation between successive values.
I put a great deal of thought into it, but couldn't come up with anything I was very proud of. I thought of using trigonometric functions, various logarithm functions, geometric and algebraic expressions, and even fancy equations using derivatives. The more I thought about it, the further away I drifted from something simple that could be done by hand with pencil and paper.
The best I came up with, which required using a scientific calculator, was forcing the sequence to grow (a monotonically increasing function), then forcing it into a finite field with a modulus. However, no matter what I threw at it, I always struggled with either dealing with "0" or "1". For example, taking the n-th exponent of either "0" or "1" will always return a "0" or "1". I realized quickly that multiplication might be a problem. For example, one thought I had was the following:
Si = Floor[(Si-1)3/2], mod M
This works out fine, until your output is a "0" or "1", then the generator sits on either of those numbers indefinitely. I realized that my function should probably just stick with addition, or I'm bound to get myself into trouble. I thought, and thought about it, then it hit me. It was my "Beautiful Mind" moment.
I thought of the Fibonacci sequence.
The Fibonacci sequence is monotonically increasing for two seeds S1 and S2, where 0 < S1 < S2. If you put an upper bound on the sequence via a modulus, you can limit it to a finite space, and I can have my PRNG. However, I also know that the distance between any two sequential digits in the Fibonacci sequence approaches the Golden Ratio Phi. I'm not sure how this would affect my simple PRNG, and if a correlation between successive digits could be identified, but I started scribbling down numbers on a text pad anyway.
Immediately, however, I found something interesting: If both seeds are even, then the whole sequence of numbers would be even. For example, take the following Fibonacci PRNG:
S1 = 6, S2 = 8, mod 10 6 8 4 2 6 8 4 2 6 8 4 2 ...
There are two problems happening here- first, the period of the PRNG is 4 digits- 6, 8, 4, & 2. Second, because even numbers were chosen for the seeds, even numbers are the only possibility for the PRNG. So, either one of the seeds or the modulus must be odd, or the PRNG algorithm needs to be modified.
At this point, I threw my hands up in the air, and said "screw it". I decided to see what history had discovered with simple PRNGs. Turns out, I wasn't far off. A Fibonacci sequence PRNG exists called the Lagged Fibonacci Generator. Here is how it works:
Sn = Sn-j ⊙ Sn-k mod M, 0 < j < k
Where "⊙" is any binary function, such as addition, subtraction, multiplication, or even the bitwise exclusive-or.
First off, it doesn't address the "all evens" problem with my naive generator. If addition is used to calculate the values, then at least one number in the seed must be odd. If multiplication is used, then at least k-elements must be odd. However, what is interesting about this generator, is that rather than picking the first and second elements of the list to calculate the random value (Si-1 and Si-2), any j-th and k-th items in the list can be used (Si-j and Si-k). However, you must have at least k-elements in the list as your seed before beginning the algorithm.
To simplify things, lets pick "j=3" and "k=7" mod 10 addition. I need at least seven elements in the list, and at least one of them must be odd. I've always like the phone number "867-5309", so let's use that as our seed. Thus, the first 10 steps of our generator would look like this:
j=3, k=7, mod 10 addition [j] [k] 1. 8 6 [7] 5 3 0 [9] => 7+9 = 6 mod 10 2. 6 7 [5] 3 0 9 [6] => 5+6 = 1 mod 10 3. 7 5 [3] 0 9 6 [1] => 3+1 = 4 mod 10 4. 5 3 [0] 9 6 1 [4] => 0+4 = 4 mod 10 5. 3 0 [9] 6 1 4 [4] => 9+4 = 3 mod 10 6. 0 9 [6] 1 4 4 [3] => 6+3 = 9 mod 10 7. 9 6 [1] 4 4 3 [9] => 1+9 = 0 mod 10 8. 6 1 [4] 4 3 9 [0] => 4+0 = 4 mod 10 9. 1 4 [4] 3 9 0 [4] => 4+4 = 8 mod 10 10. 4 4 [3] 9 0 4 [8] => 3+8 = 1 mod 10 Generated: 6 1 4 4 3 9 0 4 8 1
The following Python code should verify our results:
1 2 3 4 5 6 7 8 9 10 11 12 | j = 3 k = 7 s = [8, 6, 7, 5, 3, 0, 9] for n in xrange(10): for i in xrange(len(s)): if i is 0: out = (s[j-1] + s[k-1]) % 10 # the pseudorandom output elif 0 < i < 6: s[i] = s[i+1] # shift the array else: s[i] = out print s[i], # print the result |
Running it verifies our results:
$ python lagged.py 6 1 4 4 3 9 0 4 8 1
It's a "lagged" generator, because "j" and "k" lag behind the generated pseudorandom value. Also, this is called a "two-tap" generator, in that you are using 2 values in the sequence to generate the pseudorandom number. However, a two-tap generator has some problems with randomness tests, such as the Birthday Spacings. Apparently, creating a "three-tap" generator addresses this problem. Such a generator would look like:
Sn = Sn-j ⊙ Sn-k ⊙ Sn-l mod M, 0 < j < k < l
Even though this generator isn't cryptographically secure (hint: it's linear), it meets the above requirements for a good PRNG, provided the "taps" are chosen carefully (the lags are exponents of a primitive polynomial), and the modulus is our traditional "power-of-2" (2M, such as 232 or 264). Supposing we are using a two-tap LFG, it would have a maximum period of:
(2k-1)*k if exclusive-or is used (2k-1)*2M-1 if addition or subtraction is used (2k-1)*2M-3 if multiplication is used (1/4 of period of the additive case)
For a good LFG, it is found that a three-tap generator should be used, as a 3-element spacing correlation can be found in two-tap generators, and that initial taps should be very high for a large modulus. Further, the full mathematical theory hasn't been worked out on Fibonacci generators, so the quality of the generators rests mostly on the statistics of the generated output, and randomness tests.
However, this is simple enough to do by hand, if nothing else than to impress your friends.
{ 2 } Comments