
Haveged - A True Random Number Generator

I admit that my last post sucked. I've been working on a few things that I want to blog about, but it's going to take time to get all my ducks in a row. So, that post was mostly "filler". Read as "I haven't blogged in a while, and should probably put something up".


However, I finally have something you can sink your teeth into: true random number generation. Consider the following scenario: you wish to generate a long-term cryptographic key. Maybe an OpenPGP key, an SSH key, an SSL key, OTR, whatever. What sucks is that some utilities rely on the Linux kernel software device /dev/random as a source of randomness for their seed or numbers. However, /dev/random will block, or stop responding, when the system's entropy estimate is low or exhausted. So, you sit and wait, maybe moving your mouse around and mashing your keyboard, to generate more entropy for the system.
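You can see this blocking for yourself with a quick read straight off the device (a demonstration sketch; how long it stalls depends on your kernel and how drained the pool is):

```shell
# Attempt to read 512 bytes from the blocking pool. On a system with
# little entropy available, this can stall until the kernel gathers more.
dd if=/dev/random of=/dev/null bs=512 count=1

# The non-blocking interface returns immediately regardless:
dd if=/dev/urandom of=/dev/null bs=512 count=1
```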

I personally keep an encrypted file with all the passwords for all the servers, web sites, and accounts that I login to. On a regular basis, I'll rotate through the passwords. So, I typically will use my Password Card as a source for these passwords. But, suppose I wanted to use a password generator instead. The "apg" tool is a good solution.

On your system, run the following:

$ for i in {1..100}; do apg -a 0 -m 12 -x 16 -n 1; done

This will generate 100 passwords of varying length between 12 and 16 characters. Further, it will use a new seed for each of the newly generated passwords. Compare this to passing the '-n 100' switch, which will use a single seed for all 100.
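For contrast, the single-seed version with the same switches is one invocation:

```shell
# One invocation, one seed, all 100 passwords of 12-16 characters:
apg -a 0 -m 12 -x 16 -n 100
```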

You'll likely notice that the generation is slow. If you walk away from your computer, this could take 15 minutes, at least, probably much longer. If you babysit the process, you could probably get it finished in a minute or two. Regardless, it sucks.

We are draining our entropy pool faster than we can keep it filled. So, we must use external, unpredictable events, such as mouse movements, to generate entropy faster. Now, there are hardware true random number generators we could deploy, such as the Entropy Key from Simtec Electronics in the United Kingdom. However, there is a software daemon we can start that can use our existing hardware as a source of true randomness.
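You can watch the kernel's entropy estimate rise and fall as you type and move the mouse:

```shell
# Refresh the kernel's entropy estimate once per second; Ctrl-C to quit.
watch -n 1 cat /proc/sys/kernel/random/entropy_avail
```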

That software is haveged. I won't go into the nitty-gritty about haveged. However, I have run this against a slew of random number tests, and it has passed with flying colors every time. And, as you can write to the /dev/random device, haveged can keep the pool filled. So, the question is, how full can the pool be?

$ cat /proc/sys/kernel/random/poolsize
4096
$ cat /proc/sys/kernel/random/entropy_avail
109

So, by default, the entropy poolsize is 4096 bits, and I only have 109 bits available to the system. Not much. Haveged can fix this:

$ sudo aptitude install haveged
$ cat /proc/sys/kernel/random/entropy_avail

Much better. Now, how quickly will my previous command of generating 100 passwords execute? With haveged running, it took about 2 seconds. And, our entropy pool remains filled.

Try doing this with generating a 4096-bit RSA GnuPG keypair. Without something keeping the entropy pool filled, this can take hours. With haveged running, I can generate it in a minute flat on this wimpy netbook.
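If you want to try this yourself, key generation is interactive; pick RSA and a 4096-bit size at the prompts (command as it worked with the GnuPG versions of the era):

```shell
# Walks through key type, size, expiry, and user ID interactively.
# Without haveged, the "We need to generate a lot of random bytes"
# stage is where it stalls waiting on /dev/random.
gpg --gen-key
```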

Now, if you're skeptical of haveged, and you should be, you can test it against dieharder. However, it will take a while to run. I walked away and did other things while it was running. When I came back, I saw that every test had passed, except one, which came back as "WEAK". So, from my perspective, haveged produced very high quality random bits.
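If you want to reproduce the test, one common approach (assuming both packages are installed) is to pipe haveged's output straight into dieharder:

```shell
# -n 0 tells haveged to write an unbounded stream to stdout;
# dieharder -g 200 reads raw bytes from stdin, -a runs the full battery.
haveged -n 0 | dieharder -g 200 -a
```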

Now, what are the benefits? Do you run a lot of SSH connections to a single server? How about doing regular backups with rsync over SSH? Maybe you have a busy VPN or HTTPS server. It's not uncommon for these real world, common scenarios to produce timeouts or hang, because the entropy pool has been exhausted, and the system must wait for more before continuing. Using haveged fixes this without issue. Even better, haveged barely uses system resources. Your idle SSH connection is likely adding more to the load than haveged. And it delivers true random bits.

{ 10 } Comments

  1. jpope | September 15, 2012 at 9:51 am | Permalink

    Thanks for this post. My webserver had a ~160 bit average entropy according to munin. I hadn't even thought about looking to increase this but, after installing haveged, my server seems to average ~2000.
    And after a little more research, I've found that "hackers" can utilize low entropy situations by guessing some bits. A higher available entropy will help security some by making it more difficult to randomly guess bits.

  2. Marc Deslauriers | September 16, 2012 at 4:31 pm | Permalink

    AFAIK, Havege uses TSC drift as a random source of entropy, but recent CPUs now have an accurate TSC, so havege hasn't been generating random numbers for a while now. This is also an issue in certain virtualized environments.

    PolarSSL had to switch away from using Havege as the basis of their RNG because of this issue.

  3. Jason | September 16, 2012 at 5:00 pm | Permalink

    I don't know where entropy falls into what I'm about to say, but:
    I've heard for years that /dev/random's limitations are mostly overcome by /dev/urandom

    Is entropy not included in that statement?

  4. guwertz | September 17, 2012 at 11:40 am | Permalink

    havege does not use TSC drift. havege relies on variations of execution timing - not the same thing at all. Execution timing in a modern processor depends on all sorts of caching: l1 data, l1 instruction, translation look-aside buffers, branch predictors. In any realistic interrupt driven system, the elapsed time for executing a series of instructions is unpredictable - anyone who has ever benchmarked code will agree. A high resolution timer is needed to scale this effect down to where the benchmark can be performed quickly enough to serve as an RNG. Actually it is not quite that simple because the signal is weak and highly skewed (log normal) and some art is needed to create uniformly distributed output.

    The polar SSL link shows one of the few ways this can go wrong: a badly virtualized TSC. AFAIK, the only way to deal with this is to validate operation at run time. That is what the latest haveged now does. It uses the AIS-31 test suite used in German CC certification to verify correct operation at run time.

  5. Aaron Toponce | September 17, 2012 at 1:46 pm | Permalink

    Marc Deslauriers- I was going to correct your comment, but guwertz has already done so, and explains it fairly well: caching, interrupts, etc.

    Jason- /dev/urandom uses a CSPRNG, and its output is indistinguishable from "true random" numbers. It too draws on its own entropy pool separate from /dev/random, but when exhausted, will turn to using SHA1 to generate the random numbers. As with all performant PRNGs, a period is associated with SHA1, and after so many numbers have been generated, the rest can be predicted. In this case, it's a 160-bit period (which is fairly large).

  6. Marc Deslauriers | September 17, 2012 at 5:05 pm | Permalink

    @guwertz: Interesting, thanks for clearing that up.

  7. Matt Campbell | October 4, 2012 at 11:06 am | Permalink

    So why isn't haveged yet a part of every GNU/Linux distro's base installation? Is it just too obscure, or is there some kind of trade-off?

    Thanks for blogging about this.

  8. Aaron Toponce | October 6, 2012 at 8:15 am | Permalink

    Because the entropy created from the standard system stuff day-to-day, is "good enough" for most users. If you want more, then haveged is available.

  9. True Randomness | July 25, 2014 at 3:14 pm | Permalink

    In the end, “randomness” is all about the definition… just ask a cryptographer. 😉

  10. Aaron Toponce | July 27, 2014 at 7:28 pm | Permalink

    Well, "true random" turns into a philosophical debate very quickly, but what's important isn't the "random" environment the numbers come from so much as this: when observing a sequence of numbers, you still cannot gain any additional knowledge about future generated numbers. In other words, "computationally secure" unpredictability.

    In any event, with "true random" numbers, you still end up subjecting them to a man-made algorithm. So, even if you could prove beyond a shadow of a doubt that you have "true random" numbers, as soon as you subject them to a cryptographically secure algorithm, they are only as "strong" as the algorithm that they were subjected to. The only time "true random" numbers really make any sense is when you are using an information-theoretic algorithm, such as the One Time Pad or Shamir's Secret Sharing.
