
Do XKCD Passwords Work?

You'll always see comments on web forums, social sites, blog posts, and emails about "XKCD passwords". This is of course referring to the XKCD comic by Randall Munroe describing what he thinks is the best password generator:

What almost no one has bothered asking is whether this actually works.

Lorrie Faith Cranor, director of the Carnegie Mellon Usable Privacy and Security Laboratory, a member of the Electronic Frontier Foundation Board of Directors, and Professor in the School of Computer Science and the Engineering and Public Policy Department at Carnegie Mellon University, did ask this question. In fact, she studied it thoroughly enough to give a TED talk on the subject. The transcript of her talk can be found here. Here are the relevant bits (emphasis mine):

Now another approach to better passwords, perhaps, is to use pass phrases instead of passwords. So this was an xkcd cartoon from a couple of years ago, and the cartoonist suggests that we should all use pass phrases, and if you look at the second row of this cartoon, you can see the cartoonist is suggesting that the pass phrase "correct horse battery staple" would be a very strong pass phrase and something really easy to remember. He says, in fact, you've already remembered it. And so we decided to do a research study to find out whether this was true or not. In fact, everybody who I talk to, who I mention I'm doing password research, they point out this cartoon. "Oh, have you seen it? That xkcd. Correct horse battery staple." So we did the research study to see what would actually happen.

So in our study, we used Mechanical Turk again, and we had the computer pick the random words in the pass phrase. Now the reason we did this is that humans are not very good at picking random words. If we asked a human to do it, they would pick things that were not very random. So we tried a few different conditions. In one condition, the computer picked from a dictionary of the very common words in the English language, and so you'd get pass phrases like "try there three come." And we looked at that, and we said, "Well, that doesn't really seem very memorable." So then we tried picking words that came from specific parts of speech, so how about noun-verb-adjective-noun. That comes up with something that's sort of sentence-like. So you can get a pass phrase like "plan builds sure power" or "end determines red drug." And these seemed a little bit more memorable, and maybe people would like those a little bit better. We wanted to compare them with passwords, and so we had the computer pick random passwords, and these were nice and short, but as you can see, they don't really look very memorable. And then we decided to try something called a pronounceable password. So here the computer picks random syllables and puts them together so you have something sort of pronounceable, like "tufritvi" and "vadasabi." That one kind of rolls off your tongue. So these were random passwords that were generated by our computer.

So what we found in this study was that, surprisingly, pass phrases were not actually all that good. People were not really better at remembering the pass phrases than these random passwords, and because the pass phrases are longer, they took longer to type and people made more errors while typing them in. So it's not really a clear win for pass phrases. Sorry, all of you xkcd fans. On the other hand, we did find that pronounceable passwords worked surprisingly well, and so we actually are doing some more research to see if we can make that approach work even better. So one of the problems with some of the studies that we've done is that because they're all done using Mechanical Turk, these are not people's real passwords. They're the passwords that they created or the computer created for them for our study. And we wanted to know whether people would actually behave the same way with their real passwords.

So, in her research, XKCD passwords really didn't work out that well. They are longer, so they take longer to type, which increases the chance of error, and people are no better at remembering an XKCD passphrase than they are at remembering a short string of random characters.

To me, this is unsurprising. If you look at the history of my blogging on passwords, you'll find that I continually advocate true random events to build your passwords, maximizing entropy. In my last post, I even blogged two shell functions that you can use to build XKCD passwords, and "monkey passwords" (monkeys generating passwords by banging away at a keyboard). Both target 80-bits of entropy in the generation. Check out the lengths:

$ gen-monkey-pass 9
cxqwtw63taxdr3zn	uaq4tbt43japmm2q	mptwrxhhb486yfuv
-cb73b9-kgzhmww3	s45t3x6r9smw-7yr	hjkgzkha-qup4gh4
34c5rg4ksw-aprvk	uug-2vq7pfze6dnp	s4qx4eazbnrd2pqe

$ gen-xkcd-pass 9
sorestdanklyAlbanyluckyRamonaFowler   (sorest dankly Albany lucky Ramona Fowler)
towsscareslaudedrobinawardsrenal      (tows scares lauded robin awards renal)
thinkhazelsvealjuggedagingscareen     (think hazels veal jugged agings careen)
tarotpapawsNolanpacketAvonwiped       (tarot papaws Nolan packet Avon wiped)
surgesakimbohardercruelArjunablinds   (surges akimbo harder cruel Arjuna blinds)
amountlopsedgemeaslyCannoninseam      (amount lops edge measly Cannon inseam)
EssexIzmirwizesPattygroutszodiac      (Essex Izmir wizes Patty grouts zodiac)
hoursmailedslamsvowedallowspar        (hours mailed slams vowed allow spar)
AfghanNigelnutriadillmoldertrolly     (Afghan Nigel nutria dill molder trolly)

XKCD passwords average 32 characters to achieve 80-bits of entropy, compared to 16 characters that "monkey passwords" produce. And, according to the research done by Lorrie, people won't necessarily recall XKCD passwords any easier than "monkey passwords". So, if that's the case, then what's the point? Why bother? Why not just create "monkey passwords", and use a password manager?

Exactly. It's 2015. There are password managers for your browser, all versions of every desktop operating system, command-line based utilities for servers, and even apps for your smartphone. There are plenty of "cloud" synchronization services to make sure each instance is up-to-date. At this point, your passwords should:

  • Contain at least 80-bits of entropy.
  • Be truly randomly generated (no influence from you).
  • Be unique for each and every account.
  • Be protected with two-factor authentication, where available.
  • Be stored in an easily accessible password manager.

You'll remember the ones you type in frequently, and you'll memorize them quickly. The others are stored for safe keeping, should you need to recall them.

Password Generation in the Shell

No doubt, some people use password generators; not many, but some. Unfortunately, this means relying on third-party utilities, where the source code may not always be available. Personally, I would rather be in full control of the entire generation stack. I know how to make sure plenty of entropy is available in the generation, and I know which sources of entropy to draw on to maximize the entropy estimate. As such, I don't use tools like pwgen(1), apg(1), or anything else. I rely strictly on /dev/urandom, grep(1), and other tools that ship with practically every BSD and GNU/Linux operating system. So, with that said, here they are:

    # Print NUM (default 1) random 16-character passwords from a 32-symbol set.
    gen-monkey-pass() {
        [[ $(echo "$1"|grep -E '[0-9]+') ]] && NUM="$1" || NUM=1
        for I in $(seq 1 "$NUM"); do
            LC_CTYPE=C strings /dev/urandom|grep -o '[a-hjkmnp-z2-9-]'|head -n 16|paste -sd '' -
        done | column
    }

    # Print NUM (default 1) passphrases of 6 random dictionary words each.
    gen-xkcd-pass() {
        [[ $(echo "$1"|grep -E '[0-9]+') ]] && NUM="$1" || NUM=1
        # Keep only purely alphabetic words of 3 to 6 characters.
        DICT=$(LC_CTYPE=C grep -E '^[A-Za-z]{3,6}$' /usr/share/dict/words)
        for I in $(seq 1 "$NUM"); do
            WORDS=$(echo "$DICT"|shuf -n 6|paste -sd ' ' -)
            XKCD=$(echo -n "$WORDS"|sed 's/ //g')
            echo "$XKCD ($WORDS)"|awk '{x=$1;$1="";printf "%-36s %s\n", x, $0}'
        done | column
    }

Nothing fancy about them. The first function, "gen-monkey-pass", acts like monkeys banging on the typewriter. It reads /dev/urandom directly, keeps only the characters that fall in the allowed set, stops at 16 of them, and puts the results in an orderly set of columns for output as seen below. The input is a total set of 32 characters, giving each character exactly 5 bits of entropy. So, at 16 characters, each password comes with exactly 80 bits of entropy. The character set was chosen to stay with lowercase letters, the digits 2-9, and the hyphen, leaving out the easily confused "i", "l", "o", "0", and "1", so it's clear and easy to type, even though it may still be hard to remember. The function can take a numerical argument, for generating exactly that many passwords:

$ gen-monkey-pass 24
awdq2zwwfcdgzqpm	t54zqxus77zsu6j6	-2h6dkp93bjdb496
thm9m9nusqxuewny	qmsv2vqw-4-q4b4d	ttbhpnh4n7nue5g8
ytt6asky765avkpr	grwhsfmyz872zwk3	mzq-5ytdv8zawhy6
zb46qgnt62k74xwf	uydrsh2axaz5-ymx	6knh32qj4yk885ea
vky55q2ubgaucdnh	5dhk9t97pfja9phj	rhn2qg734p83wnxs
-q2hb833c-54z-9j	t33shcc55e3kqcd6	q6fwn3396h4ygvq4
232hr73rkymerpyg	u2pq-3ytcpc79nb9	7hqqwqujz4mxa-en
jj9vdj3jtpjhwcp6	mqc97ktz-78tb2bp	q7-6jug86kqhjfxn
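
For readers who prefer to check the arithmetic outside the shell, here is a minimal Python 3 sketch of the same idea. It is not the shell function above: it assumes the standard library's secrets module as the randomness source instead of reading /dev/urandom through strings(1), and the names ALPHABET, monkey_pass, and entropy_bits are made up for illustration.

```python
import math
import secrets

# The same 32 symbols matched by the grep pattern '[a-hjkmnp-z2-9-]'.
ALPHABET = "abcdefghjkmnpqrstuvwxyz23456789-"


def monkey_pass(length=16):
    """Return `length` symbols drawn uniformly at random from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def entropy_bits(length=16, alphabet_size=len(ALPHABET)):
    """Entropy of a uniformly random string: length * log2(alphabet size)."""
    return length * math.log2(alphabet_size)


if __name__ == "__main__":
    print(monkey_pass())
    print(entropy_bits())  # 16 symbols * 5 bits per symbol
```

With 32 symbols, each character carries log2(32) = 5 bits, so 16 characters give the 80 bits targeted above.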

The second function, "gen-xkcd-pass" comes from the "correct horse battery staple" comic from XKCD. On most Unix systems, there is a dictionary file installed at /usr/share/dict/words. On Debian GNU/Linux, it contains 99,171 words. However, many of them have the apostrophe as a valid character. Taking out any punctuation and digits, we are left with just lowercase and uppercase characters for our words. Further, the total word space is limited to at least 3 characters in length and at most 6 characters in length. This leaves us with 19,198 words, or about 14.229 bits of entropy per word. This means generating at least 6 words to achieve an 80-bit entropy minimum. For clarity, the space-separated words are repeated to the right in parens, so you can see exactly what the password is, as shown below. Even if all 6 words have 6 characters (the password is 36 characters in total), the formatted line is meant to fit in an 80x24 terminal. It also takes a numerical argument, for generating exactly that many passwords:

$ gen-xkcd-pass 8
flyersepticspantearruinedwoo         (flyer septic span tear ruined woo)
boasgiltCurrywaivegalsAndean         (boas gilt Curry waive gals Andean)
selectpugjoggedlargeArabicbrood      (select pug jogged large Arabic brood)
titshubbubAswancartharmedtaxi        (tits hubbub Aswan cart harmed taxi)
Reaganmodestslowleessamefoster       (Reagan modest slow lees same foster)
tussleFresnoJensentheirsNohhollow    (tussle Fresno Jensen theirs Noh hollow)
Laredoriffplunkbarredhikersrearm     (Laredo riff plunk barred hikers rearm)
demostiffnukesvarlethakegilt         (demo stiff nukes varlet hake gilt)

Of course, as you can see, some fairly obscure words pop out as a result, such as "gilt" and "rearm". But then, you could think of it as expanding your vocabulary. If you install the "american-insane" dictionary, then you can get about 86,851 words in your total set, bringing your per-word entropy north of 16 bits. This would allow you to cut your number of generated words down to 5 instead of 6, while keeping the 80-bit entropy minimum. But then, you also see far more obscure words than with the standard dictionary, and it will take a touch longer to randomize the file.
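
The word counts above follow directly from the dictionary sizes. Here is a small Python 3 sketch of that arithmetic; the function names are hypothetical, and the two sizes are simply the figures quoted in this post.

```python
import math


def bits_per_word(dictionary_size):
    """Entropy of one word picked uniformly from the dictionary."""
    return math.log2(dictionary_size)


def words_needed(dictionary_size, target_bits=80):
    """Smallest number of words whose combined entropy reaches target_bits."""
    return math.ceil(target_bits / bits_per_word(dictionary_size))


if __name__ == "__main__":
    for size in (19198, 86851):
        print(size, round(bits_per_word(size), 3), words_needed(size))
```

The filtered standard dictionary needs 6 words to clear 80 bits, while the larger "american-insane" set gets there with 5.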

Both of the functions should be fairly platform agnostic. If not, let me know what isn't exactly working in your shell, and why, and I'll try to address it.

Setting Up A Global VPN Proxy on Android with L2TP/IPSec PSK

In my last post in this short series, I want to discuss how to set up a transparent proxy on your Android phone using the built-in VPN for L2TP. As usual, the same precautions apply here. Don't be stupid with your data, just because you can hide it from your ISP.

In general, I'm skeptical of VPN service providers, which is partially why I'm writing this post. There isn't a VPN provider on this planet that will go to jail for you. And I don't buy into the hype that they aren't logging your traffic. Too often, VPN providers have been all too hasty to turn over user account information and logs, when Big Brother comes knocking. Instead, install strongSwan on your own L2TP VPN server, in a datacenter you trust to handle your traffic, and configure your Android to use that.

Unlike the previous posts, this one does not require root access. To start, you need to navigate to "Settings -> More -> VPN":

vpn-0 vpn-1

Tap the "+" sign to add a new VPN configuration. In this example, we'll configure it to connect to an L2TP/IPSec PSK VPN. As such, you'll need to fill out the server address (pixelated here), and the IPSec pre-shared key. Give the configuration a name, such as "My VPN", and tap "SAVE".

vpn-2 vpn-3 vpn-4

When tapping on the "My VPN" defined configuration, you will be asked to authenticate with your credentials. These can be from the operating system accounting database, LDAP, NIS, or IPSec specific. Provide your username and password, and tap "Save account information" if you want to save the credentials to disk on the phone. Then tap "CONNECT". At this point, you should see a little key in the status bar, confirming that you are indeed connected to the VPN server. If you want, you can create a "VPN" quick-access widget on your home screen, so you can get immediate access to your "My VPN" configuration with a single tap.

vpn-5 vpn-6 vpn-7

Setting Up A Global Tor Proxy on Android with Orbot

In my last post, I explained how to set up a global SSH proxy on Android with ConnectBot and ProxyDroid. In this article, I'll do the same thing, but with Orbot. Also, as with the last article, the same precautions apply here. If you're on an untrusted or unknown network, using an encrypted proxy can be helpful. However, just because you're using Tor doesn't mean you should trust its network blindly either. There are all sorts of practical attacks on Tor that have been reaching the press lately, and you would be wise to read about them, and proceed with caution.

With that said, sometimes all you want to do is get around a content filter, such as viewing Reddit at church, or getting on Twitter while at work. Of course, there are necessary risks with those actions as well. Basically, don't be an idiot.

With that out of the way, this requires that you have root access on your phone, and that you have installed the Orbot Android app. Once the app is installed, we really only need to make one adjustment, and that is enabling two check boxes: "Transparent Proxying" and "Tor Everything":


As something you should keep in mind, you may also want to check "Use Bridges". Relay bridges are entry nodes that are not listed in the main Tor directory. As such, it is more difficult for ISPs to filter them. If you suspect that your ISP is blocking all known entry nodes, then using bridges can be helpful to get around the problem. But, using bridges may be unnecessary. Check if your Tor connection is getting filtered first. If so, enable the use of bridges, otherwise, you're just fine using Tor without them.

Also, Orbot has some interesting settings, such as setting a whitelist of entry and exit nodes, and a blacklist of nodes to avoid. If you know someone is operating a Tor node, and you trust them, then I would recommend setting them as either an entry or exit, whichever is appropriate. The reason for this is that it is not impractical for a well-funded organization to run a large number of entry and exit nodes. If so, they can build traffic profiles on who is connecting to the entry node, and which site they are visiting from the exit. However, by specifying specific nodes for either entry or exit (or both), you eliminate this threat. Sadly enough, I could not get this working with Orbot.

One last setting that has caught my eye, is "Tor Tethering". If you use your phone as a wireless hotspot, or USB tethering, you can also transparently route all the traffic from those connected clients through the Tor proxy. I haven't tested this yet with the latest version, but with previous versions of Orbot, it didn't work.

Other settings are listed below, page after page.

orbot-1 orbot-3 orbot-4 orbot-5 orbot-6

When at the main page of the app, long-tap the power button in the center of the droid to connect to the Tor network. When the arms of the droid are down, you are not connected. When the arms are yellow, and pointing to the sides of the phone, the app is trying to get a connection to the Tor network. When the arms are green, pointing up, you are fully connected, and can start enjoying your proxy.

orbot-0 orbot-7 orbot-8

Notice that when you are connected, an onion icon is in the status bar at the top of the phone, showing as a permanent notification. If you have "Expanded Notifications" set, you can get IP address and country information in the notification. If you swipe the droid right or left, the droid will spin, and you will end up with a new "Tor Identity". Basically, you'll be connected to a new set of nodes.

orbot-9 orbot-10 orbot-11

Tapping the "CHECK BROWSER" button at the bottom left of the landing screen will use your default browser app to connect to and verify whether or not transparent proxying over Tor is working.

Setting Up A Global SSH Proxy on Android with ConnectBot and ProxyDroid

I'm one that takes precautions with my data when on unfamiliar or untrusted networks. While for the most part, I trust TLS to handle my data securely, I find that it doesn't take much effort to set up a transparent proxy on my Android handset, to route all packets through an encrypted proxy.

In this case, I happen to work for the greatest ISP in the world, and so I have an SSH server in the datacenter. I wholly trust the network from my SSH server to the border routers, so the more traffic I can send that direction, the better. I realize that may not be the case for all of you. However, if you have an externally available SSH server on a trusted network, this post may be of interest.

First, setting up this proxy requires having root. I'm not going to cover how to get root in this post. You can find it elsewhere. Next, you'll need two apps installed, namely ConnectBot and ProxyDroid. Both are Free Software apps. Also, you can do this with SSH Tunnel on its own, if you have Android 4.2.2 or older. Unfortunately, it doesn't work for 4.3 and newer. I have Android 5.1, and it isn't setting up the firewall rules correctly.

Once they are installed, you'll want to set them up. Here I walk through setting up ConnectBot.

  1. Pull up ConnectBot from your app drawer, and setup a new connection by typing in the username, host, and optionally port.
  2. When asked if you want to accept the server's public SSH key, verify the key, then tap "YES"
  3. Enter in your password to connect, and verify that you can successfully connect to the remote SSH server.
  4. Now, disconnect, sending you back to the app's landing screen.

connectbot-1 connectbot-2 connectbot-3

  1. At this point, long-tap the SSH profile you just created, and tap "Edit port forwards".
  2. Tap the menu in the upper-right hand corner of the profile, and tap "Add port forward".
  3. Give the forward a nickname, such as "ProxyDroid".
  4. Tap "Dynamic (SOCKS)" from the list under "Type".
  5. Provide any source port. It must be above 1024, and cannot be currently in use. I find "1984" apropos.
  6. Leave the "Destination" blank, and tap "CREATE PORT FORWARD".

connectbot-4 connectbot-5 connectbot-6 connectbot-7

You have now successfully created a SOCKS listening port on localhost:1984. Now, we need to create software firewall rules in the phone, to globally forward all packets through localhost on port 1984, creating our transparent proxy. As such, pull up ProxyDroid, and I'll walk you through setting that up:

  1. In ProxyDroid, set "" as the "Host".
  2. Match the port with what you set in ConnectBot's port forward ("1984" in our example).
  3. Set the "Proxy Type" to "SOCKS5"
  4. Scroll to the bottom of the app, and check the checkbox for "Global Proxy".
  5. OPTIONAL: Check the checkbox for "DNS Proxy".

That last step will also tunnel DNS requests through the proxy. Unfortunately, I have found it to be buggy and unstable, so leaving it unchecked is what gives you a stable encrypted SSH proxy experience.


Now that both are configured, connect to your remote SSH server with ConnectBot that you have configured, then enable the proxy by tapping the slider next to "Proxy Switch". You should have a running global SSH proxy from your smartphone to the remote SSH server, where all packets are being sent. You can visit a site that returns your external IP address, such as, to verify that the source IP address of the HTTP request is the same IP address as your SSH server. If so, your packets are being tunneled through your SSH connection.


md5crypt() Explained

Recently, the Password Hashing Competition announced its winner, namely Argon2, as the future of password hashing. It's long since been agreed that using general-purpose cryptographic hashing algorithms for passwords is not a best practice. This is due to their speed. Cryptographic hashing algorithms are designed to be lightning fast, while also maintaining large margins of security. However, Poul-Henning Kamp noticed in the early 1990s that the DES-based crypt() function was no longer providing the necessary margins of security for hashing passwords. He noticed how fast crypt() had become, and that greatly bothered him. Even worse was the realization that FPGAs could mount practical attacks against crypt() in a reasonable amount of time. As he was the FreeBSD release engineer, this meant putting something together that was intentionally slow, but also with safe security margins. He chose MD5 as the basis for his new "md5crypt password scrambler", as he called it.

Before delving into the algorithm, the first thing you'll notice is the strange number of steps and mixing that PHK does with his md5crypt() algorithm. When I was reading the algorithm, the first question that popped into my mind was: "Why not just do standard key-stretching with the password?" Something like this (pseudocode):

digest = md5(password + salt).digest()
rounds = 1000
while rounds > 0:
  digest = md5(password + salt + digest).digest()
  rounds -= 1

This certainly seems to be the most straightforward approach, and the entirety of the security is based on the cryptographic security of MD5. If you were concerned about the output digest being recognizable, it might make sense to scramble it. You could scramble the remaining bytes in a deterministic fashion, which PHK actually ends up doing before saving to disk.

But then it hit me: PHK wanted his new algorithm to be intentionally slow, even if using MD5. This means adding additional steps to mixing the password, which requires more CPU, and thus, more time. If raw MD5 could process 1,000,000 hashes per second, then standard key-stretching of 1,000 iterations would bring it down to 1,000 hashes per second. However, if the additional operations slow each iteration down by a further factor of N, then the resulting throughput would be 1,000/N hashes per second. I can see it now: anything to slow down the process, without overburdening the server, is a gain. As such, the md5crypt() function was born.
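
To make the "standard key-stretching" comparison concrete, here is a minimal Python 3 sketch of the naive approach described above. It is not md5crypt() and is not suitable for real password storage; the function name stretch and its parameters are assumptions made for illustration.

```python
from hashlib import md5


def stretch(password: bytes, salt: bytes, rounds: int = 1000) -> bytes:
    """Naive key-stretching: feed the previous digest back in `rounds` times."""
    digest = md5(password + salt).digest()
    for _ in range(rounds):
        digest = md5(password + salt + digest).digest()
    return digest


if __name__ == "__main__":
    # Roughly 1,000 MD5 computations per guess instead of one.
    print(stretch(b"toomanysecrets", b"2Z4e3j5f").hex())
```

Every extra round costs the attacker the same work it costs the server, which is the whole point of iterating.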

Here is the algorithm, including what I think may be a bug:

  1. Set some constants:
    "pw" = user-supplied password.
    "pwlen" = length of "pw".
    "salt" = system-generated random salt, 8-characters, from [./0-9A-Za-z].
    "magic" = the string "$1$".
    "itoa64" = our custom base64 string "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
  2. Initialize digest "a", and add the password, magic, and salt strings to it:
    da = MD5.init()
    da.update(pw)
    da.update(magic)
    da.update(salt)
  3. Initialize digest "b", and add the password, salt, and password strings to it:
    db = MD5.init()
    db.update(pw)
    db.update(salt)
    db.update(pw)
    final = db.digest()
  4. Update digest "a" by repeating digest "b", providing "pwlen" bytes:
    for(i = pwlen; i > 0; i -= 16):
      if(i > 16):
        da.update(final)
      else:
        da.update(final[0:i])
  5. Clear virtual memory:
    memset(final, 0, length(final))
  6. Update digest "a" by adding a character at a time from either digest "final" or from "pw" based on each bit from "pwlen":
    for(i = pwlen; i; i >>= 1):
      if i % 2 == 1:
        da.update(final[0])
      else:
        da.update(pw[0])
    dc = da.digest()
  7. Iterate 1,000 times to prevent brute force password cracking from going too fast. Mix the MD5 digest while iterating:
    for(i=0; i<1000; i++)
      tmp = MD5.init()
      if i % 2 == 0:
        tmp.update(dc)
      else:
        tmp.update(pw)
      if i % 3 != 0:
        tmp.update(salt)
      if i % 7 != 0:
        tmp.update(pw)
      if i % 2 == 0:
        tmp.update(pw)
      else:
        tmp.update(dc)
      dc = tmp.digest()
  8. Convert 3 8-bit words of digest "c" into 4 6-bit words:
    final = ''
    for a, b, c in ((0, 6, 12), (1, 7, 13), (2, 8, 14), (3, 9, 15), (4, 10, 5)):
      v = ord(dc[a]) << 16 | ord(dc[b]) << 8 | ord(dc[c])
      for i in range(4):
        final += itoa64[v & 0x3f]
        v >>= 6
    v = ord(dc[11])
    for i in range(2):
      final += itoa64[v & 0x3f]
      v >>= 6
  9. Clear virtual memory:
    memset(dc, 0, length(dc))

Notice that between steps 5 and 6, the virtual memory is cleared, leaving the digest "final" as NULLs. Yet, in step 6, the for-loop attempts to address the first byte of digest "final". It seems clear that PHK introduced a bug in this algorithm, that was never fixed. As such, every implementation must add a C NULL in step 6, instead of final[0]. Otherwise, you will end up with a different output than the original source code by PHK.

Anyway, that's the algorithm behind md5crypt(). Here's a simple Python implementation that creates valid md5crypt() hashes:

# Python 3
from hashlib import md5

# $ mkpasswd --method='md5' --salt='2Z4e3j5f' --rounds=1000 --stdin 'toomanysecrets'
# $1$2Z4e3j5f$sKZptx/P5xzhQZ821BRFX1

# hashlib works on bytes, so the inputs are byte strings
pw = b"toomanysecrets"
salt = b"2Z4e3j5f"

magic = b"$1$"
pwlen = len(pw)
itoa64 = "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

# Start digest "a"
da = md5(pw + magic + salt)

# Create digest "b"
db = md5(pw + salt + pw).digest()

# Update digest "a" by repeating digest "b", providing "pwlen" bytes:
i = pwlen
while i > 0:
    da.update(db if i > 16 else db[:i])
    i -= 16

# Update digest "a" by adding either a NUL byte or the first byte of "pw",
# one byte per bit of "pwlen"
i = pwlen
while i:
    da.update(b"\x00" if i & 1 else pw[:1])
    i >>= 1
dc = da.digest()

# iterate 1000 times to slow down brute force cracking
for i in range(1000):
    tmp = md5(pw if i & 1 else dc)
    if i % 3: tmp.update(salt)
    if i % 7: tmp.update(pw)
    tmp.update(dc if i & 1 else pw)
    dc = tmp.digest()

# convert 3 8-bit words to 4 6-bit words (indexing bytes yields integers)
final = ''
for x, y, z in ((0, 6, 12), (1, 7, 13), (2, 8, 14), (3, 9, 15), (4, 10, 5)):
    v = dc[x] << 16 | dc[y] << 8 | dc[z]
    for i in range(4):
        final += itoa64[v & 0x3f]
        v >>= 6
v = dc[11]
for i in range(2):
    final += itoa64[v & 0x3f]
    v >>= 6

# output the result
print("{0}${1}${2}".format(magic.decode(), salt.decode(), final))

Ulrich Drepper created a "sha256crypt()" as well as "sha512crypt()" function, which is very similar in design, and which I'll blog about later.

It's important to note that while PHK may have announced md5crypt() as insecure, it's not for the reasons you think. Yes, MD5 is broken, horribly, horribly broken. However, these breaks only deal with the compression function and blind collision attacks. MD5 is not broken with respect to preimage or second-preimage attacks. In the case of a stored md5crypt() hash, it requires either a brute force search or a preimage attack to find the plaintext that produced the hash, and MD5 is still secure against preimage attacks. The reason md5crypt() has been deemed "insecure" is that MD5 is fast, fast, fast. Instead, password hashing should be slow, slow, slow, and no amount of creativity with MD5 can adequately address its performance. As such, you should migrate to a password hashing solution designed specifically to slow attackers, such as bcrypt or scrypt, with appropriate parameters for security margins.

The Chaocipher With Playing Cards

As you know, I am a cryptography hobbyist. More specifically, I have an interest in pencil and paper ciphers, also referred to as "hand ciphers" or "field ciphers". Since Bruce Schneier released his Solitaire Cipher for Neal Stephenson's book "Cryptonomicon" (known in the book as "Pontifex"), I have had a real desire to learn hand ciphers with playing cards, that I'll refer to as "card ciphers".

Further, in 1918, John F. Byrne invented a mechanical encryption system that he called "Chaocipher". He released an autobiography titled "The Silent Years", in which he describes the system without the algorithm, and releases a series of exhibits of ciphertexts for cryptography experts to break.

Unfortunately, I think because he didn't release the algorithm, no one took his encryption system seriously, despite his best efforts to get the War Department to use it. It wasn't until 2010, when the widow of John F. Byrne's son released the Chaocipher papers, mechanics, and artifacts to the National Cryptologic Museum in Maryland, that we finally fully understood the algorithm.

In this post, I am going to describe the algorithm using playing cards, whereas John F. Byrne's original invention required two circular rotating disks. Another hindering aspect to Byrne's invention was the mechanical engineering required. At best, the device is clunky and awkward to use. However, using playing cards, I think you'll find the system much more elegant and easier to carry around, with the added advantage that a deck of cards does not give away that you are carrying a cryptographic device. Playing cards were certainly available in the early 1900s, so it's unfortunate that he didn't think of using playing cards as the primary mechanism for the cipher.


The Chaocipher uses the concept of lookup tables to encrypt or decrypt a message. This is done by maintaining two separate alphabets. When encrypting a message, a character is located in the plaintext alphabet, and its location is noted. Then, the ciphertext character is identified by locating the character in the ciphertext alphabet at the same position. After the ciphertext character has been recorded, both the plaintext and ciphertext alphabets are permuted. We'll get into those details in a moment, but first, let's set aside some definitions.

Because John F. Byrne's original invention required two circular disks, there are two definitions that you should be aware of:

Zenith
The top of the wheel or circle. In our case, this will be the top of the pile.
Nadir
The bottom of the wheel or circle. In our case, this will be in the middle of the pile (the 14th of 26 cards).
Deck
All 52 cards in a standard poker deck.
Pile
Either the red playing cards (Diamonds and Hearts) dedicated to the ciphertext alphabet, or the black playing cards (Clubs and Spades) dedicated to the plaintext alphabet. Each pile is exactly 26 cards.
Left alphabet
The red pile of playing cards containing the ciphertext characters A-Z.
Right alphabet
The black pile of playing cards containing the plaintext characters A-Z.

We will be treating our two piles (the red and black piles) as circular. The piles will always be face-up on the table and in our hands. The top card in the face-up pile will be the 1st card while the bottom card will be the 26th card. Because the pile is circular in nature, this means that the top and bottom cards in the pile are "next" to each other in succession. This means further, then, that the 14th card in the pile is our nadir, while the top card, or the 1st card in the pile, is our zenith.

Now that we've set that aside, we need to create some definitions so we know exactly which playing card in every suit is assigned to which English alphabet character. I've assigned them as follows:

Hearts and Spades             Clubs and Diamonds
A B C D E F G H I J  K L M    N O P Q R S T U V W  X Y Z
A 2 3 4 5 6 7 8 9 10 J Q K    A 2 3 4 5 6 7 8 9 10 J Q K

This means that the English character "X" would be the Jack of Clubs in the plaintext "black" pile ("right alphabet" in Chaocipher-speak), and the Jack of Diamonds in the ciphertext "red" pile ("left alphabet" in Chaocipher-speak). This also means that the 8 of Spades would be the English character "H", just as much as the 8 of Hearts.
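If you want to double-check the table in software, a minimal Python sketch could look like the following. The function names are made up for illustration, and "T" stands for the ten, as in the pile listings further down.

```python
RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "T", "J", "Q", "K"]

def card_to_letter(card):
    """Map a card such as 'JC' or '8H' to its letter (hypothetical helper)."""
    rank, suit = card[0], card[1]
    # Hearts and Spades carry A-M, Clubs and Diamonds carry N-Z.
    offset = 0 if suit in "HS" else 13
    return chr(ord("A") + offset + RANKS.index(rank))

def letter_to_card(letter, colour):
    """Map a letter to its card in the 'red' or the 'black' pile."""
    index = ord(letter) - ord("A")
    if colour == "red":
        suit = "H" if index < 13 else "D"
    else:
        suit = "S" if index < 13 else "C"
    return RANKS[index % 13] + suit

print(card_to_letter("JC"), card_to_letter("JD"))  # X X
print(card_to_letter("8S"), card_to_letter("8H"))  # H H
print(letter_to_card("A", "black"))                # AS
```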

On a side note, if you wish to program this in software, and you populate an array with the values of 1 through 52 to represent each card in the deck, it's standard to use 1-13 for the Clubs, 14-26 for the Diamonds, 27-39 for the Hearts, and 40-52 for the Spades ("bridge order").
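As a rough sketch of that numbering, assuming each suit runs Ace low through King (the text above only fixes the suit ranges, so the rank order is my assumption), the conversion could be written like this. Both function names are hypothetical.

```python
RANKS = "A23456789TJQK"   # assumed rank order within a suit, Ace low
SUITS = "CDHS"            # bridge order: Clubs, Diamonds, Hearts, Spades

def number_to_card(n):
    """Convert 1..52 into a two-character card label such as 'AC'."""
    if not 1 <= n <= 52:
        raise ValueError("card number must be between 1 and 52")
    suit, rank = divmod(n - 1, 13)
    return RANKS[rank] + SUITS[suit]

def card_to_number(card):
    """Inverse of number_to_card."""
    return SUITS.index(card[1]) * 13 + RANKS.index(card[0]) + 1

print(number_to_card(1), number_to_card(14), number_to_card(27), number_to_card(52))
# AC AD AH KS
```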


The algorithm can consist of a standard "simple" set of steps, or it can use a more advanced "takeoff pattern" for enciphering and deciphering the message. First, let me discuss the simple pattern, then I'll look at the more advanced takeoff pattern.

An overview of the algorithm could be described as follows:

  1. Determine the ciphertext character according to the plaintext character (and vice versa for decryption).
  2. Permute the red pile.
  3. Permute the black pile.

These three steps are executed in order for each character until the message is exhausted.

Enciphering Plaintext

Looking at it in closer detail, suppose I had the following red and black piles (using "+" to identify the zenith, and "*" to identify the nadir):

            +                                      *
  red (ct): 7D 3H TH 2H JD AD 8H 8D 5H TD QH 9H JH 2D 6D KH QD 9D 5D KD AH 7H 6H 4H 4D 3D
black (pt): TC 3S JS 2C 5S AC 4C KC 9S TS 9C 6S 7S 8S QS QC 7C JC 4S 3C 8C AS 2S 5C KS 6C

If I wanted to encrypt the character "A", in the black pile, according to our table above, that would be the Ace of Spades. As such, I need to find the Ace of Spades in my black pile. While locating the card, I need to be counting, so I know the position in the pile that the Ace of Spades is in. In this case, the Ace of Spades is the 22nd card in the black pile. Thus, the 22nd card in the red pile is the Seven of Hearts:

            +                                      *                       ↓
  red (ct): 7D 3H TH 2H JD AD 8H 8D 5H TD QH 9H JH 2D 6D KH QD 9D 5D KD AH 7H 6H 4H 4D 3D
black (pt): TC 3S JS 2C 5S AC 4C KC 9S TS 9C 6S 7S 8S QS QC 7C JC 4S 3C 8C AS 2S 5C KS 6C

The Seven of Hearts produces the English character "G". Thus, with these two piles, "A" encrypts to "G" before permutation. Conversely, "G" would decrypt to "A" with these starting piles.
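The lookup on its own, before any permutation, is just "same position in the other pile". A small sketch using the two piles above (the helper names are invented for this example):

```python
RED = "7D 3H TH 2H JD AD 8H 8D 5H TD QH 9H JH 2D 6D KH QD 9D 5D KD AH 7H 6H 4H 4D 3D".split()
BLACK = "TC 3S JS 2C 5S AC 4C KC 9S TS 9C 6S 7S 8S QS QC 7C JC 4S 3C 8C AS 2S 5C KS 6C".split()

def encipher_card(plain_card):
    """Return (1-based position, red card at that position)."""
    position = BLACK.index(plain_card)
    return position + 1, RED[position]

def decipher_card(cipher_card):
    """Return (1-based position, black card at that position)."""
    position = RED.index(cipher_card)
    return position + 1, BLACK[position]

print(encipher_card("AS"))  # (22, '7H')
print(decipher_card("7H"))  # (22, 'AS')
```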

Permuting the Red and Black Piles

Now that we've discovered our plaintext and ciphertext characters, we need to cut each pile, such that the plaintext and ciphertext characters are at the zenith of their respective piles. The resulting piles would then be as follows:

            +                                      *
  red (ct): 7H 6H 4H 4D 3D 7D 3H TH 2H JD AD 8H 8D 5H TD QH 9H JH 2D 6D KH QD 9D 5D KD AH
black (pt): AS 2S 5C KS 6C TC 3S JS 2C 5S AC 4C KC 9S TS 9C 6S 7S 8S QS QC 7C JC 4S 3C 8C

Permuting the Red Pile

Permuting the red pile takes two steps:

  1. Remove the zenith + 1 card (2nd card) from the red pile.
  2. Place the removed card into the nadir of the red pile (will be the 14th card).

So, we'll follow these steps by taking the zenith + 1 card (2nd card), which is the "6H", and placing it at the nadir of the red pile (14th card). The resulting red pile will look as follows:

            +                                      *
  red (ct): 7H .. 4H 4D 3D 7D 3H TH 2H JD AD 8H 8D 5H TD QH 9H JH 2D 6D KH QD 9D 5D KD AH

            +                                      *
  red (ct): 7H 4H 4D 3D 7D 3H TH 2H JD AD 8H 8D 5H .. TD QH 9H JH 2D 6D KH QD 9D 5D KD AH

            +                                      *
  red (ct): 7H 4H 4D 3D 7D 3H TH 2H JD AD 8H 8D 5H 6H TD QH 9H JH 2D 6D KH QD 9D 5D KD AH
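In code, the red permutation is a pop followed by an insert. This is only a sketch on a Python list (index 0 is the zenith, index 13 the nadir), and `permute_red` is a name I've picked for the example:

```python
def permute_red(pile):
    """Move the zenith + 1 card (2nd card) to the nadir (14th position)."""
    pile = list(pile)        # work on a copy
    card = pile.pop(1)       # remove the 2nd card; the rest slide up
    pile.insert(13, card)    # it becomes the 14th card
    return pile

before = "7H 6H 4H 4D 3D 7D 3H TH 2H JD AD 8H 8D 5H TD QH 9H JH 2D 6D KH QD 9D 5D KD AH".split()
print(" ".join(permute_red(before)))
# 7H 4H 4D 3D 7D 3H TH 2H JD AD 8H 8D 5H 6H TD QH 9H JH 2D 6D KH QD 9D 5D KD AH
```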

Permuting the Black Pile

Permuting the black pile takes three steps:

  1. Take the zenith (top) card and place it at the bottom of the black pile.
  2. Remove the zenith + 2 card (3rd card) from the black pile.
  3. Place the removed card into the nadir of the black pile (will be the 14th card).

So, we'll follow these steps by taking the zenith card (top card), which is the "AS", and placing it at the bottom of the black pile. The resulting black pile will look as follows:

            +                                      *
black (pt): 2S 5C KS 6C TC 3S JS 2C 5S AC 4C KC 9S TS 9C 6S 7S 8S QS QC 7C JC 4S 3C 8C AS

Now take the zenith + 2 (3rd card), which is the "KS" and place it at the nadir of the black pile (14th card). The final black pile will look as follows:

            +                                      *
black (pt): 2S 5C .. 6C TC 3S JS 2C 5S AC 4C KC 9S TS 9C 6S 7S 8S QS QC 7C JC 4S 3C 8C AS

            +                                      *
black (pt): 2S 5C 6C TC 3S JS 2C 5S AC 4C KC 9S TS .. 9C 6S 7S 8S QS QC 7C JC 4S 3C 8C AS

            +                                      *
black (pt): 2S 5C 6C TC 3S JS 2C 5S AC 4C KC 9S TS KS 9C 6S 7S 8S QS QC 7C JC 4S 3C 8C AS
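The black permutation can be sketched the same way, again treating index 0 as the zenith and index 13 as the nadir. `permute_black` is a hypothetical name:

```python
def permute_black(pile):
    """Top card to the bottom, then move the 3rd card to the nadir."""
    pile = list(pile[1:]) + list(pile[:1])   # zenith goes to the bottom
    card = pile.pop(2)                       # remove the zenith + 2 card
    pile.insert(13, card)                    # it becomes the 14th card
    return pile

before = "AS 2S 5C KS 6C TC 3S JS 2C 5S AC 4C KC 9S TS 9C 6S 7S 8S QS QC 7C JC 4S 3C 8C".split()
print(" ".join(permute_black(before)))
# 2S 5C 6C TC 3S JS 2C 5S AC 4C KC 9S TS KS 9C 6S 7S 8S QS QC 7C JC 4S 3C 8C AS
```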

As such, both the red and black piles should look like the following after enciphering the plaintext character "A" and permuting both piles:

            +                                      *
  red (ct): 7H 4H 4D 3D 7D 3H TH 2H JD AD 8H 8D 5H 6H TD QH 9H JH 2D 6D KH QD 9D 5D KD AH
black (pt): 2S 5C 6C TC 3S JS 2C 5S AC 4C KC 9S TS KS 9C 6S 7S 8S QS QC 7C JC 4S 3C 8C AS

To summarize, the algorithm steps are as follows:

  1. Find the plaintext character in the black pile.
  2. Record the position of this card in the black pile.
  3. Find the ciphertext character in the red pile by counting to that position.
  4. Bring the plaintext character to the zenith by cutting the black pile at that position.
  5. Bring the ciphertext character to the zenith by cutting the red pile at that position.
  6. Permute the red pile:
    1. Remove the zenith + 1 card from the red pile (2nd card).
    2. Insert the removed card into the nadir of the red pile (14th location).
  7. Permute the black pile:
    1. Move the card at the zenith to the bottom of the black pile.
    2. Remove the zenith + 2 card from the black pile (3rd card).
    3. Insert the removed card into the nadir of the black pile (14th location).
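Put together, the seven steps fit in one small function. This is a sketch rather than a reference implementation: `chao_step` and `crypt` are names I made up, and the piles are plain lists of symbols (cards or letters both work). Decryption is the same procedure with the lookup direction reversed.

```python
def chao_step(red, black, symbol, decrypt=False):
    """Process one symbol; returns (output symbol, new red, new black)."""
    source, target = (red, black) if decrypt else (black, red)
    pos = source.index(symbol)                       # steps 1-2: find and count
    out = target[pos]                                # step 3: same position, other pile
    red = list(red[pos:]) + list(red[:pos])          # step 5: cut the red pile
    black = list(black[pos:]) + list(black[:pos])    # step 4: cut the black pile
    red.insert(13, red.pop(1))                       # step 6: 2nd card to the nadir
    black = black[1:] + black[:1]                    # step 7.1: top card to the bottom
    black.insert(13, black.pop(2))                   # steps 7.2-7.3: 3rd card to the nadir
    return out, red, black

def crypt(red, black, symbols, decrypt=False):
    """Run a whole message through the piles."""
    out = []
    for symbol in symbols:
        result, red, black = chao_step(red, black, symbol, decrypt)
        out.append(result)
    return out

red0 = "7D 3H TH 2H JD AD 8H 8D 5H TD QH 9H JH 2D 6D KH QD 9D 5D KD AH 7H 6H 4H 4D 3D".split()
black0 = "TC 3S JS 2C 5S AC 4C KC 9S TS 9C 6S 7S 8S QS QC 7C JC 4S 3C 8C AS 2S 5C KS 6C".split()
card, red1, black1 = chao_step(red0, black0, "AS")
print(card)              # 7H
print(" ".join(red1))    # 7H 4H 4D 3D 7D 3H TH 2H JD AD 8H 8D 5H 6H TD ...
print(" ".join(black1))  # 2S 5C 6C TC 3S JS 2C 5S AC 4C KC 9S TS KS 9C ...
```

Because both sides cut at the same position, the piles evolve identically whether you are encrypting or decrypting, which is why a single miscount ruins everything after it.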

Make sure you understand these steps before continuing.

Permuting with a Takeoff Pattern

John F. Byrne described a "takeoff pattern" in which the left and right alphabets are used for both the plaintext and ciphertext characters. In the simple method, the right alphabet (black pile) is used exclusively for all plaintext characters in the message. So, if the plaintext message was "ATTACKATDAWN", then you could think of using the right pile 12 times, or "RRRRRRRRRRRR" ("BBBBBBBBBBBB" if we're thinking "black pile").

However, suppose you would like to use both of the red and black piles (left and right alphabets respectively) for your plaintext message. Then you could create a "takeoff pattern" for encrypting your text. Suppose you used the following takeoff pattern: "RLRRLLRRRLLL" (right, left, right, right, left, left, right, right, right, left, left, left). This means that you would use the right alphabet for the first plaintext character, then the left alphabet for the second plaintext character, the right alphabet for the 3rd, the right alphabet for the 4th, etc. Or, if using playing cards, you could think of the same takeoff pattern as "BRBBRRBBBRRR" (black, red, black, black, red, red, black, black, black, red, red, red).

Personally, I don't care for the takeoff pattern for two main reasons: first, the takeoff pattern needs to be communicated with the key. This may not be a problem if code books are distributed among field agents, as the takeoff pattern can be printed on the same page as the key. However, this does mean that the takeoff pattern needs to be as long as the key itself.

The second reason I don't care for the takeoff pattern is its unnecessary complexity, which greatly increases the chances of making a mistake. Already, the sender and recipient will be going back and forth frequently between the red and black pile of cards. By creating a takeoff pattern, this makes that back and forth more frequent. Further, if you are using the 3rd of 5 "L"s in a stream, but you think you are on the 4th "L", then the encryption or decryption will be wrong from there out. Chaocipher doesn't have the ability to correct itself from a mistake.

For these two reasons, I suggest that when using playing cards with the Chaocipher, you always use the black pile for the plaintext characters, and the red pile for the ciphertext characters. Then, the only thing that you need to keep track of is the characters in the message itself.

Keying the Deck

Before executing the Chaocipher algorithm, the deck should be "keyed". This refers to the order of the deck. Both the sender and the recipient must have the same deck order in order to successfully encrypt and decrypt a message. The deck can be keyed by either a sufficient set of shuffling and cutting, or keyed with a key phrase. First, let's look at thoroughly shuffling and cutting a full 52-card deck.

Keying with Shuffling and Cutting

Suppose after thoroughly shuffling and cutting the deck, the deck order face-up is as follows:

|< - top                                                                                                                                         bottom ->|
3H 8D QH 4C 6S QS 8C 4D 9S 5D 8S QC 3C 6H JS 7H 5S TS QD 7C 4H JC KD TH 3S KS 6D 9C 9D 2C JD 2H 2D 6C 8H KC 9H JH 7S KH AS AH 5C AD TC 7D 4S 3D 2S TD 5H AC

We now need a deterministic algorithm for separating the red cards from the black cards. Holding the deck face-up in your hand, deal out two face-down piles, the left pile of red cards, and the right pile of black cards. Do this card-for-card, one-at-a-time. Do not grab a bunch of similarly-colored cards. This can introduce error into the keying process. Doing it one-at-a-time ensures exactness, and minimizes the chance of a mistake.

After the full deck has been dealt into two face-down piles, turn the piles over, so they are face-up. Using the standard Chaocipher tokens of "+" to identify the zenith, or top of the pile, and the "*" to identify the nadir, or 14th card in the pile, your two piles should be in the following order:

            +                                      *
  red (ct): 3H 8D QH 4D 5D 6H 7H QD 4H KD TH 6D 9D JD 2H 2D 8H 9H JH KH AH AD 7D 3D TD 5H
black (pt): 4C 6S QS 8C 9S 8S QC 3C JS 5S TS 7C JC 3S KS 9C 2C 6C KC 7S AS 5C TC 4S 2S AC
  position:  1  2  3  4  5  6  7  8  9  1  1  1  1  1  1  1  1  1  1  2  2  2  2  2  2  2
                                        0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5  6

Verify that you can do this by hand, and that it matches with the deck order above. Remember, the red pile is our "left alphabet" in the Chaocipher which contains all ciphertext English characters. The black pile is our "right alphabet" in the Chaocipher which contains all plaintext English characters. In other words, if we converted them to English characters, then the left and right alphabets would be as follows, using the same notation to identify the zenith and nadir:

            +                         *
 left (ct): C U L Q R F G Y D Z J S V X B O H I K M A N T P W E
right (pt): Q F L U I H Y P K E J T X C M V O S Z G A R W D B N
  position: 1 2 3 4 5 6 7 8 9 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2
                              0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6
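To verify the split without dealing cards, here is a short sketch that separates the shuffled deck above by colour, preserving order, and then converts each pile to letters. The helper names are hypothetical.

```python
DECK = ("3H 8D QH 4C 6S QS 8C 4D 9S 5D 8S QC 3C 6H JS 7H 5S TS QD 7C 4H JC KD TH 3S KS "
        "6D 9C 9D 2C JD 2H 2D 6C 8H KC 9H JH 7S KH AS AH 5C AD TC 7D 4S 3D 2S TD 5H AC").split()
RANKS = "A23456789TJQK"

def split_deck(deck):
    """Deal card-for-card into a red pile and a black pile, keeping the order."""
    red = [card for card in deck if card[1] in "HD"]
    black = [card for card in deck if card[1] in "CS"]
    return red, black

def to_letters(pile):
    """Hearts/Spades map to A-M, Clubs/Diamonds to N-Z."""
    return "".join(
        chr(ord("A") + RANKS.index(card[0]) + (0 if card[1] in "HS" else 13))
        for card in pile
    )

red, black = split_deck(DECK)
print(to_letters(red))    # CULQRFGYDZJSVXBOHIKMANTPWE
print(to_letters(black))  # QFLUIHYPKEJTXCMVOSZGARWDBN
```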

Keying with a Key Phrase

Already knowing the algorithm prepares you for using a key phrase to key the deck. Basically, you'll just use the characters in your key phrase as the plaintext message, using the black pile to find each key phrase character, just as you would if encrypting a message. Both piles will be permuted, as normal. The only difference is that you will not be recording the ciphertext characters. Further, you will start with alphabetized piles.

Both piles will start with the following order:

            +                                      *
 left (ct): AH 2H 3H 4H 5H 6H 7H 8H 9H TH JH QH KH AD 2D 3D 4D 5D 6D 7D 8D 9D TD JD QD KD
right (pt): AS 2S 3S 4S 5S 6S 7S 8S 9S TS JS QS KS AC 2C 3C 4C 5C 6C 7C 8C 9C TC JC QC KC

Suppose our key phrase is "CHAOCIPHER". Then, working through the steps character for character, the process runs as follows:

Locate "C" in the black pile:
            +                                      *
 left (ct): AH 2H 3H 4H 5H 6H 7H 8H 9H TH JH QH KH AD 2D 3D 4D 5D 6D 7D 8D 9D TD JD QD KD
right (pt): AS 2S 3S 4S 5S 6S 7S 8S 9S TS JS QS KS AC 2C 3C 4C 5C 6C 7C 8C 9C TC JC QC KC

Bring both characters to zenith:
            +                                      *
 left (ct): 3H 4H 5H 6H 7H 8H 9H TH JH QH KH AD 2D 3D 4D 5D 6D 7D 8D 9D TD JD QD KD AH 2H
right (pt): 3S 4S 5S 6S 7S 8S 9S TS JS QS KS AC 2C 3C 4C 5C 6C 7C 8C 9C TC JC QC KC AS 2S

Permute the red pile. Remove the zenith + 1 card:
            +                                      *
 left (ct): 3H .. 5H 6H 7H 8H 9H TH JH QH KH AD 2D 3D 4D 5D 6D 7D 8D 9D TD JD QD KD AH 2H
right (pt): 3S 4S 5S 6S 7S 8S 9S TS JS QS KS AC 2C 3C 4C 5C 6C 7C 8C 9C TC JC QC KC AS 2S

            +                                      *
 left (ct): 3H 5H 6H 7H 8H 9H TH JH QH KH AD 2D 3D .. 4D 5D 6D 7D 8D 9D TD JD QD KD AH 2H
right (pt): 3S 4S 5S 6S 7S 8S 9S TS JS QS KS AC 2C 3C 4C 5C 6C 7C 8C 9C TC JC QC KC AS 2S

Insert the card into the nadir:
            +                                      *
 left (ct): 3H 5H 6H 7H 8H 9H TH JH QH KH AD 2D 3D 4H 4D 5D 6D 7D 8D 9D TD JD QD KD AH 2H
right (pt): 3S 4S 5S 6S 7S 8S 9S TS JS QS KS AC 2C 3C 4C 5C 6C 7C 8C 9C TC JC QC KC AS 2S

Permute the black pile. Move the top card to the bottom:
            +                                      *
 left (ct): 3H 5H 6H 7H 8H 9H TH JH QH KH AD 2D 3D 4H 4D 5D 6D 7D 8D 9D TD JD QD KD AH 2H
right (pt): 4S 5S 6S 7S 8S 9S TS JS QS KS AC 2C 3C 4C 5C 6C 7C 8C 9C TC JC QC KC AS 2S 3S

Remove the zenith + 2 card:
            +                                      *
 left (ct): 3H 5H 6H 7H 8H 9H TH JH QH KH AD 2D 3D 4H 4D 5D 6D 7D 8D 9D TD JD QD KD AH 2H
right (pt): 4S 5S .. 7S 8S 9S TS JS QS KS AC 2C 3C 4C 5C 6C 7C 8C 9C TC JC QC KC AS 2S 3S

            +                                      *
 left (ct): 3H 5H 6H 7H 8H 9H TH JH QH KH AD 2D 3D 4H 4D 5D 6D 7D 8D 9D TD JD QD KD AH 2H
right (pt): 4S 5S 7S 8S 9S TS JS QS KS AC 2C 3C 4C .. 5C 6C 7C 8C 9C TC JC QC KC AS 2S 3S

Insert the card into the nadir:
            +                                      *
 left (ct): 3H 5H 6H 7H 8H 9H TH JH QH KH AD 2D 3D 4H 4D 5D 6D 7D 8D 9D TD JD QD KD AH 2H
right (pt): 4S 5S 7S 8S 9S TS JS QS KS AC 2C 3C 4C 6S 5C 6C 7C 8C 9C TC JC QC KC AS 2S 3S

Repeat for "H", "A", "O", "C", "I", "P", "H", "E", & "R".

When you are finished keying the deck with the key phrase "CHAOCIPHER", you should have the following order for the red and black piles:

            +                                      *
 left (ct): 6D JH 7D 8D 9D 2D KH JD QD 2H 3H 6H 7H 8H 9H TH QH AD KD AH TD 3D 4H 5H 4D 5D
right (pt): 4S AS 3S JS 5S 7S 9S QS AC 2C 3C 4C 6S 2S 6C 8S 7C 8C KS TS 9C TC JC QC KC 5C
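If you want to check your hand-keyed piles, the keying procedure can be sketched in a few lines. It works on letters internally and converts to cards at the end; `key_piles` and `to_cards` are names invented for this example.

```python
import string

RANKS = "A23456789TJQK"

def key_piles(phrase):
    """Key two alphabetized piles with a phrase; the ciphertext is discarded."""
    left = list(string.ascii_uppercase)    # red pile, ciphertext
    right = list(string.ascii_uppercase)   # black pile, plaintext
    for ch in phrase:
        pos = right.index(ch)
        left = left[pos:] + left[:pos]          # bring both characters to the zenith
        right = right[pos:] + right[:pos]
        left.insert(13, left.pop(1))            # permute the red pile
        right = right[1:] + right[:1]           # permute the black pile
        right.insert(13, right.pop(2))
    return "".join(left), "".join(right)

def to_cards(letters, suits):
    """suits is 'HD' for the red pile and 'SC' for the black pile."""
    cards = []
    for letter in letters:
        index = ord(letter) - ord("A")
        cards.append(RANKS[index % 13] + suits[index // 13])
    return " ".join(cards)

left, right = key_piles("CHAOCIPHER")
print(to_cards(left, "HD"))
# 6D JH 7D 8D 9D 2D KH JD QD 2H 3H 6H 7H 8H 9H TH QH AD KD AH TD 3D 4H 5H 4D 5D
print(to_cards(right, "SC"))
# 4S AS 3S JS 5S 7S 9S QS AC 2C 3C 4C 6S 2S 6C 8S 7C 8C KS TS 9C TC JC QC KC 5C
```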


Initialization Vectors

One thing that we have learned with modern computer encryption primitives is to prepend initialization vectors to the ciphertext. The initialization vector must be random and unpredictable. Its function is to create a unique state on the system before the plaintext is encrypted or before the ciphertext is decrypted. The point is to modify a secret state (our key, or pile orders) while ignoring the output. By adding an initialization vector to the system, we limit the effectiveness of attacks on the ciphertext. For example, if the initialization vector is 26 characters long (one character for each character in the English alphabet, or 26! total combinations), then an attacker would need 25 collisions on one initialization vector to launch an attack on the state (the last element can be determined by process of elimination).

Unfortunately, a 26-character initialization vector is not very practical to use by hand. Knowing that it is standard for field agents to break up their messages into blocks of five characters, it would seem reasonable to use a 5-character initialization vector. However, this doesn't seem to mix the state well enough.

For example, consider using an unkeyed deck to encrypt the text "AARON" 10 times with different initialization vectors at each round:


The first five characters in each ciphertext are the initialization vector, randomly generated. The second block of 5 characters is my name encrypted after the initialization vector keyed the deck. Notice that the first character in the second block seems to have a lot of "V"s and "W"s. If I do 100 rounds, and count the frequency of the first character, I get the following:

F:1, L:1, G:3, K:3, T:4, I:7, J:7, U:7, H:8, W:14, V:45

That is not a good distribution of characters for the first plaintext character being an "A" over 100 different initialization vectors. I would expect it to be much more diffuse. So, how about instead of a 5-character initialization vector, we bump it to 10? How does the frequency distribution look then?

A:1, I:1, U:1, V:1, X:1, Z:1, T:2, E:3, G:3, N:5, O:5, Q:5, B:6, C:6, F:7, D:10, S:11, R:13, P:18

That's a little bit better. A 26-character initialization vector would certainly show a flatter frequency distribution for the first ciphertext character in the message. However, as mentioned, that's cumbersome. So, at this point, it's up to you. Using a 5-character initialization vector would provide about 10 or 11 possible first ciphertext characters. Using a 10-character initialization vector increases that to about 18 with a flatter distribution.
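An experiment along these lines can be sketched as follows. I'm assuming the initialization vector keys an unkeyed (alphabetized) deck exactly like a key phrase, with its output discarded, and that the vectors are drawn uniformly at random; the exact counts will differ from run to run and from the figures above. The function names are hypothetical.

```python
import random
import string
from collections import Counter

ALPHABET = string.ascii_uppercase

def step(left, right, ch):
    """Encrypt one letter and permute both alphabets."""
    pos = right.index(ch)
    out = left[pos]
    left = left[pos:] + left[:pos]
    right = right[pos:] + right[:pos]
    left.insert(13, left.pop(1))
    right = right[1:] + right[:1]
    right.insert(13, right.pop(2))
    return out, left, right

def first_letter_counts(iv_length, rounds=100, seed=None):
    """Count which ciphertext letter 'A' becomes after a random IV."""
    rng = random.Random(seed)   # fine for a simulation, not for real keys
    counts = Counter()
    for _ in range(rounds):
        left, right = list(ALPHABET), list(ALPHABET)
        for ch in rng.choices(ALPHABET, k=iv_length):
            _, left, right = step(left, right, ch)
        first, _, _ = step(left, right, "A")
        counts[first] += 1
    return counts

for length in (5, 10, 26):
    counts = first_letter_counts(length)
    print(length, len(counts), counts.most_common(3))
```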

PKCS#7 Padding

As mentioned, it has become a field cipher standard to separate your ciphertext into blocks of 5 characters. This means that if your message is not a multiple of 5 characters, you add padding at the end until it is. However, when the recipient decrypts the message, it should be unambiguous exactly what is padding, and what is not. The padding in PKCS#7 makes this possible.

We can define this easily enough by determining exactly how many characters must be added to pad the message into multiples of 5 characters. So, we'll count:

  • If the message needs only one character appended, append a single "V".
  • If the message needs two characters appended, append "WW".
  • If the message needs three characters appended, append "XXX".
  • If the message needs four characters appended, append "YYYY".
  • If the message is already a multiple of five characters, append "ZZZZZ".

By using the padding described above, after decrypting the message, the recipient needs to only look at the last character to determine exactly how many characters make up the padding, and to strip from the plaintext.
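The padding rule above translates directly into code. A minimal sketch, with `pad` and `unpad` as made-up names:

```python
PAD_CHARS = "VWXYZ"   # V = 1 character of padding, W = 2, ..., Z = 5

def pad(message, block=5):
    """Append 1 to 5 padding letters so the length is a multiple of 5."""
    needed = block - len(message) % block      # always between 1 and 5
    return message + PAD_CHARS[needed - 1] * needed

def unpad(message):
    """Read the last letter to learn how much padding to strip."""
    count = PAD_CHARS.index(message[-1]) + 1
    if message[-count:] != message[-1] * count:
        raise ValueError("malformed padding")
    return message[:-count]

print(pad("ATTACKATDAWN"))       # ATTACKATDAWNXXX
print(pad("HELLO"))              # HELLOZZZZZ
print(unpad("ATTACKATDAWNXXX"))  # ATTACKATDAWN
```

Note that a full block of "Z"s is added even when the message already fits, which is what keeps the stripping unambiguous.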

To illustrate this, let's take an unkeyed deck, add a 5-character initialization vector, and encrypt the message "ATTACK AT DAWN". This message is only 12 characters, so I would need to add "XXX" at the end of the message according to the definition above. Thus my message becomes (removing spaces) "ATTACKATDAWNXXX". Adding the 5-character initialization vector "KEQPN" then encrypting, I get the following result:

plaintext: ATTACK AT DAWN
initialization vector: KEQPN
padding: XXX


Of course, decrypting "KEQPN XLHTT PRUCA FHUEC" and removing the initialization vector "KEQPN" will reveal "ATTACKATDAWNXXX". It's clear to the recipient that "XXX" is padding, and can be stripped without affecting the plaintext.
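The whole round trip can be sketched in Python. I'm assuming the initialization vector keys the alphabetized piles like a key phrase (output discarded) and is then sent in the clear as the first block; working it through by hand, that reading does reproduce the ciphertext above. All function names here are hypothetical.

```python
import string

ALPHABET = string.ascii_uppercase
PAD_CHARS = "VWXYZ"

def step(left, right, ch, decrypt=False):
    """One Chaocipher step on letter lists: left = red/ct, right = black/pt."""
    pos = (left if decrypt else right).index(ch)
    out = (right if decrypt else left)[pos]
    left = left[pos:] + left[:pos]
    right = right[pos:] + right[:pos]
    left.insert(13, left.pop(1))
    right = right[1:] + right[:1]
    right.insert(13, right.pop(2))
    return out, left, right

def run(iv, text, decrypt=False):
    left, right = list(ALPHABET), list(ALPHABET)   # unkeyed deck
    for ch in iv:                                  # the IV only stirs the state
        _, left, right = step(left, right, ch)
    out = []
    for ch in text:
        result, left, right = step(left, right, ch, decrypt)
        out.append(result)
    return "".join(out)

def encrypt(message, iv):
    text = "".join(c for c in message.upper() if c in ALPHABET)
    needed = 5 - len(text) % 5
    text += PAD_CHARS[needed - 1] * needed
    letters = iv + run(iv, text)
    return " ".join(letters[i:i + 5] for i in range(0, len(letters), 5))

def decrypt(ciphertext, iv_length=5):
    letters = ciphertext.replace(" ", "")
    text = run(letters[:iv_length], letters[iv_length:], decrypt=True)
    return text[:-(PAD_CHARS.index(text[-1]) + 1)]   # strip the padding

print(encrypt("ATTACK AT DAWN", "KEQPN"))    # KEQPN XLHTT PRUCA FHUEC
print(decrypt("KEQPN XLHTT PRUCA FHUEC"))    # ATTACKATDAWN
```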


This has been a lengthy post, and I commend you for reading this far. The Chaocipher is an interesting algorithm, and I'll be studying its properties as time moves forward. I think the Chaocipher fits well as a playing card cipher, and gets as close to "bare metal" as you can without designing an actual mechanical mechanism with two rotating disks and removable character tiles. Playing cards are easy to carry around with you in your pocket, so its portability is nice.

Further, we can increase the strength of the algorithm, as mentioned, by adding an initialization vector at the start of the message, and by adding padding, we can stick with the standard of 5-character blocks in our ciphertext. Of course, this means adding 6-10 additional characters, but for a 160-character message, this doesn't seem too cumbersome.

There are some things that I have observed while using playing cards for the cipher hardware. First, encrypting and decrypting are slow. It takes me about a minute to encrypt or decrypt two to three characters. So, for a 160-character message, it could take the better part of an hour to work through.

Second, due to its slow speed, you may be tempted to speed things up a bit, so you can work through the message more quickly. However, this drastically opens you up to mistakes. I was encrypting the plaintext "JELLY LIKE ABOVE THE HIGHWIRE SIX QUACKING PACHYDERMS KEPT THE CLIMAX OF THE EXTRAVAGANZA IN A DAZZLING STATE OF FLUX" over and over. After about 10 ciphertexts, I wrote a Python script to automate the process for me. Only 1 of the ciphertexts was 100% accurate, character-for-character. And, unfortunately, 1 ciphertext was 0% accurate, with every character in the message incorrect. However, on the other 8 messages, I seemed to maintain accuracy for about 2/3 of the characters on most messages. On some others, I made a mistake earlier on. Regardless, the point is, I was making frequent mistakes, despite my best effort to not do so. Only 1 out of 10 ciphertexts would decrypt cleanly. It might be worth having two decks, one for producing the ciphertext character, and one for double-checking your work. Of course, this slows you down further, but could be doable for minimizing mistakes.

However, the Chaocipher with playing cards is a fun cipher to work, and easy once you get the hang of it. I would recommend using plastic playing cards, such as the ones from Kem, Copag, or Bicycle Prestige. This way, the cards don't get gummed up like paper cards, are washable, last longer due to their extra durability, and overall, just are a better feeling playing card.

If you work the Chaocipher with playing cards, let me know what you think.

The Kidekin TRNG Hardware Random Number Generator

Yesterday, I received my Kidekin TRNG hardware random number generator. I was eager to purchase this, because on the Tindie website, the first 2 people to purchase the RNG would get $50 off, making the device $30 total. I quickly ordered one. Hilariously enough, I received a letter from the supplier that I was their first customer! Hah!

Image of the Kidekin Digital TRNG

Upon opening the package, I noticed the size of the TRNG. It's roughly 10.5 cm from end-to-end which makes it somewhat awkward for a device sitting in your USB port on your laptop. It would work fine sitting in the back of a desktop or server, out of the way, but on my Thinkpad T61, it's a bit large to be sitting there 24/7 feeding my kernel CSPRNG.

Plugging the device in, the kernel actually sees two USB devices, not just one, and sets them up as /dev/ttyUSB0 and /dev/ttyUSB1. Curious. Downloading the software ZIP file from their webpage, and looking through it, the following UDEV rules are provided:

$ cat /etc/udev/rules.d/98-kidekin.rules 
#SYMLINK+= method works on more systems, if it does not on your system, please switch to the NAME= method.

#disable the unused port.
#SUBSYSTEM=="tty", ATTRS{interface}=="kidekin_trng", ATTRS{bInterfaceNumber}=="00", NAME="kidekin_dont_use", MODE="0000", ENV{ID_MM_DEVICE_IGNORE}="1", ENV{ID_MM_CANDIDATE}="0"
SUBSYSTEM=="tty", ATTRS{interface}=="kidekin_trng", ATTRS{bInterfaceNumber}=="00", SYMLINK+="kidekin_dont_use", MODE="0000", ENV{ID_MM_DEVICE_IGNORE}="1", ENV{ID_MM_CANDIDATE}="0"

#connect kidekin TRNG to /dev/random
#SUBSYSTEM=="tty", ATTRS{interface}=="kidekin_trng", ATTRS{bInterfaceNumber}=="01", NAME="kidekin_trng", MODE="0777", RUN+="/bin/stty raw -echo -crtscts -F /dev/kidekin_trng speed 3000000", ENV{ID_MM_DEVICE_IGNORE}="1", ENV{ID_MM_CANDIDATE}="0"
SUBSYSTEM=="tty", ATTRS{interface}=="kidekin_trng", ATTRS{bInterfaceNumber}=="01", SYMLINK+="kidekin_trng", MODE="0777", RUN+="/bin/stty raw -echo -crtscts -F /dev/kidekin_trng speed 3000000", ENV{ID_MM_DEVICE_IGNORE}="1", ENV{ID_MM_CANDIDATE}="0"
SUBSYSTEM=="tty", ATTRS{interface}=="kidekin_trng", ATTRS{bInterfaceNumber}=="01", RUN+="/etc/init.d/rng-tools restart"

This is a bit presumptuous, and a bit overdoing it IMO, so I simplified it, and set up the following:

SUBSYSTEM=="tty", ATTRS{interface}=="kidekin_trng", ATTRS{bInterfaceNumber}=="01", SYMLINK+="kidekin", MODE="0777", RUN+="/bin/stty raw -echo -crtscts -F /dev/kidekin speed 3000000", ENV{ID_MM_DEVICE_IGNORE}="1", ENV{ID_MM_CANDIDATE}="0"

This avoids setting up a "do not use" symlink for the unnecessary USB device, and changes the symlink of the usable USB device to /dev/kidekin. This also doesn't restart rngd(8), as I'll administer that on my own. At this point, I am ready for testing.

First and foremost, I wanted to test its throughput:

$ dd if=/dev/kidekin count=1G | pv -a > /dev/null
[ 282KiB/s]

The device held stable at 282 KBps, or roughly 2.2 Mbps. This is 75.2 Kbps (about 9.4 KBps) per dollar for my $30 purchase. Not bad.
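For anyone checking the units: the 75.2 figure works out in kilobits per second per dollar, while in kilobytes it is 9.4. A quick sketch of the arithmetic, treating pv's KiB/s as the base figure:

```python
rate_kib_s = 282    # KiB/s as reported by pv
price_usd = 30      # what I paid for the device

rate_kbit_s = rate_kib_s * 8        # kilobits per second
rate_mbit_s = rate_kbit_s / 1024    # megabits per second

print(rate_kbit_s)                         # 2256
print(round(rate_mbit_s, 1))               # 2.2
print(round(rate_kbit_s / price_usd, 1))   # 75.2
print(round(rate_kib_s / price_usd, 1))    # 9.4
```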

The Kidekin is based on astable free running oscillators, or multivibrators. Unfortunately, a security proof does not accompany the device. So, while this may hold up to the suite of randomness tests, the output may not be cryptographically secure, and could also potentially be backdoored, as verifying the hardware is not easily doable. So, let's see if it at least holds up to the randomness tests. I created a 256 MB file, and ran the standard suites of tests:

$ dd if=/dev/kidekin of=entropy.kidekin bs=1M count=256 iflag=fullblock
256+0 records in
256+0 records out
268435456 bytes (268 MB) copied, 928.326 s, 289 kB/s

At this point, I can start my testing. First, let's quantify the amount of entropy per byte, as well as some basic tests with ent(1):

$ ent entropy.kidekin
Entropy = 7.999999 bits per byte.

Optimum compression would reduce the size
of this 268435456 byte file by 0 percent.

Chi square distribution for 268435456 samples is 248.92, and randomly
would exceed this value 59.56 percent of the times.

Arithmetic mean value of data bytes is 127.4924 (127.5 = random).
Monte Carlo value for Pi is 3.141825693 (error 0.01 percent).
Serial correlation coefficient is -0.000003 (totally uncorrelated = 0.0).
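The "bits per byte" figure on the first line of that report is the Shannon entropy of the byte frequencies. As a rough sketch of the idea (this is not ent's own code, and `entropy_bits_per_byte` is a name I made up), it can be computed like so:

```python
import math
import os
from collections import Counter

def entropy_bits_per_byte(data):
    """Shannon entropy of a bytes object, in bits per byte (0.0 to 8.0)."""
    counts = Counter(data)
    total = len(data)
    return abs(-sum(n / total * math.log2(n / total) for n in counts.values()))

print(entropy_bits_per_byte(bytes(range(256))))   # 8.0, every byte value once
print(entropy_bits_per_byte(b"aaaaaaaa"))         # 0.0, a single repeated value
print(entropy_bits_per_byte(os.urandom(1 << 16))) # close to 8 for the OS generator
```

For a 256 MB file you would want to feed the counter in chunks rather than read everything into memory at once. Keep in mind that a high score here only says the byte frequencies are flat; a simple counter cycling through 0-255 would also score 8.0.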

Everything good so far. How about the FIPS 140-2 tests for randomness:

$ rngtest < entropy.kidekin
rngtest 2-unofficial-mt.14
Copyright (c) 2004 by Henrique de Moraes Holschuh
This is free software; see the source for copying conditions.  There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

rngtest: starting FIPS tests...
rngtest: entropy source exhausted!
rngtest: bits received from input: 2147483648
rngtest: FIPS 140-2 successes: 107292
rngtest: FIPS 140-2 failures: 82
rngtest: FIPS 140-2(2001-10-10) Monobit: 14
rngtest: FIPS 140-2(2001-10-10) Poker: 13
rngtest: FIPS 140-2(2001-10-10) Runs: 26
rngtest: FIPS 140-2(2001-10-10) Long run: 30
rngtest: FIPS 140-2(2001-10-10) Continuous run: 0
rngtest: input channel speed: (min=317.891; avg=7386.982; max=19073.486)Mibits/s
rngtest: FIPS tests speed: (min=6.563; avg=109.376; max=114.901)Mibits/s
rngtest: Program run time: 19261018 microseconds
$ echo $?

Again, so far so good. Some failures are expected with random input of this size. 82 failures versus 107292 successes (a failure rate of about 0.08%) is right on par with the tests. Now the Dieharder battery of tests:

$ dieharder -a < entropy.kidekin
#            dieharder version 3.31.1 Copyright 2003 Robert G. Brown          #
   rng_name    |rands/second|   Seed   |
        mt19937|  8.99e+07  | 722892634|
        test_name   |ntup| tsamples |psamples|  p-value |Assessment
   diehard_birthdays|   0|       100|     100|0.87388974|  PASSED  
      diehard_operm5|   0|   1000000|     100|0.25081726|  PASSED  
  diehard_rank_32x32|   0|     40000|     100|0.80329585|  PASSED  
    diehard_rank_6x8|   0|    100000|     100|0.87234234|  PASSED  
   diehard_bitstream|   0|   2097152|     100|0.27873738|  PASSED  
        diehard_opso|   0|   2097152|     100|0.05958924|  PASSED  
        diehard_oqso|   0|   2097152|     100|0.10540020|  PASSED  
         diehard_dna|   0|   2097152|     100|0.30006047|  PASSED  
diehard_count_1s_str|   0|    256000|     100|0.43809130|  PASSED  
diehard_count_1s_byt|   0|    256000|     100|0.29758303|  PASSED  
 diehard_parking_lot|   0|     12000|     100|0.78081639|  PASSED  
    diehard_2dsphere|   2|      8000|     100|0.58294587|  PASSED  
    diehard_3dsphere|   3|      4000|     100|0.04012616|  PASSED  
     diehard_squeeze|   0|    100000|     100|0.97651988|  PASSED  
        diehard_sums|   0|       100|     100|0.01875349|  PASSED  
        diehard_runs|   0|    100000|     100|0.17566659|  PASSED  
        diehard_runs|   0|    100000|     100|0.78887310|  PASSED  
       diehard_craps|   0|    200000|     100|0.16369886|  PASSED  
       diehard_craps|   0|    200000|     100|0.42148915|  PASSED  
 marsaglia_tsang_gcd|   0|  10000000|     100|0.27534860|  PASSED  
 marsaglia_tsang_gcd|   0|  10000000|     100|0.45190499|  PASSED  
         sts_monobit|   1|    100000|     100|0.88204376|  PASSED  
            sts_runs|   2|    100000|     100|0.15277754|  PASSED  
          sts_serial|   1|    100000|     100|0.71489026|  PASSED  
          sts_serial|   2|    100000|     100|0.85005457|  PASSED  
          sts_serial|   3|    100000|     100|0.77631916|  PASSED  
          sts_serial|   3|    100000|     100|0.81111751|  PASSED  
          sts_serial|   4|    100000|     100|0.72512842|  PASSED  
          sts_serial|   4|    100000|     100|0.68758000|  PASSED  
          sts_serial|   5|    100000|     100|0.69083583|  PASSED  
          sts_serial|   5|    100000|     100|0.09706031|  PASSED  
          sts_serial|   6|    100000|     100|0.52758972|  PASSED  
          sts_serial|   6|    100000|     100|0.27970465|  PASSED  
          sts_serial|   7|    100000|     100|0.07925569|  PASSED  
          sts_serial|   7|    100000|     100|0.25874891|  PASSED  
          sts_serial|   8|    100000|     100|0.33647659|  PASSED  
          sts_serial|   8|    100000|     100|0.80952471|  PASSED  
          sts_serial|   9|    100000|     100|0.99948911|   WEAK   
          sts_serial|   9|    100000|     100|0.32461849|  PASSED  
          sts_serial|  10|    100000|     100|0.69360795|  PASSED  
          sts_serial|  10|    100000|     100|0.96022345|  PASSED  
          sts_serial|  11|    100000|     100|0.91349333|  PASSED  
          sts_serial|  11|    100000|     100|0.95918606|  PASSED  
          sts_serial|  12|    100000|     100|0.69821905|  PASSED  
          sts_serial|  12|    100000|     100|0.57652285|  PASSED  
          sts_serial|  13|    100000|     100|0.28393582|  PASSED  
          sts_serial|  13|    100000|     100|0.45849491|  PASSED  
          sts_serial|  14|    100000|     100|0.30832853|  PASSED  
          sts_serial|  14|    100000|     100|0.89099315|  PASSED  
          sts_serial|  15|    100000|     100|0.87022105|  PASSED  
          sts_serial|  15|    100000|     100|0.06938123|  PASSED  
          sts_serial|  16|    100000|     100|0.79568629|  PASSED  
          sts_serial|  16|    100000|     100|0.53218489|  PASSED  
         rgb_bitdist|   1|    100000|     100|0.38552808|  PASSED  
         rgb_bitdist|   2|    100000|     100|0.79403454|  PASSED  
         rgb_bitdist|   3|    100000|     100|0.66811643|  PASSED  
         rgb_bitdist|   4|    100000|     100|0.84954470|  PASSED  
         rgb_bitdist|   5|    100000|     100|0.90198903|  PASSED  
         rgb_bitdist|   6|    100000|     100|0.98808244|  PASSED  
         rgb_bitdist|   7|    100000|     100|0.25730860|  PASSED  
         rgb_bitdist|   8|    100000|     100|0.43237015|  PASSED  
         rgb_bitdist|   9|    100000|     100|0.90916135|  PASSED  
         rgb_bitdist|  10|    100000|     100|0.81131338|  PASSED  
         rgb_bitdist|  11|    100000|     100|0.31361128|  PASSED  
         rgb_bitdist|  12|    100000|     100|0.40786889|  PASSED  
rgb_minimum_distance|   2|     10000|    1000|0.03358258|  PASSED  
rgb_minimum_distance|   3|     10000|    1000|0.99298827|  PASSED  
rgb_minimum_distance|   4|     10000|    1000|0.47721533|  PASSED  
rgb_minimum_distance|   5|     10000|    1000|0.86641982|  PASSED  
    rgb_permutations|   2|    100000|     100|0.10084049|  PASSED  
    rgb_permutations|   3|    100000|     100|0.99560585|   WEAK   
    rgb_permutations|   4|    100000|     100|0.42217190|  PASSED  
    rgb_permutations|   5|    100000|     100|0.95466090|  PASSED  
      rgb_lagged_sum|   0|   1000000|     100|0.64120688|  PASSED  
      rgb_lagged_sum|   1|   1000000|     100|0.22106106|  PASSED  
      rgb_lagged_sum|   2|   1000000|     100|0.41244281|  PASSED  
      rgb_lagged_sum|   3|   1000000|     100|0.98880097|  PASSED  
      rgb_lagged_sum|   4|   1000000|     100|0.78380177|  PASSED  
      rgb_lagged_sum|   5|   1000000|     100|0.25533777|  PASSED  
      rgb_lagged_sum|   6|   1000000|     100|0.78150371|  PASSED  
      rgb_lagged_sum|   7|   1000000|     100|0.53903267|  PASSED  
      rgb_lagged_sum|   8|   1000000|     100|0.04436257|  PASSED  
      rgb_lagged_sum|   9|   1000000|     100|0.77174302|  PASSED  
      rgb_lagged_sum|  10|   1000000|     100|0.54862612|  PASSED  
      rgb_lagged_sum|  11|   1000000|     100|0.48691334|  PASSED  
      rgb_lagged_sum|  12|   1000000|     100|0.06308057|  PASSED  
      rgb_lagged_sum|  13|   1000000|     100|0.42530804|  PASSED  
      rgb_lagged_sum|  14|   1000000|     100|0.86907366|  PASSED  
      rgb_lagged_sum|  15|   1000000|     100|0.66262930|  PASSED  
      rgb_lagged_sum|  16|   1000000|     100|0.85485044|  PASSED  
      rgb_lagged_sum|  17|   1000000|     100|0.39817394|  PASSED  
      rgb_lagged_sum|  18|   1000000|     100|0.90608610|  PASSED  
      rgb_lagged_sum|  19|   1000000|     100|0.94996515|  PASSED  
      rgb_lagged_sum|  20|   1000000|     100|0.78715690|  PASSED  
      rgb_lagged_sum|  21|   1000000|     100|0.93364519|  PASSED  
      rgb_lagged_sum|  22|   1000000|     100|0.84438533|  PASSED  
      rgb_lagged_sum|  23|   1000000|     100|0.77439531|  PASSED  
      rgb_lagged_sum|  24|   1000000|     100|0.12530311|  PASSED  
      rgb_lagged_sum|  25|   1000000|     100|0.79035917|  PASSED  
      rgb_lagged_sum|  26|   1000000|     100|0.93286961|  PASSED  
      rgb_lagged_sum|  27|   1000000|     100|0.32567247|  PASSED  
      rgb_lagged_sum|  28|   1000000|     100|0.39563718|  PASSED  
      rgb_lagged_sum|  29|   1000000|     100|0.15628693|  PASSED  
      rgb_lagged_sum|  30|   1000000|     100|0.69368810|  PASSED  
      rgb_lagged_sum|  31|   1000000|     100|0.00197963|   WEAK   
      rgb_lagged_sum|  32|   1000000|     100|0.23325783|  PASSED  
     rgb_kstest_test|   0|     10000|    1000|0.18940877|  PASSED  
     dab_bytedistrib|   0|  51200000|       1|0.57007834|  PASSED  
             dab_dct| 256|     50000|       1|0.76567665|  PASSED  
Preparing to run test 207.  ntuple = 0
        dab_filltree|  32|  15000000|       1|0.60537852|  PASSED  
        dab_filltree|  32|  15000000|       1|0.78894908|  PASSED  
Preparing to run test 208.  ntuple = 0
       dab_filltree2|   0|   5000000|       1|0.11775507|  PASSED  
       dab_filltree2|   1|   5000000|       1|0.34799105|  PASSED  
Preparing to run test 209.  ntuple = 0
        dab_monobit2|  12|  65000000|       1|0.69182598|  PASSED  

Finally, a visual check on the data, even though it's safe to assume that it's "true random" given the previous testing:

$ dd if=white.bmp of=entropy.kidekin bs=1 count=54 conv=notrunc
54+0 records in
54+0 records out
54 bytes (54 B) copied, 0.000547208 s, 98.7 kB/s
$ gimp entropy.kidekin # convert to grayscale, export as "entropy.png"
$ optipng entropy.png
** Processing: entropy.png
512x512 pixels, 8 bits/pixel, grayscale
Input IDAT size = 250107 bytes
Input file size = 250564 bytes

  zc = 9  zm = 8  zs = 0  f = 0		IDAT size = 215319
  zc = 9  zm = 8  zs = 1  f = 0		IDAT size = 214467
  zc = 1  zm = 8  zs = 2  f = 0		IDAT size = 214467
  zc = 9  zm = 8  zs = 3  f = 0		IDAT size = 214467
Selecting parameters:
  zc = 1  zm = 8  zs = 2  f = 0		IDAT size = 214467

Output IDAT size = 214467 bytes (35640 bytes decrease)
Output file size = 214564 bytes (36000 bytes = 14.37% decrease)

And the result is:

RNG visual output of the Kidekin TRNG

My conclusion of the Kidekin TRNG is positive. I love the throughput of the device, love the price, and aside from the UDEV rule, it is plug-and-play. Unfortunately, the TRNG is a bit on the big side for a physical device, and because it doesn't come with a security proof, and the hardware design is closed, I would be skeptical of trusting it for your random numbers directly. Instead, I would recommend adding it to the Linux kernel's CSPRNG, and relying on /dev/urandom. This is trivial with rngd(8). But, overall, I am very pleased with the device, and wish I had actually purchased a second one.

Additional Testing Of The rtl-sdr Dongle As A HWRNG

A couple days ago, I put up a post about using the Realtek SDR dongles as a hardware true random number generator. I only tested the randomness of a 512 MB file. I thought this time, I would put a bit more stock into it. In this case, I let it run for a while, until it was 1.8 GB in size. Interestingly enough, it stopped getting bigger after that point. Not sure why. However, I ran more tests on that 1.8 GB file. Creating this file took its time:

$ tail -f /run/rtl_entropy.fifo | dd of=random.img iflag=fullblock
3554130+0 records in
3554130+0 records out
1819714560 bytes (1.8 GB) copied, 3897.22 s, 467 kB/s

This filled up a bit faster than I had previously tested, going at a clip of about 3.7 Mbps (dd reports kB in units of 1,000 bytes, so 467 kB/s is roughly 3.74 million bits per second).
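
The rate can be recomputed directly from the byte count and elapsed time in the dd output. This is a minimal Python sketch; the helper name throughput_mbps is mine, and it assumes decimal units (1 Mbps = 1,000,000 bits per second), which is what GNU dd uses for its kB/s figure. Reading dd's kB as 1,024 bytes instead gives a slightly higher number.

```python
def throughput_mbps(num_bytes: int, seconds: float) -> float:
    """Return the average rate in megabits per second (decimal units)."""
    return num_bytes * 8 / seconds / 1_000_000


# Byte count and elapsed time taken from the dd output above.
rate = throughput_mbps(1819714560, 3897.22)
print(f"{rate:.2f} Mbps")
```

The same bit count, 14557716480, is what rngtest later reports as "bits received from input", which is a handy cross-check that the whole file was read.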

Now it was time for the testing:

$ ent random.img
Entropy = 8.000000 bits per byte.

Optimum compression would reduce the size
of this 1819714560 byte file by 0 percent.

Chi square distribution for 1819714560 samples is 246.86, and randomly
would exceed this value 63.11 percent of the times.

Arithmetic mean value of data bytes is 127.4990 (127.5 = random).
Monte Carlo value for Pi is 3.141611317 (error 0.00 percent).
Serial correlation coefficient is 0.000013 (totally uncorrelated = 0.0).
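
Two of the figures ent reports, the entropy per byte and the arithmetic mean, are simple enough to approximate by hand. The following is a rough Python sketch, not ent's actual implementation; the function name byte_stats is hypothetical, and os.urandom is only a stand-in for reading random.img.

```python
import math
import os
from collections import Counter


def byte_stats(data: bytes):
    """Return (Shannon entropy in bits per byte, arithmetic mean of the byte values)."""
    counts = Counter(data)
    total = len(data)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    mean = sum(data) / total
    return entropy, mean


if __name__ == "__main__":
    # Stand-in data; replace with open("random.img", "rb").read() for a real file.
    sample = os.urandom(1 << 20)
    entropy, mean = byte_stats(sample)
    print(f"Entropy = {entropy:.6f} bits per byte")
    print(f"Arithmetic mean value of data bytes is {mean:.4f} (127.5 = random)")
```

For a multi-gigabyte file you would want to update the counts in chunks rather than read everything into memory at once.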

It passes with flying colors on entropy estimation, compression, chi-square distributions, arithmetic mean, the Monte Carlo estimation for Pi, and serial correlation. Testing further, I ran it through the FIPS 140-2 tests:

$ rngtest < random.img
rngtest 2-unofficial-mt.14
Copyright (c) 2004 by Henrique de Moraes Holschuh
This is free software; see the source for copying conditions.  There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

rngtest: starting FIPS tests...
rngtest: entropy source exhausted!
rngtest: bits received from input: 14557716480
rngtest: FIPS 140-2 successes: 727288
rngtest: FIPS 140-2 failures: 597
rngtest: FIPS 140-2(2001-10-10) Monobit: 99
rngtest: FIPS 140-2(2001-10-10) Poker: 57
rngtest: FIPS 140-2(2001-10-10) Runs: 210
rngtest: FIPS 140-2(2001-10-10) Long run: 233
rngtest: FIPS 140-2(2001-10-10) Continuous run: 0
rngtest: input channel speed: (min=114.212; avg=6626.942; max=9536.743)Mibits/s
rngtest: FIPS tests speed: (min=61.133; avg=147.877; max=151.377)Mibits/s
rngtest: Program run time: 96034230 microseconds
$ echo $?

Finally, the beast of beasts, I ran it through every Dieharder test. This took some time to complete. Here is a listing of the tests that it went through:

$ dieharder -l
#            dieharder version 3.31.1 Copyright 2003 Robert G. Brown          #
Installed dieharder tests:
 Test Number	                     Test Name	              Test Reliability
  -d 0  	                  Diehard Birthdays Test	      Good
  -d 1  	                     Diehard OPERM5 Test	      Good
  -d 2  	          Diehard 32x32 Binary Rank Test	      Good
  -d 3  	            Diehard 6x8 Binary Rank Test	      Good
  -d 4  	                  Diehard Bitstream Test	      Good
  -d 5  	                            Diehard OPSO	   Suspect
  -d 6  	                       Diehard OQSO Test	   Suspect
  -d 7  	                        Diehard DNA Test	   Suspect
  -d 8  	      Diehard Count the 1s (stream) Test	      Good
  -d 9  	        Diehard Count the 1s Test (byte)	      Good
  -d 10  	                Diehard Parking Lot Test	      Good
  -d 11  	Diehard Minimum Distance (2d Circle) Test	      Good
  -d 12  	Diehard 3d Sphere (Minimum Distance) Test	      Good
  -d 13  	                    Diehard Squeeze Test	      Good
  -d 14  	                       Diehard Sums Test	Do Not Use
  -d 15  	                       Diehard Runs Test	      Good
  -d 16  	                      Diehard Craps Test	      Good
  -d 17  	            Marsaglia and Tsang GCD Test	      Good
  -d 100  	                        STS Monobit Test	      Good
  -d 101  	                           STS Runs Test	      Good
  -d 102  	           STS Serial Test (Generalized)	      Good
  -d 200  	               RGB Bit Distribution Test	      Good
  -d 201  	   RGB Generalized Minimum Distance Test	      Good
  -d 202  	                   RGB Permutations Test	      Good
  -d 203  	                     RGB Lagged Sum Test	      Good
  -d 204  	        RGB Kolmogorov-Smirnov Test Test	      Good
  -d 205  	                       Byte Distribution	      Good
  -d 206  	                                 DAB DCT	      Good
  -d 207  	                      DAB Fill Tree Test	      Good
  -d 208  	                    DAB Fill Tree 2 Test	      Good
  -d 209  	                      DAB Monobit 2 Test	      Good

So here are the results:

 $ dieharder -a < random.img
#            dieharder version 3.31.1 Copyright 2003 Robert G. Brown          #
   rng_name    |rands/second|   Seed   |
        mt19937|  1.25e+08  | 169223456|
        test_name   |ntup| tsamples |psamples|  p-value |Assessment
   diehard_birthdays|   0|       100|     100|0.91937112|  PASSED  
      diehard_operm5|   0|   1000000|     100|0.77213572|  PASSED  
  diehard_rank_32x32|   0|     40000|     100|0.04709503|  PASSED  
    diehard_rank_6x8|   0|    100000|     100|0.93031877|  PASSED  
   diehard_bitstream|   0|   2097152|     100|0.12183977|  PASSED  
        diehard_opso|   0|   2097152|     100|0.96023625|  PASSED  
        diehard_oqso|   0|   2097152|     100|0.61237304|  PASSED  
         diehard_dna|   0|   2097152|     100|0.66045974|  PASSED  
diehard_count_1s_str|   0|    256000|     100|0.16999968|  PASSED  
diehard_count_1s_byt|   0|    256000|     100|0.00992823|  PASSED  
 diehard_parking_lot|   0|     12000|     100|0.69592283|  PASSED  
    diehard_2dsphere|   2|      8000|     100|0.95358410|  PASSED  
    diehard_3dsphere|   3|      4000|     100|0.89028448|  PASSED  
     diehard_squeeze|   0|    100000|     100|0.81631204|  PASSED  
        diehard_sums|   0|       100|     100|0.03559934|  PASSED  
        diehard_runs|   0|    100000|     100|0.75027140|  PASSED  
        diehard_runs|   0|    100000|     100|0.43076351|  PASSED  
       diehard_craps|   0|    200000|     100|0.57749359|  PASSED  
       diehard_craps|   0|    200000|     100|0.00599436|  PASSED  
 marsaglia_tsang_gcd|   0|  10000000|     100|0.60121369|  PASSED  
 marsaglia_tsang_gcd|   0|  10000000|     100|0.04254338|  PASSED  
         sts_monobit|   1|    100000|     100|0.94352358|  PASSED  
            sts_runs|   2|    100000|     100|0.77549833|  PASSED  
          sts_serial|   1|    100000|     100|0.46198961|  PASSED  
          sts_serial|   2|    100000|     100|0.46002706|  PASSED  
          sts_serial|   3|    100000|     100|0.73076110|  PASSED  
          sts_serial|   3|    100000|     100|0.90967100|  PASSED  
          sts_serial|   4|    100000|     100|0.32002297|  PASSED  
          sts_serial|   4|    100000|     100|0.07478887|  PASSED  
          sts_serial|   5|    100000|     100|0.27486408|  PASSED  
          sts_serial|   5|    100000|     100|0.57409336|  PASSED  
          sts_serial|   6|    100000|     100|0.05095556|  PASSED  
          sts_serial|   6|    100000|     100|0.06341272|  PASSED  
          sts_serial|   7|    100000|     100|0.00941089|  PASSED  
          sts_serial|   7|    100000|     100|0.53679805|  PASSED  
          sts_serial|   8|    100000|     100|0.00122125|   WEAK   
          sts_serial|   8|    100000|     100|0.16239101|  PASSED  
          sts_serial|   9|    100000|     100|0.24007712|  PASSED  
          sts_serial|   9|    100000|     100|0.02659941|  PASSED  
          sts_serial|  10|    100000|     100|0.64616186|  PASSED  
          sts_serial|  10|    100000|     100|0.78783799|  PASSED  
          sts_serial|  11|    100000|     100|0.77618602|  PASSED  
          sts_serial|  11|    100000|     100|0.33875893|  PASSED  
          sts_serial|  12|    100000|     100|0.50423715|  PASSED  
          sts_serial|  12|    100000|     100|0.77528158|  PASSED  
          sts_serial|  13|    100000|     100|0.57625144|  PASSED  
          sts_serial|  13|    100000|     100|0.73422196|  PASSED  
          sts_serial|  14|    100000|     100|0.40891605|  PASSED  
          sts_serial|  14|    100000|     100|0.48542772|  PASSED  
          sts_serial|  15|    100000|     100|0.67319390|  PASSED  
          sts_serial|  15|    100000|     100|0.74730027|  PASSED  
          sts_serial|  16|    100000|     100|0.67519158|  PASSED  
          sts_serial|  16|    100000|     100|0.73171087|  PASSED  
         rgb_bitdist|   1|    100000|     100|0.87216594|  PASSED  
         rgb_bitdist|   2|    100000|     100|0.18831902|  PASSED  
         rgb_bitdist|   3|    100000|     100|0.16757216|  PASSED  
         rgb_bitdist|   4|    100000|     100|0.05327115|  PASSED  
         rgb_bitdist|   5|    100000|     100|0.75278396|  PASSED  
         rgb_bitdist|   6|    100000|     100|0.64749144|  PASSED  
         rgb_bitdist|   7|    100000|     100|0.20311557|  PASSED  
         rgb_bitdist|   8|    100000|     100|0.39994123|  PASSED  
         rgb_bitdist|   9|    100000|     100|0.52805289|  PASSED  
         rgb_bitdist|  10|    100000|     100|0.96091722|  PASSED  
         rgb_bitdist|  11|    100000|     100|0.97794399|  PASSED  
         rgb_bitdist|  12|    100000|     100|0.75009561|  PASSED  
rgb_minimum_distance|   2|     10000|    1000|0.58923867|  PASSED  
rgb_minimum_distance|   3|     10000|    1000|0.54294743|  PASSED  
rgb_minimum_distance|   4|     10000|    1000|0.59446131|  PASSED  
rgb_minimum_distance|   5|     10000|    1000|0.00047025|   WEAK   
    rgb_permutations|   2|    100000|     100|0.89040191|  PASSED  
    rgb_permutations|   3|    100000|     100|0.47917416|  PASSED  
    rgb_permutations|   4|    100000|     100|0.30964668|  PASSED  
    rgb_permutations|   5|    100000|     100|0.70217495|  PASSED  
      rgb_lagged_sum|   0|   1000000|     100|0.12796648|  PASSED  
      rgb_lagged_sum|   1|   1000000|     100|0.15077254|  PASSED  
      rgb_lagged_sum|   2|   1000000|     100|0.31141471|  PASSED  
      rgb_lagged_sum|   3|   1000000|     100|0.94974697|  PASSED  
      rgb_lagged_sum|   4|   1000000|     100|0.99256987|  PASSED  
      rgb_lagged_sum|   5|   1000000|     100|0.67854004|  PASSED  
      rgb_lagged_sum|   6|   1000000|     100|0.08600877|  PASSED  
      rgb_lagged_sum|   7|   1000000|     100|0.91633363|  PASSED  
      rgb_lagged_sum|   8|   1000000|     100|0.06794590|  PASSED  
      rgb_lagged_sum|   9|   1000000|     100|0.59024027|  PASSED  
      rgb_lagged_sum|  10|   1000000|     100|0.59285975|  PASSED  
      rgb_lagged_sum|  11|   1000000|     100|0.87178336|  PASSED  
      rgb_lagged_sum|  12|   1000000|     100|0.63401541|  PASSED  
      rgb_lagged_sum|  13|   1000000|     100|0.47202172|  PASSED  
      rgb_lagged_sum|  14|   1000000|     100|0.34616699|  PASSED  
      rgb_lagged_sum|  15|   1000000|     100|0.97221211|  PASSED  
      rgb_lagged_sum|  16|   1000000|     100|0.95576739|  PASSED  
      rgb_lagged_sum|  17|   1000000|     100|0.32367098|  PASSED  
      rgb_lagged_sum|  18|   1000000|     100|0.92792046|  PASSED  
      rgb_lagged_sum|  19|   1000000|     100|0.58128429|  PASSED  
      rgb_lagged_sum|  20|   1000000|     100|0.78197001|  PASSED  
      rgb_lagged_sum|  21|   1000000|     100|0.86068846|  PASSED  
      rgb_lagged_sum|  22|   1000000|     100|0.22496908|  PASSED  
      rgb_lagged_sum|  23|   1000000|     100|0.52387665|  PASSED  
      rgb_lagged_sum|  24|   1000000|     100|0.52748770|  PASSED  
      rgb_lagged_sum|  25|   1000000|     100|0.96442902|  PASSED  
      rgb_lagged_sum|  26|   1000000|     100|0.51298847|  PASSED  
      rgb_lagged_sum|  27|   1000000|     100|0.99123470|  PASSED  
      rgb_lagged_sum|  28|   1000000|     100|0.69774674|  PASSED  
      rgb_lagged_sum|  29|   1000000|     100|0.83646714|  PASSED  
      rgb_lagged_sum|  30|   1000000|     100|0.98573851|  PASSED  
      rgb_lagged_sum|  31|   1000000|     100|0.23580471|  PASSED  
      rgb_lagged_sum|  32|   1000000|     100|0.19150884|  PASSED  
     rgb_kstest_test|   0|     10000|    1000|0.67771558|  PASSED  
     dab_bytedistrib|   0|  51200000|       1|0.07152541|  PASSED  
             dab_dct| 256|     50000|       1|0.53841656|  PASSED  
Preparing to run test 207.  ntuple = 0
        dab_filltree|  32|  15000000|       1|0.09092747|  PASSED  
        dab_filltree|  32|  15000000|       1|0.83382174|  PASSED  
Preparing to run test 208.  ntuple = 0
       dab_filltree2|   0|   5000000|       1|0.37363586|  PASSED  
       dab_filltree2|   1|   5000000|       1|0.26890999|  PASSED  
Preparing to run test 209.  ntuple = 0
        dab_monobit2|  12|  65000000|       1|0.80810458|  PASSED  

I don't have an image to look at to visually verify that there are no obvious patterns. At 1.8 GB, I feel that it would be just a bit too unwieldy anyway. So, I'll need to trust the previous tests for randomness that the data really is random. After these 3 series of tests, I can only conclude that using a Realtek SDR as a HWRNG will generate as "true random" data as you can hope for.

Hardware RNG Through an rtl-sdr Dongle

An rtl-sdr dongle allows you to receive radio frequency signals on your computer through a software interface. You can listen to Amateur Radio, watch analog television, listen to FM radio broadcasts, and a number of other things. I have a friend who uses it to monitor power usage at his house. However, I have a different use: true random number generation.

The theory behind the RNG is to take advantage of radio frequency noise, such as atmospheric noise, which is caused by natural occurrences ranging from weak galactic radiation from the center of our Milky Way Galaxy to the stronger local and remote lightning strikes. It's estimated that roughly 40 lightning strikes hit the Earth every second, which equates to about 3.5 million strikes per 24 hour period. Interestingly enough, this provides a great deal of entropy for a random number generator.

Check out Blitzortung. It is a community run site, where volunteers can setup lightning monitoring stations and submit data to the server. Of course, it isn't an accurate picture of the entire globe, but you can at least get some idea of the scope of lightning strikes around the continents.

Lightning Map of the United States

Unfortunately, however, the rtl-sdr dongle won't get down to the frequencies necessary for sampling atmospheric noise, which are about 100 kHz to 10 MHz, and above 10 GHz. However, it can sample cosmic noise, man-made (urban and suburban) noise, solar noise, thermal noise, and other terrestrial noises that are well within the tuner frequency range of the dongle.

In order to take advantage of this, you obviously need an rtl-sdr dongle. They're quite cheap, about $15 or so, and plug in via USB with an external antenna. Of course, the larger the antenna, the more terrestrial noise you'll be able to observe. With a standard telescoping antenna, I can observe about 3 Mbps of true random data.

The other piece, however, will be compiling and installing the rtl-entropy software. This will provide a FIFO file for observing the random data. Reading the random data can be done as you would read any regular file:

$ sudo rtl_entropy -b -f 74M
$ tail -f /run/rtl_entropy.fifo | dd of=/dev/null
^C8999+10 records in
9004+0 records out
4610048 bytes (4.6 MB) copied, 13.294 s, 347 kB/s

That's roughly 2.8 Mbps. Not bad for $15. Notice that I passed the "-b" switch to detach the process from the controlling TTY and run it in the background. Further, I am not tuning to the default frequency of 70 MHz, which is part of Band I in the North America band plan for television broadcasting. Instead, I am tuning to 74 MHz, which is in the middle of a break in the band plan, where no television broadcasting should be transmitted. Of course, you'll need to make sure you are tuning to a frequency that is less likely to encounter malicious interference. Even though the rtl_entropy daemon has built-in debiasing and FIPS randomness testing, a malicious source could interfere with the operation of the output by transmitting on the frequency that you are listening to.

In order to gain confidence that you have random data, you should send it through a battery of standardized tests for randomness. One popular battery of tests for randomness is the FIPS 140-2 suite. Suppose I create a 512 MB file from my rtl-sdr dongle. I can test it as follows:

$ rngtest < random.img
rngtest 2-unofficial-mt.14
Copyright (c) 2004 by Henrique de Moraes Holschuh
This is free software; see the source for copying conditions.  There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

rngtest: starting FIPS tests...
rngtest: entropy source exhausted!
rngtest: bits received from input: 83886080
rngtest: FIPS 140-2 successes: 4190
rngtest: FIPS 140-2 failures: 4
rngtest: FIPS 140-2(2001-10-10) Monobit: 0
rngtest: FIPS 140-2(2001-10-10) Poker: 1
rngtest: FIPS 140-2(2001-10-10) Runs: 1
rngtest: FIPS 140-2(2001-10-10) Long run: 2
rngtest: FIPS 140-2(2001-10-10) Continuous run: 0
rngtest: input channel speed: (min=174.986; avg=4379.165; max=4768.372)Mibits/s
rngtest: FIPS tests speed: (min=113.533; avg=147.777; max=150.185)Mibits/s
rngtest: Program run time: 560095 microseconds

It's expected to see some failures, but they should be outliers. There is also the Dieharder battery of randomness tests. This will take substantially longer to work through, but it can be done. Here are the first few lines:

$ dieharder -a < random.img 
#            dieharder version 3.31.1 Copyright 2003 Robert G. Brown          #
   rng_name    |rands/second|   Seed   |
        mt19937|  1.30e+08  | 334923062|
        test_name   |ntup| tsamples |psamples|  p-value |Assessment
   diehard_birthdays|   0|       100|     100|0.98331589|  PASSED  
      diehard_operm5|   0|   1000000|     100|0.12201131|  PASSED  
  diehard_rank_32x32|   0|     40000|     100|0.69993313|  PASSED  
    diehard_rank_6x8|   0|    100000|     100|0.55365877|  PASSED  
   diehard_bitstream|   0|   2097152|     100|0.85077208|  PASSED  
        diehard_opso|   0|   2097152|     100|0.76171650|  PASSED  

The whole dieharder results of my 512 MB random file can be found here.

Last, but not least, it helps to observe the data visually. In this image, I created a plain white file in Gimp, that was 600x600 pixels in size. I then counted the number of bytes in that file, and generated an equally sized random binary data file. Finally, I added the bitmap header to the file, converted it to a PNG file, optimized it, and uploaded it here. The steps are as follows:

$ gimp # create 600x600px plain white file and save as 16-bit "white.bmp"
$ ls -l white.bmp | awk '{print $5}'
$ tail -f /run/rtl_entropy.fifo| dd of=random.img bs=1 count=720138 iflag=fullblock
720138+0 records in
720138+0 records out
720138 bytes (720 kB) copied, 24.8033 s, 29.0 kB/s
$ dd if=white.bmp of=random.img bs=1 count=54 conv=notrunc
$ gimp random.img # export to PNG file
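
A variation on the same visual check, which avoids borrowing a BMP header, is to wrap the random bytes in a binary PGM (P5) grayscale header that GIMP and most image viewers can open directly. This is only a sketch in Python, not the procedure used above: the function name random_pgm and the output file name are placeholders, and os.urandom stands in for the dongle's FIFO.

```python
import os


def random_pgm(width: int, height: int, data: bytes) -> bytes:
    """Wrap width*height bytes of data in a binary PGM (P5) grayscale header."""
    if len(data) != width * height:
        raise ValueError("need exactly one byte per pixel")
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    return header + data


if __name__ == "__main__":
    # os.urandom is a stand-in; read 600*600 bytes from /run/rtl_entropy.fifo
    # instead to look at the dongle's output.
    pixels = os.urandom(600 * 600)
    with open("random.pgm", "wb") as fh:
        fh.write(random_pgm(600, 600, pixels))
```

Each byte becomes one gray pixel, so any bias or repeating structure in the data should show up as banding or texture in the image.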

When viewing the output, there should be no obvious patterns in the output. As an example:

Visual representation of random

For more practical use, here is a quick application for generating 80-bit entropy unambiguous passwords:

$ for i in {1..10}; do
> strings /run/rtl_entropy.fifo | grep -o '[a-hjkmnp-z2-9.]' | head -n 16 | tr -d '\n'; echo
> done
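
The 80-bit figure follows from the character class: it contains 32 symbols (the lowercase letters without i, l, and o, the digits 2 through 9, and the period), so each character carries 5 bits and 16 characters carry 80. The Python sketch below checks that arithmetic and mimics the filtering step. It is not an exact reimplementation of the strings/grep pipeline, the names ALPHABET, password_bits, and password_from are mine, and os.urandom is a stand-in for the FIFO.

```python
import math
import os
import string

# Same 32 characters as the grep class [a-hjkmnp-z2-9.].
ALPHABET = "".join(c for c in string.ascii_lowercase if c not in "ilo") + "23456789."


def password_bits(length: int, alphabet: str = ALPHABET) -> float:
    """Entropy in bits of `length` characters drawn uniformly from `alphabet`."""
    return length * math.log2(len(alphabet))


def password_from(source, length: int = 16) -> str:
    """Collect `length` characters by keeping only source bytes that are in ALPHABET."""
    allowed = set(ALPHABET.encode("ascii"))
    out = []
    while len(out) < length:
        for b in source(64):  # source(n) must return n random bytes
            if b in allowed:
                out.append(chr(b))
                if len(out) == length:
                    break
    return "".join(out)


if __name__ == "__main__":
    print(len(ALPHABET), password_bits(16))
    for _ in range(10):
        print(password_from(os.urandom))
```

Discarding bytes outside the alphabet, rather than reducing them with a modulus, keeps every allowed character equally likely.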

Obviously, the practical uses here can be for Monte Carlo simulations, game theory, gambling, cryptography, and other areas where high quality randomness is needed. Unfortunately, I can't seem to get rngd(8) to add the /run/rtl_entropy.fifo file as a hardware device. So, I can't feed the Linux CSPRNG with the dongle, other than "dd if=/run/rtl_entropy.fifo of=/dev/random", which doesn't increase the entropy estimate, of course.

Encrypting Combination Locks

This morning, my family and I went swimming at the community swimming center. Unfortunately, I couldn't find my key-based lock that I normally take. However, I did find my Master combination lock, but couldn't recall the combination. Fortunately, I knew how to find it. I took this lock with me to lock my personal items in the locker while swimming around in the pool.

While swimming, I started thinking about ways to better recall lock combinations in the future. The obvious choice is to encrypt it, so I could engrave the encrypted combination on the lock. However, it needs to be simple enough to do in my head should I temporarily forget it while swimming, and easy enough to recall if I haven't used the lock in a few years. Thankfully, this can be done easily enough with modulo addition and subtraction.

Before beginning, you need a 6-digit PIN that you won't easily forget. Temptingly, dates fit neatly into 6 digits, and something like a birthday or an anniversary is not hard to remember. Unfortunately, if someone knows you, and knows these dates, they can easily reverse the process to open the lock. So, as tempting as dates are, don't use them. Instead, you should probably use a 6-digit PIN that only you would know, and always know. So, knowing this, let's see how this works.

You need to be familiar with modulus math, aka "clock math". The idea is that after a certain maximum, the numbers reset back to 0. For example, 00:00 is midnight while 23:59 is the minute before. As soon as the hour is "24", then it resets back to 0, for a full 24-hour day. You could call telling time "mod 24 math". For combination locks, we're going to be using "mod 40 math" if the maximum number on your combination lock is "40", or "mod 60 math" if the max is "60", and so forth.

Suppose the combination to your lock is "03-23-36", and suppose your 6-digit PIN is "512133". Let's encrypt the combination with our PIN, by using "mod 40 subtraction". We'll use subtraction now, because most people have an easier time with addition than subtraction. When you are trying to rediscover your combination, you'll take your encrypted number, and do "mod 40 addition" to reverse it, and bring it back to the original combination lock numbers.

Here it is in action:

Encrypting the original combination

  03 23 36    <- original combination
- 51 21 33    <- secret PIN
= --------
 -48 02 03
= --------
  32 02 03    <- encrypted after "mod 40"

Because the first number in our combination is "03", and we are subtracting off "51", we end up with "-48". As such, we need to add "40" until our target new number is in the range of [0, 40), or "0 <= n < 40". This gives us "32" as the result. The rest of the numbers fell within that range, so no adjusting was necessary. I can then engrave "32-02-03" on the bottom of the lock, so when I hold the lock up while in a locker, the text is readable. Okay, that's all fine and dandy, but what about reversing it? Taking the encrypted combination, and returning to the original combination? This is where "mod 40 addition" comes in. For example:

Decrypting the encrypted combination

  32 02 03    <- encrypted combination
+ 51 21 33    <- secret PIN
= --------
  83 23 36
= --------
  03 23 36    <- original combination after "mod 40"

Notice that this time, the first number in our "mod 40 addition" is "83". So, we subtract off "40" until our original combination number is in the range of [0, 40), or "0 <= n < 40", just like when doing "mod 40 subtraction" to create the new combination lock values. At worst, you'll have to subtract "40" only three times per number.

One thing to watch out for is that your encrypted combination numbers are far enough away from the original that trying out the encrypted combination won't accidentally open the lock, due to their proximity to the original numbers. If only one number is substantially off, that should be good enough to prevent an accidental opening.

I want to come back to dates, however, and why not to use them. Not only do they fall victim to a targeted attack, but they also have an exceptionally small key space. Assuming there are only 365 days per year, and assuming the attacker has a good idea of your age, plus or minus five years, that's a total of 3,650 keys that must be tried following the common convention of "MM-DD-YY". It could be greatly reduced if the attacker has a better handle on when you were born. If a 6-digit PIN is chosen instead, then the search space has 1,000,000 possible PINs. This is greater than the 64,000 possible combinations a 40-number Master lock could have, which puts the attacker on a brute force search for the original combination, if they aren't aware that Master combination locks can be broken in 8 tries or less.
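
The encryption and decryption steps above each reduce to one line of modular arithmetic. Here is a small Python sketch using the example combination and PIN from this post; the function name shift_combination is mine. Python's % operator already returns a value in [0, 40) for negative inputs, so the "add 40 until in range" step happens implicitly.

```python
def shift_combination(numbers, pin_pairs, modulus=40, decrypt=False):
    """Subtract (encrypt) or add (decrypt) the PIN pairs, wrapping at `modulus`."""
    sign = 1 if decrypt else -1
    return [(n + sign * p) % modulus for n, p in zip(numbers, pin_pairs)]


combo = [3, 23, 36]   # original combination 03-23-36
pin = [51, 21, 33]    # secret PIN 512133, split into pairs

encrypted = shift_combination(combo, pin)
print(encrypted)                                        # [32, 2, 3]
print(shift_combination(encrypted, pin, decrypt=True))  # [3, 23, 36]

# Key spaces discussed above.
print(365 * 10, 10 ** 6, 40 ** 3)                       # 3650 1000000 64000
```

Changing the modulus argument to 60 covers locks whose dial runs to 60.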

The Lagged Fibonacci Generator

Lately, I have been studying pseudorandom number generators (PRNGs, also called "deterministic random bit generators", or DRBGs). I've been developing cryptographically secure PRNGs (CSPRNGs), and you can see my progress on Github. This project is for nothing more than for me to get a feeling for new languages, while also learning a thing or two about applied cryptography. However, for the subject of this post, I want to address one PRNG that is not cryptographically secure: the Lagged Fibonacci Generator.

What drew me to this generator was thinking about a way to have a PRNG to do by hand. I started thinking about different ways to construct a PRNG mathematically. But, before creating an algorithm, I needed to identify all the points that make a good PRNG. A good PRNG should have:

  • An easy implementation.
  • High efficiency in calculating the pseudorandom values.
  • Long (practically un-observable) periods for most, if not all initial seeds.
  • A uniform distribution over the finite space.
  • No correlation between successive values.

I put a great deal of thought into it, but couldn't come up with anything I was very proud of. I thought of using trigonometric functions, various logarithm functions, geometric and algebraic expressions, and even fancy equations using derivatives. The more I thought about it, the further away I drifted from something simple that could be done by hand with pencil and paper.

The best I came up with, which required using a scientific calculator, was forcing the sequence to grow (a monotonically increasing function), then forcing it into a finite field with a modulus. However, no matter what I threw at it, I always struggled with either dealing with "0" or "1". For example, taking the n-th exponent of either "0" or "1" will always return a "0" or "1". I realized quickly that multiplication might be a problem. For example, one thought I had was the following:

S_i = floor((S_{i-1})^(3/2)) mod M

This works out fine, until your output is a "0" or "1", then the generator sits on either of those numbers indefinitely. I realized that my function should probably just stick with addition, or I'm bound to get myself into trouble. I thought, and thought about it, then it hit me. It was my "Beautiful Mind" moment.
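To see the generator get stuck, here is a minimal sketch of that idea. The function name and the choice of seed and modulus are mine, purely for illustration. It uses math.isqrt (Python 3.8 or later) so that the floor of the 3/2 power is computed exactly, without floating point.

```python
from math import isqrt

def naive_prng(seed, modulus, count):
    """Iterate S_i = floor(S_{i-1}^(3/2)) mod modulus, returning count outputs."""
    outputs = []
    s = seed
    for _ in range(count):
        s = isqrt(s ** 3) % modulus   # isqrt(s^3) equals floor(s^(3/2)) exactly
        outputs.append(s)
    return outputs

print(naive_prng(3, 10, 6))   # [5, 1, 1, 1, 1, 1]
```

Starting from 3 with a modulus of 10, the output reaches "1" on the second step and never leaves it.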

I thought of the Fibonacci sequence.

The Fibonacci sequence is monotonically increasing for two seeds S1 and S2, where 0 < S1 < S2. If you put an upper bound on the sequence via a modulus, you can limit it to a finite space, and I can have my PRNG. However, I also know that the ratio between any two successive terms in the Fibonacci sequence approaches the Golden Ratio, Phi. I'm not sure how this would affect my simple PRNG, or whether a correlation between successive digits could be identified, but I started scribbling down numbers on a text pad anyway.

Immediately, however, I found something interesting: If both seeds are even, then the whole sequence of numbers would be even. For example, take the following Fibonacci PRNG:

S1 = 6, S2 = 8, mod 10
6 8 4 2 6 8 4 2 6 8 4 2 ...

There are two problems happening here- first, the period of the PRNG is 4 digits- 6, 8, 4, & 2. Second, because even numbers were chosen for the seeds, even numbers are the only possibility for the PRNG. So, either one of the seeds or the modulus must be odd, or the PRNG algorithm needs to be modified.
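The "all evens" behavior is quick to reproduce. The following is a small sketch of the naive Fibonacci generator described above; the function name is my own.

```python
def fibonacci_prng(s1, s2, modulus, count):
    """Return the first count terms of a Fibonacci sequence reduced mod modulus."""
    outputs = [s1, s2]
    while len(outputs) < count:
        outputs.append((outputs[-2] + outputs[-1]) % modulus)
    return outputs[:count]

print(fibonacci_prng(6, 8, 10, 12))   # [6, 8, 4, 2, 6, 8, 4, 2, 6, 8, 4, 2]
print(fibonacci_prng(6, 7, 10, 12))   # one odd seed: odd digits now appear
```

With both seeds even, the sum of two even numbers is always even, so an even modulus can never produce an odd digit.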

At this point, I threw my hands up in the air, and said "screw it". I decided to see what history had discovered with simple PRNGs. Turns out, I wasn't far off. A Fibonacci sequence PRNG exists called the Lagged Fibonacci Generator. Here is how it works:

S_n = S_{n-j} ⊙ S_{n-k} mod M, 0 < j < k

Where "⊙" is any binary function, such as addition, subtraction, multiplication, or even the bitwise exclusive-or.

First off, it doesn't address the "all evens" problem with my naive generator. If addition is used to calculate the values, then at least one number in the seed must be odd. If multiplication is used, then at least k elements must be odd. However, what is interesting about this generator is that rather than picking the first and second elements of the list to calculate the random value (S_{n-1} and S_{n-2}), any j-th and k-th items in the list can be used (S_{n-j} and S_{n-k}). However, you must have at least k elements in the list as your seed before beginning the algorithm.

To simplify things, let's pick "j=3" and "k=7" with mod 10 addition. I need at least seven elements in the list, and at least one of them must be odd. I've always liked the phone number "867-5309", so let's use that as our seed. Thus, the first 10 steps of our generator would look like this:

j=3, k=7, mod 10 addition

        [j]       [k]
 1. 8 6 [7] 5 3 0 [9] => 7+9 = 6 mod 10
 2. 6 7 [5] 3 0 9 [6] => 5+6 = 1 mod 10
 3. 7 5 [3] 0 9 6 [1] => 3+1 = 4 mod 10
 4. 5 3 [0] 9 6 1 [4] => 0+4 = 4 mod 10
 5. 3 0 [9] 6 1 4 [4] => 9+4 = 3 mod 10
 6. 0 9 [6] 1 4 4 [3] => 6+3 = 9 mod 10
 7. 9 6 [1] 4 4 3 [9] => 1+9 = 0 mod 10
 8. 6 1 [4] 4 3 9 [0] => 4+0 = 4 mod 10
 9. 1 4 [4] 3 9 0 [4] => 4+4 = 8 mod 10
10. 4 4 [3] 9 0 4 [8] => 3+8 = 1 mod 10

Generated: 6 1 4 4 3 9 0 4 8 1

The following Python code should verify our results:

j = 3
k = 7
s = [8, 6, 7, 5, 3, 0, 9]  # the seed, which needs at least k elements
out = []
for n in range(10):
    x = (s[j-1] + s[k-1]) % 10  # the pseudorandom output
    s = s[1:] + [x]  # shift the array and append the new value
    out.append(x)
print(*out)  # print the result

Running it verifies our results:

$ python
6 1 4 4 3 9 0 4 8 1

It's a "lagged" generator, because "j" and "k" lag behind the generated pseudorandom value. Also, this is called a "two-tap" generator, in that you are using 2 values in the sequence to generate the pseudorandom number. However, a two-tap generator has some problems with randomness tests, such as the Birthday Spacings. Apparently, creating a "three-tap" generator addresses this problem. Such a generator would look like:

S_n = S_{n-j} ⊙ S_{n-k} ⊙ S_{n-l} mod M, 0 < j < k < l
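As a rough sketch, an additive generator with any number of taps can be written as a Python generator over a fixed-size window. The function name is my own, and the lags (3, 5, 7) below are picked only to illustrate the mechanics; I make no claim that they are good taps. Note that this follows the formula's indexing, where a lag counts backwards from the newest value, so its output will differ from the hand-worked table above, which marks positions counted from the front of the list.

```python
from collections import deque
from itertools import islice

def lagged_fibonacci(seed, lags, modulus):
    """Yield S_n = (sum of S_{n-lag} over all lags) mod modulus, forever."""
    longest = max(lags)
    if len(seed) < longest:
        raise ValueError("seed needs at least max(lags) elements")
    # Keep only the most recent `longest` values; state[-1] is S_{n-1}.
    state = deque(seed[-longest:], maxlen=longest)
    while True:
        value = sum(state[-lag] for lag in lags) % modulus
        state.append(value)   # the oldest value falls off the left end
        yield value

three_tap = lagged_fibonacci([8, 6, 7, 5, 3, 0, 9], lags=(3, 5, 7), modulus=10)
print(list(islice(three_tap, 10)))
```

Using lags of (1, 2) with a two-element seed turns this back into the plain Fibonacci generator.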

Even though this generator isn't cryptographically secure (hint: it's linear), it meets the above requirements for a good PRNG, provided the "taps" are chosen carefully (the lags are exponents of a primitive polynomial), and the modulus is our traditional "power-of-2" (2^M, such as 2^32 or 2^64). Supposing we are using a two-tap LFG, it would have a maximum period of:

(2^k - 1) * k         if exclusive-or is used
(2^k - 1) * 2^(M-1)   if addition or subtraction is used
(2^k - 1) * 2^(M-3)   if multiplication is used (1/4 of period of the additive case)

For a good LFG, it is found that a three-tap generator should be used, as a 3-element spacing correlation can be found in two-tap generators, and that initial taps should be very high for a large modulus. Further, the full mathematical theory hasn't been worked out on Fibonacci generators, so the quality of the generators rests mostly on the statistics of the generated output, and randomness tests.

However, this is simple enough to do by hand, if nothing else than to impress your friends.

Financially Supporting Open Crypto

In April 2014, Heartbleed shook the Internet. OpenSSL had introduced a feature called "TLS Heartbeats". Heartbeats allow an encrypted session to remain open between the client and the server, without the need to renegotiate a new connection. In theory, the feature is sound. Heartbeats should minimize load on busy servers, and improve responsiveness on the client. However, due to a simple oversight in the code, buffers could be over-read, allowing the client to request much more data from the server's memory than needed. As a result, usernames and passwords cached in the server's memory could be leaked to the client.

This was a nasty bug, and it underscored how under-staffed and under-funded the OpenSSL development team is. OpenSSL is the de facto standard in securing data in motion for the Internet. It protects your web connections when visiting your bank's website, and it protects your email communication between your email client and the upstream mail server.

Ars Technica started off an article about tech giants finally agreeing to fund the OpenSSL development. Quote:

The open source cryptographic software library secures hundreds of thousands of Web servers and many products sold by multi-billion-dollar companies, but it operates on a shoestring budget. OpenSSL Software Foundation President Steve Marquess wrote in a blog post last week that OpenSSL typically receives about $2,000 in donations a year and has just one employee who works full time on the open source code.

If that isn't bad enough, Werner Koch, the sole developer and maintainer of the encryption software "GnuPG" is in much the same position as Steve Marquess. ProPublica put up a post about the very sobering financial situation of GnuPG. Quote:

The man who built the free email encryption software used by whistleblower Edward Snowden, as well as hundreds of thousands of journalists, dissidents and security-minded people around the world, is running out of money to keep his project alive.

Werner Koch wrote the software, known as Gnu Privacy Guard, in 1997, and since then has been almost single-handedly keeping it alive with patches and updates from his home in Erkrath, Germany. Now 53, he is running out of money and patience with being underfunded.

To understand just how critical this piece of software is to the Internet and the community at large, OpenPGP (the specification upon which GnuPG is built) is used by software developers around the world to prove the integrity of their software, when downloading it from their website. It's used by operating system vendors, such as Microsoft, Apple, Google, and GNU/Linux to provide package integrity when installing "apps" on your computer or mobile device. People and corporations have used it internally for data at rest as well, such as encrypting backups before sending them offsite.

Thankfully, after ProPublica published their article, Werner Koch, father and husband, got the donation funding he needed to continue focusing on it full time. Thanks to Facebook and Stripe, he has $100,000 of annual sponsored donations to help keep the development of GnuPG pressing forward.

Why is it that the two most fundamental cryptographic tools in our community are so under developed, under funded, and under staffed? I can understand that cryptography is hard. There is a reason why people get doctorate degrees in mathematics and computer science to understand this stuff. But with such critical pieces of infrastructure protection, you would think it would be getting much more attention than it is.

A good rule of thumb for cryptography, is if you want to protect your data in transit, use OpenSSL; if you want to protect your data at rest, use GnuPG. Let's hope that these two projects get the attention and funding they need to continue well into the future for years to come.

If you want to help donate to these two projects, you can donate to GnuPG here and to OpenSSL here. Alternatively, there is a Flattr donation page for GnuPG where you can setup recurring donations here.

Reasonable SSH Security For OpenSSH 6.0 Or Later

As many of you have probably seen, Stribik András wrote a post titled Secure Secure Shell. It's made the wide rounds across the Internet, and has seen a good, positive discussion about OpenSSH security. It's got people thinking about their personal SSH keys, as well as the differences between ECC and RSA, why the /etc/ssh/moduli file matters, and other things. Because of that post, many people who use SSH are increasing their security when they get online.

However, the post does one disservice- it requires OpenSSH 6.5 or later. While this is good, and people should be running the latest stable release, there are many, many older versions of OpenSSH out there, that are still supported by the distro, such as Debian GNU/Linux 7.8, which ships OpenSSH 6.0. Most people will be using the release that ships with their distro.

As a side note, CentOS 5 ships OpenSSH 4.3, and CentOS 6 ships OpenSSH 5.3. Because these are very old releases, and CentOS is still providing support for them, you will need to check the man pages for OpenSSH, and see how your client and server configurations need to be adjusted. It won't be covered here.

So, with that in mind, let's look at OpenSSH 6.0, and see what it supports.

OpenSSH 6.0 Ciphers

The following is the default order for symmetric encryption ciphers:

  1. aes128-ctr
  2. aes192-ctr
  3. aes256-ctr
  4. arcfour256
  5. arcfour128
  6. aes128-cbc
  7. 3des-cbc
  8. blowfish-cbc
  9. cast128-cbc
  10. aes192-cbc
  11. aes256-cbc
  12. arcfour

CTR mode should be preferred over CBC mode, whenever possible. It can be executed in parallel, and it seems to be the "safer" choice over CBC, although its security margin over CBC is probably minimal. The internal mechanisms are simpler, which is why modes like EAX and GCM use CTR internally. With that said, CBC mode is not "unsafe", so there is no strong security argument to avoid it. However, both modern and older OpenSSH implementations support CTR mode, so there really is no need for CBC.

The "arcfour" protocols are "alleged RC4", but adhere to the RC4 RFC. RC4 has been showing weaknesses lately. Cryptographers have been advising to move off of it, PCI vendors will fail scans with SSL implementations that support RC4, and OpenBSD 5.5 switched to a modified ChaCha20 for its internal CSPRNG. So, it's probably a good idea to move away from the arcfour ciphers, even if it may not be practically broken yet.

However, arcfour is really the only high performance cipher in the OpenSSH 6.0 suite, and is very handy when trying to transfer many gigabytes of data over the network, as AES will pin the CPU before flooding the pipe (unless of course you have hardware AES on board). So, I would recommend the arc4 ciphers as a last resort, and only enable them on private networks, where you need the throughput.

The cast128 cipher is a Canadian standard, and its successor, CAST-256, was an AES candidate. To my knowledge, it does not have any near practical security attacks. However, because only CBC mode is supported with CAST, and not CTR mode, and we're disabling CBC mode, it is not included in our final list.

3DES was designed to address the short 56-bit key size of DES, which was later replaced by AES. 3DES cascades DES three times, with three distinct 56-bit keys. 3DES also does not have any near practical security attacks, and it is believed to be secure. However, DES was designed with hardware in mind, and is slow, slow, slow in software, and 3DES is three times as slow. It's horribly inefficient. As such, I would recommend disabling 3DES.

Blowfish was designed by Bruce Schneier as a replacement for DES. While Blowfish might still have a considerable security margin, Blowfish suffers from attacks from weak keys. As such, Blowfish implementations must be careful when selecting keys. Blowfish can be efficient in both hardware and software, but it's usually less efficient than AES. Further, Bruce himself recommends that people stop using Blowfish and move to its successor Twofish, or even Threefish. As such, because both stronger and more efficient algorithms exist, I would recommend disabling Blowfish. It really isn't offering anything to OpenSSH clients.

So, in my opinion, I would sort my OpenSSH 6.0 ciphers like so:

  1. aes256-ctr
  2. aes192-ctr
  3. aes128-ctr
  4. arcfour256
  5. arcfour128
  6. arcfour
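The sorting above can be expressed as a small filter over the default list. This is only a sketch of the reasoning in this section (drop CBC, prefer larger AES keys, keep arcfour as the last resort). It is not something OpenSSH provides, and the function name is my own.

```python
DEFAULT_CIPHERS = [
    "aes128-ctr", "aes192-ctr", "aes256-ctr", "arcfour256", "arcfour128",
    "aes128-cbc", "3des-cbc", "blowfish-cbc", "cast128-cbc", "aes192-cbc",
    "aes256-cbc", "arcfour",
]

def preferred_ciphers(defaults):
    """Drop CBC ciphers, put AES-CTR first (largest key first), arcfour last."""
    no_cbc = [c for c in defaults if not c.endswith("-cbc")]
    aes_ctr = [c for c in no_cbc if c.startswith("aes") and c.endswith("-ctr")]
    aes_ctr.sort(key=lambda c: int(c[3:6]), reverse=True)  # "aes256-ctr" -> 256
    arcfour = [c for c in no_cbc if c.startswith("arcfour")]
    return aes_ctr + arcfour

print("Ciphers " + ",".join(preferred_ciphers(DEFAULT_CIPHERS)))
```

The printed line has the same form as the "Ciphers" directive used in the configs later in this post.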

OpenSSH 6.0 Key Exchange

The following is the default order for key exchange algorithms:

  1. ecdh-sha2-nistp256
  2. ecdh-sha2-nistp384
  3. ecdh-sha2-nistp521
  4. diffie-hellman-group-exchange-sha256
  5. diffie-hellman-group-exchange-sha1
  6. diffie-hellman-group14-sha1
  7. diffie-hellman-group1-sha1

The NIST curves are considered to be insecure. Not because it's some government agency tied with the NSA, but because the curves are not ECDLP rigid, and suffer from a lack of constant-time single-coordinate single-scalar multiplication, they aren't complete, and are distinguishable from uniform random strings. If you want to blame the NSA for rubber-stamping and backdooring the NIST ECC curves, fine. I'll stick with the crypto.

And, although the security margin gap is closing on SHA-1, some commercial SSH providers, such as Github may still require it for your SSH client. So, in your client config, I would put the preference on SHA-256 first, followed by SHA-1. On your own personal servers, you can disable the SHA-1 support completely.

Thus, I would recommend the following key exchange order:

  1. diffie-hellman-group-exchange-sha256
  2. diffie-hellman-group-exchange-sha1
  3. diffie-hellman-group14-sha1
  4. diffie-hellman-group1-sha1

OpenSSH 6.0 Message Authentication Codes

The following is the default order for message authentication codes:

  1. hmac-md5
  2. hmac-sha1
  3. umac-64@openssh.com
  4. hmac-ripemd160
  5. hmac-sha1-96
  6. hmac-md5-96
  7. hmac-sha2-256
  8. hmac-sha2-256-96
  9. hmac-sha2-512
  10. hmac-sha2-512-96

Things get interesting here, because with HMAC algorithms, successful attacks require breaking the preimage resistance on the cryptographic hash. This requires a complexity of 2^n, where "n" is the output digest size in bits. MD5 is 128-bits, and SHA-1 is 160-bits. All currently known attacks on MD5 and SHA-1 are collision attacks, and not preimage attacks. Collision attacks require a complexity of only 2^(n/2). Thus, for MD5, collision attacks require a complexity of only 64-bits at worst, and SHA-1 requires 80-bits. However, as we know now, MD5 collision resistance is fully broken in practical time with practical hardware. SHA-1 still remains secure, although its collision resistance has been weakened to 61-65-bits. This is almost practical.
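To make the numbers concrete, here is a short sketch that tabulates the generic attack costs for a few digest sizes. These are the textbook 2^n and 2^(n/2) bounds only, not the cost of the best known attack on any particular hash, and the function name is my own.

```python
def generic_attack_bits(digest_bits):
    """Return (preimage, collision) work factors as powers of two."""
    return digest_bits, digest_bits // 2

for name, n in [("MD5", 128), ("SHA-1", 160), ("RIPEMD-160", 160),
                ("SHA-256", 256), ("SHA-512", 512)]:
    preimage, collision = generic_attack_bits(n)
    print(f"{name}: preimage 2^{preimage}, collision 2^{collision}")
```

The first two lines of output give the 64-bit and 80-bit collision figures for MD5 and SHA-1 mentioned above.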

Regardless, HMAC-MD5 and HMAC-SHA1 remain secure, with wide security margins, due to their preimage resistance. The only concern, however, is that in order to successfully break the preimage resistance of a cryptographic hash function, it requires first breaking its collision resistance. Because MD5 is broken in this regard, and SHA-1 is almost broken, it is advised to move away from any protocol that relies on MD5 or SHA-1. As such, even though HMAC-MD5 and HMAC-SHA1 remain very secure today, it would be best to disable their support. Interestingly enough, even though RIPEMD-160 has the same digest output space as SHA-1, it has no known collision weaknesses, and remains secure today, almost 20 years since its introduction.

Due to the almost practical collision attacks on SHA-1 with a complexity of 61-65 bits, UMAC-64 probably does not have a wide enough security margin. As such, it should probably be disabled.

I would recommend the following order for your MACs:

  1. hmac-sha2-512
  2. hmac-sha2-256
  3. hmac-ripemd160

OpenSSH 6.0 Configuration

Okay. Now that we have everything ironed out for hardening our OpenSSH 6.0 connections, let's see how this would look in the client and on the server. Both the client config and the server config should support algorithms for both OpenSSH 6.0 and 6.7.

For an OpenSSH 6.0 client, I would recommend this config:

# OpenSSH 6.0 client config
Host *
    Ciphers aes256-ctr,aes192-ctr,aes128-ctr,arcfour256,arcfour128,arcfour
    KexAlgorithms diffie-hellman-group-exchange-sha256,diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha1,diffie-hellman-group1-sha1
    MACs hmac-sha2-512,hmac-sha2-256,hmac-ripemd160

For an OpenSSH 6.0 server, I would recommend this config:

# OpenSSH 6.0 server config
Ciphers aes256-ctr,aes192-ctr,aes128-ctr,arcfour256,arcfour128,arcfour
KexAlgorithms diffie-hellman-group-exchange-sha256
MACs hmac-sha2-512,hmac-sha2-256,hmac-ripemd160

Going back now to Stribik András' post, here is what your configurations would look like for OpenSSH 6.7:

For an OpenSSH 6.7 client, I would recommend this config. Further, ChaCha20-Poly1305 is a high performance cipher, similar to RC4. So we should prefer it as our first cipher, with AES following, and finally disabling RC4:

# OpenSSH 6.7 client config
Host *

For an OpenSSH 6.7 server, I would recommend this config (also disabling SHA-1 from the key exchanges):

# OpenSSH 6.7 server config


It's important that you pay attention to the versions of the clients and servers that you are using, so you can accurately set your configuration. In this case, we looked at what would be necessary to support OpenSSH versions 6.0 and 6.7. There may be slight differences in versions between those two, and you'll need to make the necessary adjustments.

Verifying Keybase Identities

When using Keybase, occasionally, people will track your identity. This has cryptographic value. Your identity on Keybase is based on what you do online and how long you have done it. As people track you, they cryptographically sign your Keybase identity. This creates a snapshot in time stating that they have taken the precautions to verify your identity, by checking the digital signature of each of your online proofs. The snapshot is frozen in time, and the more people who track your identity, the stronger the statement of its validity. In other words, Keybase complements the PGP Web of Trust, without actually replacing key signing parties, or actually signing PGP keys.

In this post, I want to discuss what it takes to verify signatures of Keybase identity proofs, so you can verify that Keybase isn't doing anything sneaky with the data. As an example, I am going to verify the identity proofs of a friend of mine, Joshua Galvez, checking each proof out-of-band (not using the Keybase client software).

First, all identity proofs are stored in JSON, which is a standardized format. The JSON object is cleanly formatted for easy readability, so you can examine what has been signed, and exactly what you are verifying. Nothing should be hidden up Keybase's sleeves. To start, I am going to navigate to Josh's Keybase identity page. I see that he has proved he owns a Twitter account, a Github account, a reddit account, and a personal website, all with his personal OpenPGP key.

To verify the proofs, I need to get a physical copy of the statement. Again, I am going to do this all out-of-band, away from the Keybase client software. As such, I'll copy and paste each statement proof into a text editor, and save it to disk, as well as each PGP signature. I'll do this with his Twitter account as an example.

Because of the brevity of Twitter, a full JSON object with a PGP signature can't be sent. So, Keybase keeps this proof on their server, with a link in the tweet pointing to the proof. So, we'll need to get it there. The linked page has a "Show the proof" link, which gives me all the necessary data for verifying his identity. All I need is his JSON object and his PGP signature. I need to combine them in a single file, and save it to disk. As such, my file will look like this:

{
   "body": {
      "client": {
         "name": " node.js client",
         "version": "0.7.3"
      },
      "key": {
         "fingerprint": "12c5e8619f36b0bb86b5be9aea1f03e20cf2fdbd",
         "host": "",
         "key_id": "EA1F03E20CF2FDBD",
         "uid": "2b26e905f5b23528d91662374e840d00",
         "username": "zevlag"
      },
      "service": {
         "name": "twitter",
         "username": "zevlag"
      },
      "type": "web_service_binding",
      "version": 1
   },
   "ctime": 1416507777,
   "expire_in": 157680000,
   "prev": null,
   "seqno": 1,
   "tag": "signature"
}
Version: GnuPG/MacGPG2 v2.0.22 (Darwin)
Comment: GPGTools -


I'll save this to disk as /tmp/zevlag-twitter.txt

Now, I just need Josh's public PGP key imported from a key server. I can, and should, use Keybase here. Instead of using the MIT PGP key server, and running the risk of getting the wrong key, I can be reasonably confident I will get the correct key from Keybase. The raw public key can be accessed by appending "key.asc" at the end of their identity URL. So, I'll grab it via the shell:

$ wget -O - 2> /dev/random | gpg --import -

Now that I have Josh's public key imported into my GPG public key ring, I am ready to verify Josh's Twitter proof of identity:

$ gpg --verify /tmp/zevlag-twitter.txt
gpg: Signature made Thu 20 Nov 2014 11:23:23 AM MST using RSA key ID B7691E80
gpg: Good signature from "Joshua Galvez <>"
gpg:                 aka "Joshua Galvez (Work - Emery Telcom) <>"
gpg:                 aka " <>"
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.
Primary key fingerprint: 12C5 E861 9F36 B0BB 86B5  BE9A EA1F 03E2 0CF2 FDBD
     Subkey fingerprint: DC35 E3CF 1179 41A9 7D72  BC9A 7B6C D794 B769 1E80
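As an extra sanity check, the signature time reported by gpg can be compared with the timing fields inside the signed JSON. This sketch assumes that "ctime" is a Unix timestamp in seconds and that "expire_in" is a lifetime in seconds. That is my reading of the field names, not something I have confirmed against Keybase documentation.

```python
from datetime import datetime, timedelta, timezone

ctime = 1416507777       # "ctime" from the proof (assumed to be Unix seconds)
expire_in = 157680000    # "expire_in" from the proof (assumed to be seconds)

created = datetime.fromtimestamp(ctime, tz=timezone.utc)
lifetime = timedelta(seconds=expire_in)

print(created.isoformat())   # 2014-11-20T18:22:57+00:00
print(lifetime.days)         # 1825, which is five 365-day years
```

18:22:57 UTC is 11:22:57 AM MST, about half a minute before the 11:23:23 AM MST signature time that gpg reports, which is what I would expect if the JSON was built and then signed.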

At this point, I can confirm that the owner of the private key for 0xEA1F03E20CF2FDBD cryptographically signed a JSON object for Twitter. Further, that individual has access to the Twitter account, so the signature can be posted. After verifying the other accounts, I can be reasonably confident that the individual is who they claim- Josh Galvez. Otherwise, an attacker has successfully compromised all of Josh Galvez's online accounts, as well as his OpenPGP key (or forged a new one), and either compromised his Keybase account, or created one masquerading as him. The former seems more likely than the latter. Further, because I have previously met with and engaged online with Josh, I have no doubt that this is indeed Josh Galvez, and 0xEA1F03E20CF2FDBD is indeed his public key.

So, I can now track Josh through Keybase, which means cryptographically signing his Keybase identity, and creating a snapshot in time that says "I am reasonably sure this is Josh Galvez, these accounts are part of his online presence, and 0xEA1F03E20CF2FDBD is his OpenPGP key." Staying out of band from the Keybase client software, I can do this entirely with curl(1) and gpg(1).

Navigating to his Keybase identity, I'll click the "Track zevlag" button. A pop-up displays with the following options:

  • in the browser
  • command line with keybase
  • command line with [bash + GPG + cURL]

I have not integrated an encrypted copy of my private key with Keybase, so tracking Josh in the browser is unavailable to me. Further, I wish to do this out-of-band from Keybase anyway, so I'll select "command line with [bash + GPG + cURL]" and click "Continue". This displays that I need to copy and paste the following content into my shell:

echo '{"body": (... large JSON object snipped ....) }' | \
gpg -u 'e0413539273a6534a3e1925922eee0488086060f' -a --sign | \
perl -e '$_ = join("", <>); s/([^\w\.@ -])/sprintf("%%%2.2x",ord($1))/eg; s/ /+/g; print("sig=", $_)' | \
curl -d @- \
  -d type=track \
  -d session=lgHZIDg3ZWNjY2NiNTRiMTBiNThjOTQ2NDJhODA3MzM2NjAwzlSh4WnOAeEzgNkgZjZmNWVmZDg4YzcwZDI2NDNlZGY2ZWYyYTc3M2IyMDLEIM0QqHGrtfga4a%2Bnz7soXFHqFbbiio7PaVGjh7DfyyPG \
  -d csrf_token=lgHZIDg3ZWNjY2NiNTRiMTBiNThjOTQ2NDJhODA3MzM2NjAwzlSkLg7OAAFRgMDEIA8egS4XVUzH%2BkPY8pMJbmMFiN3%2BAdZEdTm7Buvm551L \
  -d plain_out=1 \
  -d uid=2b26e905f5b23528d91662374e840d00

After entering that into my shell, and hitting enter, I am prompted for the passphrase for my private key, which in turn signs the object, and uses the Keybase API to post the result. I can then reload my profile, and see that I am now tracking Josh with Keybase. This means that at this point in time, I have made a cryptographic statement regarding the key ownership and identity of Joshua Galvez. Of course, I can revoke that statement at any time, if for any reason I believe his account has become compromised, he himself has become untrustworthy, or for other reasons.
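The only cryptic part of that pipeline is the Perl one-liner, which just form-encodes the armored signature before curl posts it. Here is my own Python rendering of the same two substitutions, so you can see what it does to the data. The function name is hypothetical, and this is meant for reading along, not as a replacement for the command Keybase gives you.

```python
import re

def perl_style_urlencode(text):
    """Mimic the Perl one-liner: percent-encode, then turn spaces into '+'."""
    # Anything that is not a word character, '.', '@', space or '-' becomes
    # a lowercase two-digit hex escape, as with sprintf("%%%2.2x", ord($1)).
    encoded = re.sub(
        r"[^\w.@ -]",
        lambda m: "%%%02x" % ord(m.group(0)),
        text,
        flags=re.ASCII,
    )
    return "sig=" + encoded.replace(" ", "+")

print(perl_style_urlencode("a b+c/d=\n"))   # sig=a+b%2bc%2fd%3d%0a
```

Nothing secret happens in this step. The signature is only made safe to send as a form field named "sig".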