Image of the glider from the Game of Life by John Conway
Skip to content

SHA512crypt Versus Bcrypt

On the Internet, mostly in crypto circles, you'll see something like the following in a comment, forum post, on a mailing list, other otherwise:

Do not use fast hashes to store passwords on disk. Use bcrypt.

In most cases, however, the understanding of why to use bcrypt isn't entirely clear. You'll hear the standard answer "It's slow", without a real understand as to "how slow?" nor as to "why is it slow?". In this post, I want to explain why bcrypt is slow, some misconceptions about using fast hashes, and where the real strength of bcrypt lies (hint- it's not speed). Finally, I'll close with an alternative that many are starting to talk about as a possible replacement to bcrypt.

First, when people are talking about using bcrypt for password hashing, they are referring to the bcrypt cryptographic key derivation function, designed by Niels Provos and David Mazières. Bcrypt is designed to be intentionally slow and expensive. It was designed specifically with password storage in mind. The motivation is clear- if a password database of any kind is leaked to the Internet, it should be cost prohibitive for password crackers to make any sort of progress recovering the unknown passwords from the known hashes.

bcrypt algorithm
How does bcrypt work though? What is the algorithm? According to the paper, the core bcrypt function in pseudocode is as follows:

bcrypt(cost, salt, input)
    state = EksBlowfishSetup(cost, salt, input)
    ctext = "OrpheanBeholderScryDoubt" //three 64-bit blocks
    repeat (64)
        ctext = EncryptECB(state, ctext) //encrypt using standard Blowfish in ECB mode
    return Concatenate(cost, salt, ctext)

The first function, "EksBlowfishSetup(cost, salt, input)" in the algorithm is defined as follows:

EksBlowfishSetup(cost, salt, key)
    state = InitState()
    state = ExpandKey(state, salt, key)
    repeat (2^cost) // exponential cost by powers of 2
        state = ExpandKey(state, 0, key)
        state = ExpandKey(state, 0, salt)
    return state

In the "EksBlowfishSetup", you'll notice the "repeat" step uses a binary exponential parameter. As the cost is increased, the time it will take to finish the algorithm will take exponentially longer. Bcrypt was designed with this cost parameter to adjust for Moore's law. As computing strength continues to improve, bcrypt should be flexible in its design to adjust for those advancements. This is why the cost parameter is baked into bcrypt, and why people call it "slow".

Finally, you'll notice the "ExpandKey(state, salt, key)" function in the algorithm. It is defined as follows:

ExpandKey(state, salt, key)
    for(n = 1..18)
        P_n  key[32(n-1)..32n-1] XOR P_n //treat the key as cyclic
    ctext = Encrypt(salt[0..63])
    P_1 = ctext[0..31]
    P_2 = ctext[32..63]
    for(n = 2..9)
        ctext = Encrypt(ctext XOR salt[64(n-1)..64n-1]) //encrypt using the current key schedule and treat the salt as cyclic
        P_2n-1) = ctext[0..31]
        P_2n = ctext[32..63]
    for(i = 1..4)
        for(n = 0..127)
            ctext = Encrypt(ctext XOR salt[64(n-1)..64n-1]) //as above
            S_i[2n] = ctext[0..31]
            S_i[2n+1] = ctext[32..63]
    return state

Because bcrypt was based on Blowfish, the "ExpandKey(state, 0, key)" function used in the "EksBlowfishSetup" function is the same as regular Blowfish key schedule since all XORs with the all-zero salt value are ineffectual. The bcrypt "ExpandKey(state, 0, salt)" function is similar, but uses the salt as a 128-bit key.

Also, to clarify, a 128-bit salt is also baked into the algorithm, as you can see. This is to prevent the building of lookup tables for bcrypt, such as rainbow tables. Salts do not slow down crackers, and it's assumed that salts will be leaked with the database. All salts provide is the protection against using a hash lookup table to find the originating plaintext. Because salts are baked into bcrypt, bcrypt lookup tables will never exist. This forces password crackers to brute force the hash.

Understanding password security
There are a few key security elements related to passwords that you must understand. They are the following:

  1. The unpredictability measurement, aka "entropy", of the password provided by the user.
  2. The speed at which brute forcing passwords can commence.
  3. The cryptographic strength of the function itself.

I ordered these for a specific reason- the most likely "weak link" in the chain of password security is password the user provides. History of leaked password databases have shown us that. If users understood real strength behind passwords, they would understand the basic concepts of entropy, even if they weren't familiar with the term itself. If entropy levels were high in all user's passwords, no matter what, then the success of recovering passwords from hashes via brute force would be ineffective. But, 70-80%, and better, of password databases are recovered, because of this simple concept not getting applied. The speed at which password crackers brute forced their way through the hashes in the database would no longer matter, because no amount of practical computing power would be able to work fast enough within the death of the Universe, to recover the user's password.

Sadly, this just isn't the case. People suck as picking passwords.

Key stretching
So, we need to compensate for users picking bad passwords, and bcrypt makes a great leap in this regard. Because of the cost parameter which is part of the algorithm, we can adjust the cost to make password hashing intentionally slow. And, as computing power increases, the cost parameter can continue to be adjusted to compensate. This is what most people understand, when they claim that "bcrypt is slow".

The argument is that cryptographic hashes are designed to be fast, fast, fast. And they're right. Cryptographic hash functions are designed to provide data integrity regardless of the size of the input. If I have a 4.7 GB CD image, I should be able to calculate its digest in reasonable time, so when I transfer the image to another computer, I can recalculate the digest, and compare that the two digests match, in reasonable time.

This would seem like a Bad Thing for password storage, because passwords are short (much shorter than 4.7 GB at least), so password crackers would be able to guess millions or billions of passwords per second using a fast cryptographic hash. You're right, for the most part. Ars Technica ran a story on password cracking with a 25-GPU cluster. It achieves a speed of 350 billion NTLM passwords per second, which means every conceivable Windows NTLM password can be recovered in less than 6 hours using this behemoth. It can work MD5 at 180 billion per second, or 63 billion per second with SHA-1.

At these speeds, the argument against using fast cryptographic hashes to store passwords is sounding pretty strong. Except, that Ars Technica article, and most bcrypt fanboys seem to overlook one thing- key stretching. Just because cryptographic hashes are fast, doesn't mean we can't intentionally slow them down. This is key stretching.

Key stretching is the idea that you reuse internal state for calculating a new key. For cryptographic hash functions, this is "iterations" or "rotations". The idea is taking the cryptographic digest of the input, and using this digest as the input for another hashing round. In pseudocode, you could think of it like this:

salt = random() // programmatically determine a salt randomly
password = input() // get the password from the user
key = ''
cost = 5000

for ROUND in 1 to cost: do
    digest = SHA512(salt, password,  key)
    key = digest

If our 25-GPU cluster could work through 50 billion SHA-512 cryptographic hashes per second, by forcing 5,000 SHA-512 calculations before getting to the desired hash, our 25-GPU cluster can now only work through 10 million SHA-512 hashes per second. As the iterative count is increased, the time it takes to calculate the resulting digest increases. As such, we have created a "sha512crypt" that has a similar cost parameter as bcrypt. Now the question remains- does it hold up?

Practical examples
To see if this "key stretching" idea holds up, I wrote two Python scripts- one using SHA-512, and the other using bcrypt. In both cases, I increase the cost parameter from a reasonable starting point, and increased it well beyond a reasonable expectation.

Here is my Python code for "":

import hashlib
password = b'password'
cost = 5000
key = ''
m = hashlib.sha512()
for i in xrange(cost):
    key = m.digest()

And here is my Python code for "":

import bcrypt
cost = 6
password = b'password'
salt = bcrypt.hashpw(password,bcrypt.gensalt(cost))
hash = bcrypt.hashpw(password, salt)

In both cases, I incremented the cost, then timed re-running the script. Of course, Python is an interpreted language, so the absolute times would be much lower if this were implemented in C, or assembly. Further, this was done on my aging T61 laptop. Running it on my 8-core i7 workstation with triple-channel DDR3 would show improved times. It not the times that are critical. What is critical is seeing the exponential back-off as the cost is increased.

Here is a table detailing my findings. Notice that the bcrypt cost increments by a single digit. Because it's binary exponential back-off, the times increase by a power of 2 at each iteration. I also adjusted the sha512crypt cost a little to more closely match the bcrypt timings, even though it's not a strict doubling of each cost value.

bcrypt sha512crypt
cost time iterations time
6 0.032 5,000 0.036
7 0.045 10,000 0.047
8 0.064 20,000 0.064
9 0.114 40,000 0.105
10 0.209 80,000 0.191
11 0.384 160,000 0.368
12 0.745 320,000 0.676
13 1.451 640,000 1.346
14 2.899 1,280,000 2.696
15 5.807 2,560,000 5.347
16 11.497 5,500,000 11.322
17 22.948 11,000,000 22.546
18 45.839 22,000,000 45.252
19 1:31.95 44,000,000 1:30.14
20 3:07.27 88,000,000 3:07.52

In the Python bcrypt implementation, the default cost is "10". For most modern GNU/Linux operating systems, when storing the user password in /etc/shadow with sha512crypt (yes, I didn't come up with the name), the default cost is 5,000 iterations. In both these cases, the cost can be adjusted. In the case of the Python bcrypt module, it's just passing the function with a numerical argument. In the case of GNU/Linux, it's editing PAM by adding "rounds=" to a config file.

As such, sha512crypt can be just as slow as bcrypt. Remember, we are trying to adjust for increased computing power that password crackers will have access to. In both cases, bcrypt and sha512crypt address that requirement.

Bcrypt's additional strength
So, if sha512crypt can operate with a cost parameter similar to bcrypt, and can provide that exponential back-off that we are looking for to slow down password brute force searching, then what's the point of bcrypt? Are there any advantages to running it? It turns out, there is, and I suspect this is a consequence of the design, and not something that was intentionally added.

What we would like is to prevent password crackers from using non-PC hardware on attacking the password database. SHA-2 functions, such as SHA-512, work very well on GPUs. SHA-2 functions work well on specialized hardware such as ASICs and FPGAs. As such, while we could make things slow for CPU or GPU crackers, those password crackers with specialized hardware would still have an upper hand on attacking the password database. Further, by addressing GPU cracking, and making it intentionally slow there, we make like more difficult for CPUs, which means hurting the honest user when trying to login to your web application. In other words, if I adjusted sha512crypt for GPU crackers, such that only 1,000 passwords per second could be achievable on a GPU, that might be a full second, or more, for the user logging into your CPU server. This may or may not be desirable.

So, how does bcrypt attack this problem? Part of the algorithm requires a lookup table stored in RAM that is constantly modified during execution. This turns out to work out very well on a standard PC where the CPU has exclusive access to RAM. This turns out to work out fairly poorly on a GPU, where the cores share the on-board memory, and each core must compete on the bus for access to those registers. As a result, any additional cracking speed is greatly minimized on a GPU when compared to the CPU. In other words, GPU password cracking with bcrypt isn't entirely effective either.

For SHA-2 functions, like SHA-512 however, this is not the case. SHA-2 functions use only 32-bit logic and arithmetic operations, which GPUs excel at. By using a GPU over a CPU for our sha512crypt function, a password cracker can get a couple to many orders of magnitude of additional cracking power.

So, the reason to use bcrypt isn't because "it's slow". The reason to use bcrypt is because "it's ineffective on GPUs".

A better alternative- scrypt
Unfortunately for bcrypt, however, due to its low memory requirement, bcrypt can be implemented in a field programmable gate array (FPGA) or custom ASICs. Understand that bcrypt was designed in 1999, when such specialized hardware had low gate counts, and was few and far bewteen. 15 years later, times have drastically changed. Bitcoin ASICS with SHA-256 FPGAs on board are common place. Hardware AES is common in CPUs and embedded systems. The fact of the matter is, these FPGAs, with their onboard, and fast RAM are well suited to bring bcrypt password cracking well into "fast" territory.

An alternative would be a solution that not only requires address registers to be constantly modified during algorithm execution, but to also exponentially bloat the memory requirement for the increased cost. scrypt addresses this shortcoming in bcrypt. Scrypt is another password key derivation function that was initially designed in 2009 by Colin Percival as part of the Tarsnap online backup service.

Scrypt has all of the advantages that bcrypt provides- baked in salt, exponential cost parameter, and ineffectiveness on GPUs, while also adding an exponential RAM requiremnt per the cost. Because of this RAM requirement, it is no longer cost efficient to build FPGAs with the necessary RAM.

Security, standards, and implementations
Scrypt is only 5 years young. This gives bcrypt a good 10 year head start. In terms of security, this is preferred. We want cryptographic functions that have withstood the test of time with cryptographers constantly attacking and analyzing their functions, primitives, and implementations. The longer it remains "unbroken", the more "secure" we deem the functions to be.

Bcrypt continues to be attacked and analyzed, and is showing no serious sign of weakness, 15 years later. This is good for the security of bcrypt. Scrypt however has seen less scrutiny, mostly due to its young age. However, it has been 5 years, and like bcrypt, no serious signs of weakness have been shown. By comparison, the SHA-2 family of functions was created in 2001, and has been scrutinized much more than bcrypt and scrypt combined, and also is not showing any serious signs of weakness. So, from a security standpoint, the SHA-2, bcrypt, and scrypt functions all seem to be fairly secure.

When looking at governing body standards, NIST has no paper on bcrypt or scrypt. They do recommend using PBKDF2 (another key derivation function (which I haven't explained here, but love)) for password storage, and NIST has standardized on SHA-2 for data integrity. Personally, while I like the ideas of bcrypt and scrypt, I would recommend sticking with the NIST recommendations with high iteration counts, as shown above. Until we see more standards boy interest in bcrypt and scrypt, IMO, you are taking a risk using them for password storage (less so for bcrypt than scrypt at least).

Finally, because of the newness of scrypt, there are less tools for its use in programming languages than bcrypt, and even more so for SHA-2. Further, most programming languages don't include either bcrypt or scrypt in their "standard library" of modules or functions, while SHA-2 is more generally found. And for those implementations, some 3rd party libraries are more trust worthy than others. Because you're dealing with password storage, it's critical you get this right.

While I love the algorithms behind bcrypt and scrypt, I've always advocated for using high iterative counts on the SHA-2 or PBKDF2 functions. Even further is the advocating of teaching people how to understand the concepts behind their password entropy, and improving their own online security. That is the weakest link, and needs the most work, IMO.

So, if you ask me, I'll tell you to use either PBKDF2 or SHA-2 with high iterative counts. Or, if you must absolutely use bcrypt, then I'll recommend that you use scrypt instead.

Super Size The Strength Of Your OpenSSH Private Keys

In a previous post, about 18 months ago, I blogged about how you can increase the strength of your OpenSSH private keys by using openssl(1) to convert them to PKCS#8 format. However, as of OpenSSH verison 6.5, there is a new private key format for private keys, as well as a new key type. The new key type is ed25519. Without going into the details of the strengths of ed25519 over RSA, I do want to identify a new encryption method for your private keys.

In previous versions of OpenSSH, if you provided a passphrase to encrypt your private key, it was converted into a cipher key by first hashing it with MD5, then encrypting your private key. Unfortunately in this case, as usually the problem with all hashed password storage, MD5 is fast, fast, fast. While MD5 is a cryptographic one-way hashing function, it can fall victim to rainbow table attacks, as well as just plain old brute forcing. With a few GPUs, and the right software algorithm, it's not unheard of to try billions of passwords per second to try and achieve the correct hash, or in this case a cipher key.

Key derivation functions (KDFs) however, can be resource intensive, and slow. One in particular is bcrypt, which is very similar to the bcrypt one-way hashing function. With OpenSSH 6.5, when generating ed25519 keys, the bcrypt pbkdf is the default function for creating that cipher key based on your passphrase. To further protect you from brute force searching on your passphrase, ssh-keygen(1) will apply 16 rounds to the bcrypt pbkdf before creating the cipher key which is used to encrypt the private key on disk. On my ThinkPad T61, this takes approximately 1/3 of a second to complete all 16 rounds, or about 50 per second. This is a far cry from the millions I know my T61 can do with MD5.

However, this isn't even the bread and butter of the post: You can convert your existing keys to the new format with OpenSSH 6.5. This means your old DSA and RSA keys, and even the newer ECDSA keys, can all be converted to take advantage of the new format.

Further, you don't have to take the default 16 rounds of encrypting your key. Instead, you can increase that if you want to be a bit more paranoid. Suppose I wish to apply 100 rounds instead of the default 16- a factor of over 6x. To do this, for each of your private keys, run the following:

$ ssh-keygen -o -p -a 64 -f id_rsa
Enter old passphrase: 
Key has comment 'rsa w/o comment'
Enter new passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved with the new passphrase.

At this point, it will take approximately 2 seconds on my T61 to complete the rounds, and encrypt the key. This can be verified whet creating an SSH agent, and adding my key to the agent:

$ eval $(ssh-agent)
Agent pid 17202
$ ssh-add
Enter passphrase for /home/aaron/.ssh/id_rsa: 
Identity added: /home/aaron/.ssh/id_rsa (/home/aaron/.ssh/id_rsa)

When adding my passphrase, it takes a full 2 seconds before returning to a shell on the remote server. Of course, feel free to increase the rounds count. 1000 rounds would take me a full 20 seconds. Probably not sufficient for day-to-day use while at work, but could be applicable in other cases.

When you look at your private keys, with the old version, the header would look something like this:

Proc-Type: 4,ENCRYPTED
DEK-Info: AES-128-CBC,DF7C541751D59241F15DA424506137CE

If you converted your key to PKCS#8 with openssl(1), then your headers would look something like this:

(base64 output)

However, with the new OpenSSH key format, encrypted keys now look like:

(base64 output)

With bcrypt as the new encrypted storage format, and the ability to adjust the number of rounds, as well as convert older keys, this is a big win for security. Well done OpenSSH team!

UPDATE: It should be noted that when using this new on-disk encrypted format, your OpenSSH private key will no longer be compatible with openssl(1), as previously, the private key was stored in PEM format. Further, using the "ed25519" key type means using the new format automatically, as openssl(1) does not support the ed25519 algorithm.

Use /dev/random Instead Of /dev/null

While writing a shell script the other day, I was redirecting some output to /dev/null, as normal, when something dawned on me. Why don't I redirect my output to /dev/random instead? After all, both Linux random devices are writable by everyone on the system:

$ ls -l /dev/*random
crw-rw-rw- 1 root root 1, 8 Nov 13 15:14 /dev/random
crw-rw-rw- 1 root root 1, 9 Nov 13 15:14 /dev/urandom

Knowing what I know about the Linux cryptographic pseudorandom number generator (CSPRNG), I know that any bits put into the CSPRNG input pool are hashed with the SHA1 cryptographic hash function 512-bits at a time. This includes any data you redirect to it from the shell, as well as output from itself. When data is fed into the CSPRNG input pool, the RNG is reseeded.

To understand this concept of seeding an RNG, let's assume for a moment that the only source of input for the RNG is its own output. If this were the case, we would only need a starting value to "seed" the RNG, then let it run by hashing its own digests. In this scenario, each digest is chosen deterministically, and if we know the input that seeded the RNG, we can predict all of its future outputs.

Think of this scenario like a progress bar. For the SHA1 cryptographic hash, there are 2^160 possible unique digests. Theoretically, our RNG should be able to work through all 2^160 digests only once before starting over, provided there is enough time to do so (we know now this isn't the case, which means SHA1 has been weakened in terms of collision attacks). However, when you change the input by providing something other than the next digest in the queue, you change the next starting point of the RNG. It's as though you've "skipped" to a non-sequential location in your progress bar.

Now, consider constantly reseeding your RNG. This is what your Linux system is actually doing. It's constantly processing timing events from disk IO, network packets, keyboard presses, mouse movements, etc. All these inputs get collected into an "entropy pool", which is then hashed with SHA1 512-bits at a time, as we mentioned earlier. This input changes the sequential ordering of the digest outputs, making the result unpredictable, and non-deterministic.

So, when working on the shell, by redirecting your output to /dev/random, you reseed the CSPRNG, meaning you have changed the digest output ordering to something different than what it would have been had you not redirected those bits. In fact, the more you send data to the CSRNG, the more you reseed it, forever altering the path it takes on its "progress bar".

Now, you may ask why not have some userspace PID running in the background that is always reseeding the CSPRNG? Sure, you can. In this case, I would recommend running Haveged on your system. Haveged will probe much more hardware events on the system than the default install will, and keep the entropy pool topped off at full. The CSPRNG will be constantly reseeded. However, for shell scripts, redirecting to /dev/random instead of /dev/null works.

My only concern with redirecting to /dev/random would be bandwidth concerns. Doing a simple and crude benchmark comparing /dev/null to /dev/random, I get the following on my workstation:

$ for I in {1..5}; do dd if=CentOS-6.5-x86_64-bin-DVD1.iso of=/dev/random; done
8726528+0 records in
8726528+0 records out
4467982336 bytes (4.5 GB) copied, 81.3842 s, 54.9 MB/s
8726528+0 records in
8726528+0 records out
4467982336 bytes (4.5 GB) copied, 76.4597 s, 58.4 MB/s
8726528+0 records in
8726528+0 records out
4467982336 bytes (4.5 GB) copied, 74.6036 s, 59.9 MB/s
8726528+0 records in
8726528+0 records out
4467982336 bytes (4.5 GB) copied, 75.4946 s, 59.2 MB/s
8726528+0 records in
8726528+0 records out
4467982336 bytes (4.5 GB) copied, 74.375 s, 60.1 MB/s
$ for I in {1..5}; do dd if=CentOS-6.5-x86_64-bin-DVD1.iso of=/dev/null; done  
8726528+0 records in
8726528+0 records out
4467982336 bytes (4.5 GB) copied, 59.325 s, 75.3 MB/s
8726528+0 records in
8726528+0 records out
4467982336 bytes (4.5 GB) copied, 56.5847 s, 79.0 MB/s
8726528+0 records in
8726528+0 records out
4467982336 bytes (4.5 GB) copied, 54.4541 s, 82.1 MB/s
8726528+0 records in
8726528+0 records out
4467982336 bytes (4.5 GB) copied, 56.0187 s, 79.8 MB/s
8726528+0 records in
8726528+0 records out
4467982336 bytes (4.5 GB) copied, 57.0039 s, 78.4 MB/s

Seems I get slightly better throughput with /dev/null, which isn't surprising. So, unless you know you need the throughput, I would recommend sending your data to /dev/random over /dev/null.

The Bitmessage Proof Of Work

I've been on the Bitmessage network roughly since it was released. Maybe only a month or two later. One thing that has had me intrigued, although I've never really paid attnetion to ut until now, is Bitmessage's proof-of-work puzzle.

A proof-of-work puzzle is a puzzle your computer solves to generally gain access to some resource. Usually, the intention is to ether prevent a denial of service attack, or to prevent spamming. In the case of Hashcash, that I have blogged about many times here, uses CPU stress to find a solution to a SHA1 digest. Its intent is to fight spam. The guided tour protocol is a proof-of-work puzzle to prevent denial of service attacks on a network or to a server. It forces every client, regardless of resources, to add network latencies by making roundtrip network connections to different servers, before reporting back with the solution to the requested resource. There are other proof-of-work puzzles that can require memory consumption before being granted access. The point is, a proof-of-work puzzle shows the requested resource, server, or destination that the client has spent valuable time solving a puzzle, proving they've done the necessary work. Mining Bitcoin works on this principle.

In the case of Bitmessage, as of protocol 3, the proof-of-work puzzle is a CPU stress test that is based on the size of the message being sent. The idea is to prevent spam from hitting inboxes. Unfortunately, the main client PyBitmessage is written in Python, so its ability to solve the proof-of-work is far slower than a compiled C/C++ implementation.

The proof-of-work is defined as follows:

target = \frac{\displaystyle 2^{64}}{\displaystyle nonceTrialsPerByte \times (payloadLength+payloadLengthExtraBytes+\frac{\displaystyle TTL \times (payloadLength+payloadLengthExtraBytes)}{\displaystyle 2^{16}})}


nonceTrialsPerByte = 1000 (default difficulty set by the Bitmessage owner during key generation, and defined as '1')
payloadLengthExtraBytes = 1000 (to add some extra weight to small messages)
payload = embeddedTime + encodedObjectVersion + encodedStreamNumber + encrypted
payloadLength = the length of payload, in bytes, + 8 (to account for the nonce which we will append later)
TTL = the number of seconds in between now and the object expiresTime (default is 28 days, plus or minus five minutes)

So, by default for most Bitmessage addresses, this equation can be roughly simplified as:

target = \frac{\displaystyle 2^{64}}{\displaystyle 1000 \times (payloadLength+1000+\frac{\displaystyle 2419200 \times (payloadLength+1000)}{\displaystyle 2^{16}})}

which can be further simplified to:

target = \frac{\displaystyle 2305843009213693952}{\displaystyle 606625 \times (payloadLength + 1000)}

With a small payload, say 100 bytes in size, our target becomes very large:

\frac{\displaystyle 2305843009213693952}{\displaystyle 606625 \times (100 + 1000)} \approx 442,309,956,621

The largest the payload can be is 256 kilobytes. Thus, our target becomes much smaller:

\frac{\displaystyle 2305843009213693952}{\displaystyle 606625 \times (262144 + 1000)} \approx 1,848,953,243

So, the larger the message, the smaller the target. Further, if you increase your difficulty on a new key from 1 to 2, then the "nonceTrialsPerByte" becomes 2000, instead of 1000. This drops the target to an even smaller number. This "target" value becomes the benchmark by which the difficulty of the proof of work is defined.

Now that we have our target value, we must set a trial value, and try to find a number deterministically that becomes smaller than our target. We do this with the SHA512() cryptographic hashing function.

First, we set "trialValue = 99,999,999,999,999,999,999", or about 226 million times larger than our target value could ever be. Then, we take the SHA512(payload), and set a counter to 0 (called a "nonce" in the protocol). Now we enter the following loop:

while trialValue > target:
    nonce = nonce + 1
    resultHash = SHA512(SHA512(nonce||initialHash)), where "||" is concatenation
    trialValue = the first 8 bytes of resultHash, converted to an integer

As you can see, the larger our target is, the easier it will be to find a trial value that is smaller than the target. However, the smaller the target value becomes, the more difficult it will become to find a smaller trial value. That target decreases in value as the difficulty or the message length is increased.

Suppose my target was "385,531,657,911", and the first 8 bytes of my SHA512 digest value in hexadecimal was "0a4b6ff992e295fa". The decimal value of this digest is "741,809,681,334,441,466". In our example, this number is larger than my target, so I'll need to increment my counter by 1, and try again. In fact, the largest my 8-byte digest value in hexadecimal can be, is "00000059c37a3eb6". In otherwords, this is a dynamic Hashcash system with a double SHA512 instead of a single SHA1.

Initially, I thought this proof-of-work system was a bit strange. I couldn't understand why the core developer(s) didn't choose a simple implementation of Hashcash. If the idea is to prevent spam from hitting the network, then Hashcash in and of itself will suffice. However, by placing the core difficulty on the length of the message, you discourage large binary abuse, such as trading images, music, or movies across the network. That's what Bittorrent is for. Since the max message size is now 256 KB as of protocol version 3, even more so. But, spammers could still easily send links to nefarious sites, which would be nothing more than 100 bytes or so, and calculate the proof-of-work easily. So, while large messages might be discouraged, spam could still be a problem. This is where increasing the difficulty upon key creation is useful.

So, I like that the proof-of-work system not only has small message spam fighting built into the system, I also like the fact that it discourages large message abuse as well. It seems, at least from what I've studied at this point, that the proof-of-work system for Bitmessage is well thought through, and fairly mature. However, until the default client is written in C/C++, I fear that the proof-of-work is targeted primarily at Python implementations, which are 30-50x slower than their C/C++ counterparts. So, we may still see message abuse on the network until this is addressed.

Using The Bitmessage Storage Service

While hanging out on the "privacy" channel on Bitmessage, someone sent the following:

"You have no files saved. For instructions please send a message to BM-2cUqBbiJhTCQsTeocfocNP5WCRcH28saPU with the subject 'help'."

This is actually pretty cool. No doubt should you call into question a faceless storage provider, but I thought I would play around with it. When you send 'help' to the address, you get the following information:

"You send commands to the service via messages, the message subject is used to tell the server what you want to do. You are authenticated via the address you send from, so be sure to keep it safe. Below are the possible commands:

LIST - Lists all the files you currently have saved to your account.
NEW [file_title] - Save a new file to your account. e.g "NEW my top secret file", put the contents of your file into the message body.
UPDATE [file_id] [new_file_name (optional) ] - Update an existing file saved to your account. The new_file_name is optional and is only used to rename your file. e.g "UPDATE a567f My even more top secret file"
REMOVE [file_id] - Removes a file saved to your account. e.g "REMOVE a567f".

When sending the LIST command, type some random blurb into the message. If you send multiple LIST commands to the server in a short period of time the Bitmessage client see's your requests as duplicate messages and ignores them. The random blurb ensures the server always hears you."

Intrigued, I started thinking about this. Due to the limitation of BM sending up to 256KB messages, your file can be no bigger than 256K, unless you setup a striped RAID array. Then you can chop up the file into multiple messages. However, the messages will likely be decrypted at the other end, and will very likely be stored unencrypted. So, they need to be encrypted, then converted into base64 before sending to the address.

Initially, I figured GnuPG would work for this. But then I thought that it's not a good fit, because I lose plausible deniability. So, instead, I'll use the 'dm-crypt' module with the cryptsetup(8).

First, I need my block devices. I'll setup some files with dd(1), then add them to loopback devices. Notice that I'm building the files from /dev/urandom. This is critical, so no metadata, including encryption boundaries, is leaked:

$ dd if=/dev/urandom of=~/file1 bs=256k count=1
$ dd if=/dev/urandom of=~/file2 bs=256k count=1
$ ls -l file?
-rw-rw-r-- 1 user user 262144 Nov 29 07:57 file1
-rw-rw-r-- 1 user user 262144 Nov 29 07:57 file2
$ sudo losetup /dev/loop1 ~/file1
$ sudo losetup /dev/loop2 ~/file2
$ losetup -l
/dev/loop1         0      0         0  0 /home/user/file1
/dev/loop2         0      0         0  0 /home/user/file2

Because I want plausible deniability, I'll use cryptsetup(8) first, _then_ create the RAID array. If I create the RAID array first, the two files will reveal information that they belong to an array. I don't want any metadata leaked at all.

$ sudo cryptsetup create aes-crypt-1 /dev/loop1
$ sudo cryptsetup create aes-crypt-2 /dev/loop2
$ ls -l /dev/mapper/aes-crypt-?
lrwxrwxrwx 1 root root 7 Nov 29 07:47 /dev/mapper/aes-crypt-1 -> ../dm-0
lrwxrwxrwx 1 root root 7 Nov 29 07:47 /dev/mapper/aes-crypt-2 -> ../dm-1

I can now create the RAID array, format it wxth ext2, and mount it. A couple notes: I'll first want to set the RAID chunk size to something low, otherwise I won't be able to put down a filesystem on the block device. So, I chose the minimum according to mdadm(8), which is 4KB. Then, when formatting with ext2, I'll only get a total of 64 inodes, which means a total of 64 files. I'm going to increase this to 256, so I can intentionally fragment the filesystem before putting down the data. I'll explain the reason for the intentional fragmentation in a second.

$ sudo mdadm --create /dev/md0 --level 0 --raid-devices 4 --chunk 4 /dev/mapper/aes-crypt-{1,2}
$ sudo mdadm --detail /dev/md0
        Version : 1.2
  Creation Time : Sat Nov 29 07:47:19 2014
     Raid Level : raid0
     Array Size : 464
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

    Update Time : Sat Nov 29 07:47:19 2014
          State : clean 
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

     Chunk Size : 4K

           Name : example:0  (local to host example)
           UUID : 2bca1bf9:3af4a5d1:1989bb34:9b46bb9c
         Events : 0

    Number   Major   Minor   RaidDevice State
       0     253        0        0      active sync   /dev/dm-0
       1     253        1        1      active sync   /dev/dm-1

Now the formatting. Notice I'm changing the number of inodes. Also, we don't need to set aside any space for the root user: It's occupying precious disk space.

$ sudo mkfs.ext2 -N 256 -m 0 /dev/md0
$ sudo mount /dev/md0 /mnt
$ ls /mnt
$ df -h /mnt
Filesystem      Size  Used Avail Use% Mounted on
/dev/md0        427K   15K  412K   4% /mnt
$ df -i /mnt
Filesystem     Inodes IUsed IFree IUse% Mounted on
/dev/md0          256    11   245    5% /mnt

Now before putting data into the filesystem, I'm going to fill it with small files of random data, then remove every n-th file to create enough space to put down my data. This fill force my data to be intentionally fragmented. The reason for this is to avoid a snapshot attack on the files I'll be storing remotely. If I did not fragment the data, then as I update the filesystem, only small incremental changes will take place. This will allow the attack to "subtract" a previous filesystem iteration from the current, allowing them to know where my stored data resides, as well as possibly figuring out what it contains. Because we're talking about a 512 KB filesystem here, even encrypted, disk I/O isn't a concern.

First, how big can each file be at a maximum? It appears to be 1721 bytes.

$ echo '(412*1024)/245' | bc -l # 245 available inodes, 412K available space

Next, create the files, each with 1721 bytes:

$ for i in {1..245}; do sudo dd if=/dev/urandom of=/mnt/file$i bs=1721 count=1 2> /dev/null; done

Unfortunately, the filesystem filled up before completion. As such, we have empty files. So, we'll find those and remove them:

$ find /mnt -empty -type f | wc -l
$ sudo find /mnt -empty -type f -delete
$ ls /mnt/file* | wc -l

Now, I'm ready to fragment the filesystem. I know that one file I want to copy is 8820 bytes in size. So I need to free up 6 non-contiguous files. according to my math, I need to free up every 34th file:

$ echo '8820/1721' | bc -l
$ echo '204/6' | bc -l
$ sudo rm /mnt/file{$((34*1)),$((34*2)),$((34*3)),$((34*4)),$((34*5)),$((34*6))}
$ df -h /mnt
Filesystem      Size  Used Avail Use% Mounted on
/dev/md0        427K  415K   12K  98% /mnt

I'm now ready to copy in my 8820-byte file:

$ sudo cp ~/secret-file.txt /mnt
$ df -h /mnt                      
Filesystem      Size  Used Avail Use% Mounted on
/dev/md0        427K  424K  3.0K 100% /mnt

Now I can tear everything down:

$ sudo umount /mnt
$ sudo mdadm --stop /dev/md0
$ sudo cryptsetup close aes-crypt-1
$ sudo cryptsetup close aes-crypt-2
$ sudo losetup -d /dev/loop1
$ sudo losetup -d /dev/loop2

Now I need to convert my two files to base64, copy, paste, and send the resulting output to BM-2cUqBbiJhTCQsTeocfocNP5WCRcH28saPU. The subject of the first message would be "NEW file1", while the subject of the second message would be "NEW file2". The body of each message would be the base64 output for file1 and file2.

$ base64 file1 | tr -d '\n'; echo
yAFtXPy7NgHl5Q0ueJiZgjOlYyrocWaxcvA6CjKF0rNd10sTfNvMCmQrL0cA79oO0 ...(snip) ...

$ base64 file2 | tr -d '\n'; echo
T3GTNvwiewXnnOHTjITNpukyLz4d8iv8wl/JP0YjY0v5s5euF1qv4WwMv9Ejl9AsNMm5NXoK/hFQK ...(snip)...

Of course, sending 2 messages with 256 KB in size will take some time to calculate the PoW, so be prepared for that (I don't know why the PoW is what it is. It should have been a double SHA256() like Bitcoin. Meh)

When you want to update your files, you wll need to build everything backup, except you won't "create" the RAID array, you'll "assemble" it, and you won't format your filesystem, obviously:

$ sudo losetup /dev/loop1 file1
$ sudo losetup /dev/loop2 file2
$ sudo cryptsetup create aes-crypt-1 /dev/loop1
$ sudo cryptsetup create aes-crypt-2 /dev/loop2
$ sudo mdadm /dev/md0 --assemble /dev/mapper/aes-crypt-1 /dev/mapper/aes-crypt-2
$ sudo mount /dev/md0 /mnt

You'll also need to move your file off the filesystem and rewrite all the random files, before copying the new data on. This is important, because we're trying to prevent a snapshot attack against our encrypted filesystem. So, on every commit to the BM storage service, as best as we can control, every bit on the filesystem needs to be changing.

As the help mentions, sending frequent "LIST" commands can cause your message to be ignored, unless the body of the message includes unique random data for each message sent. So, the "TOP TIP" of sending random data in the body of the message is a good idea. As such, I'll run the following for each "LIST", practically guaranteeing that I'll see a unique string, and the query will succeed:

$ dd if=/dev/urandom bs=2k count=1 2> /dev/null | rhash --sha3-512 -
4b5b19c9a8a61cf724cd710c6fd0c54edb46de2bfa55f2ec9a179a590b808993593d333f95dd6c607e9d366c385037cc0a600d262898e3b4c5be26479d3c962c  (stdin)

I'll copy and paste the result into the message body before sending.

I don't know that I'm ready to store sensitive data with it, but I do think it's pretty cool. The 256KB limit for the file size is pretty damning, so this isn't something to use for intense media storage, but instead, mostly for documents, such as password storage, browsing bookmarks, student grades, etc. It works, although it's definitely a niche product for a niche group. I wouldn't expect my wife to use this.

Where Cryptographic Hashing Algorithms Fail

What Is A Cryptographic Hashing Algorithm?
Cryptographic hashing algorithms are one-way functions that produce a message digest that represents a given input. Because the keyspace is so astromically large, it should be practically infeasible to find a different input that represents the same digest. The input is typically referred to as the message while the output is typically referred to as the digest.

Cryptographic hashes usually hold to four main principles:

  • Computing a digest should be easy for any given message (some would also say it should be fast).
  • When changing the message, the digest should exhibit the "avalanche effect".
  • It should be practically infeasible to find two messages that produce the same digest.
  • It should be practically infeasible to produce the message for a given digest.

Of all of the many cryptographic hashing algorithms in the world today, the following list are still considered to be secure:

  • RIPEMD-160, RIPEMD-256, RIPEMD-320
  • SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224, SHA-512/256
  • SHA3-224, SHA3-256, SHA3-384, SHA3-512, SHAKE128, SHAKE256
  • Skein-256, Skein-512
  • Whirlpool

SHA3 is a new standard adopted by the United States National Institude of Standards and Technology (NIST) which uses Keccak as the underlying function. Keccak uses the sponge contruction for creating its digests, which has some interesting properties that we'll look at in a second.

The Workhorse Of Cryptography
Many call cryptographic hashing algorithms the workhorse of cryptography, and rightfully so. These hashing functions are used for a great deal of things:

  • Password storage- Hash the password, and store to disk.
  • Pseudo-anonymous user- Hash the username, and pseudonymize.
  • Data deduplication- Hash the data blocks, and compare.
  • Message authentication- Hash the message, and send.

Cryptographic hashes are used in a lot of ways. PGP fingerprints are nothing more than SHA-1 hashes of a timestamp. Similarly, OpenSSH fingerprints are MD5 results. ZFS uses SHA-256 for data and metadata integrity in its filesystem. Bitcoin uses double SHA-256 functions for the block chain, transactions, and mining. Cryptographic hashing algorithms are used EVERYWHERE. Unfortunately, developers keep reinventing the wheel, and don't pay attention to the problems that come with cryptographic hashing algorithms.

Failure #1- Raw Speed
While certainly an advantage, the speed at which cryptographic hashing algorithms creates the digest can also be a big drawback. Consider storing passwords to disk. Hopefully by now, developers know not to store the passwords in plaintext in the database. If not, get a new job. So, if we hash the password with say SHA-512, then according to our principles discussed at the beginning of this post, it should be infeasible to find the password (message) that produced the hash stored on disk.

If this hashed password database is leaked, the password cracker will be attempting to hash as many messages as possible to retrieve the given digest. Typically, this is done using custom word lists, hashing each word, and comparing it to what is stored in the password database. The faster the password cracker can go, the sooner they can recover the digests. On my wimpy laptop, using the CPU alone, I can hash approximately 1 million passwords per second with SHA-512. Using cheap GPU clusters, it's feasible to go upwards of 50 billion passwords per second, for basement-type miscreants living in their mother's basement.

Possible Speed Fix #1- Key Streching
One way around brute forcing passwords is a concept called key stretching. This is the concept where I would take the hash of the hash of the message. In other words, the message is recursively hashed. So, if my password is hashed 5,000 times, then it would take the password cracker 5,000 times more work to get to the original message. So, my wimpy laptop is reduced to going through only 200 passwords per second, instead of 1 million. Anything to slow the attackers down, is a good thing.

A password key stretched twice would look like this, where "H" is the hashing algorithm:


For example, taking the SHA-256 of "password" twice gives the following hexadecimal output:

SHA-256(SHA-256("password")) = 73641c99f7719f57d8f4beb11a303afcd190243a51ced8782ca6d3dbe014d146

Possible Speed Fix #2- Key-based Derivation Functions
Another alternative, and there is a good following for this on the Internet, is using key-based derivation functions. Without getting into too much detail here (I can save this for another post), key-based derivation functions like PBKDF2, bcrypt(), and scrypt() are slow, slow, slow, and very resource intensive. As such, this greatly limits the speed in which the password cracker can recover my password, much like key stretching previously mentioned.

Failure #2- Hashed Messages
Unfortunately, there's this thing called rainbow tables. Rainbow tables simply put, are large databases with message to digest mappings. Rainbow tables are built by hashing messages of all types, and storing both the message and the digest in the database. Then, all it takes is to look up the digest in the database. If it exists, you have the message that produced it. In this case, you have the password.

Thankfully, rainbow tables are usually bloated in size, typically several terabytes. The standard password cracker isn't going to dedicate 10 TB of disk just to store rainbow tables for MD5, SHA-1, and NTLM hashes. However, due to the one-to-one relationship (in practice) of messages to digests, rainbow tables are a very effective way to recover messages from digests only.

Possible Message Fix- Salting Hashed Messages
The number one way to thwart rainbow table attacks, is to prepend a cryptographic nonce, also called a "salt" to the password. The larger the space of the salt, the more possibilities the password could be when stored to disk. For example, if the salt is chosen from a base64 set of characters, and the salt is 8 characters long, then the salt's keyspace is 64^8 or 281,474,976,710,656 possible salts. This means that the rainbow table must store 281,474,976,710,656 digests for every message. As mentioned, rainbow tables are already several TB in size. Now they must be 281 trillion times larger.

A salted message looks like this, where "H" is the hashing algorithm, and "||" indicates concatenation:

H(salt || message)

For example, if my salt was "FhuVTwA710", and the password is "password", then taking the SHA-256 would give the following output:

SHA-256("FhuVTwA710" || "password") = 4c700717f3e5eb92a872362108f2f716d2ff179ea94d1f10853a50e181a43663

Failure #3- Salted Message Authentication Codes
Message authentication codes, or MACs for short, are a way to send a message between two parties that have previously authenticated. The idea is to send a message from Alice to Bob, such that when Alice's message reaches Bob, he knows whether or not if the message has been tampered with, based on if the hash of the message produces the same hash as what Alice sent. Alice and Bob must have met previously, and agreed on a key that they would use as a "salt" for each message they send.

In other words, their MAC setup looks like this, again where "H" is the hashing algorithm, "key" is the key both Alice and Bob agreed on, and "message" is what is being sent:

H(key || message)

For example, suppose the message was "The quick brown dog jumps over the lazy dog.", and the agreed-upon key is "J6n^,1=/#;RCA5bC10". Then our SHA-256 MAC would look like this:

SHA-256("J6n^,1=/#;RCA5bC10" || "The quick brown dog jumps over the lazy dog.") = 10c98713fbaa574ad9b2ffc053662c89f6fd387a8a350f6324b966d36936d1d3
MAC = ("The quick brown dog jumps over the lazy dog.", "10c98713fbaa574ad9b2ffc053662c89f6fd387a8a350f6324b966d36936d1d3")

Alice would send the full MAC (both the message and its digest). The idea is if the message changes, the digest will change, and as such, Bob will be able to detect tampering. Eve, a malicious 3rd party, should not be able to send fraudulent message to Bob, because Eve does not know the key Alice and Bob agreed on.

However, SHA-256 uses the Merkle–Damgård construction. Merkle–Damgård is iterative in nature- IE: block-by-block. So, while Eve may not know the private key, she was able to grab both the message and its digest in transit. So, all Eve needs to do is concatenate additional data after the message, know the length of the key, calculate the new digest, and ship this message to Bob. This is known as the length-extension attack.

In other words, if you know what the digest of H(key || message) is, then you can easily determine what H(key || message || ANYTHING) is. So to compromise the MAC system, you just need to see one message and digest MAC, then you can impersonate the sender from there on out. This should bother you.

Possible MAC Fix #1- The Secret Suffix
At this point, you may be thinking to switch the order of the key and the message, or H(message || key). This is known as the "secret suffix", and it is certainly a very valid fix to address the length-extension attack. However, it comes with a large assumption that you are making about the underlying cryptographic hashing algorithm.

That assumption is that the algorithm does not have any known digest collisions. In other words, if two messages m1 and m2 can be found to produce the same digest, then the attacker will be able to use this collision on the sent message, to use a different start message, putting the hashing algorithm state in the same starting position, and thus producing the same MAC, even though the key is still unknown.

Currently, both SHA-1 and SHA-2 do not have any known collisions, but it's only a matter of time.

Possible MAC Fix #2- The Enveloped MAC
Another possible solution is to envelope the message around the key, or H(key || message || key). This requires that the attacker know the key length, in order to identify the starting point of the message, and thus be able to forge a valid MAC. While more secure than the secret suffix, there has been some research in this area that suggests weaknesses to this approach, even when two different keys are used.

Possible MAC Fix #3- SHA-224, SHA-384, or SHA-3 (Keccak)
An actual solid fix is to use SHA-3, which is based on Keccak, for your hashing algorithm. SHA-3 is not vulnerable to the length-extension attack, and as such, can be used for a MAC. This is due to the fact that SHA-3 is based on the sponge construction, which is not an iterative block-by-block compression function like Merkle–Damgård is. With the existence of SHA-3, we may not need to worry about the next section for sending MACs.

Also, SHA-224 and SHA-384 are not vulnerable to the length-extension attack, due to the internal truncation of the internal state, whereas MD5, SHA-1, SHA-256, and SHA-512, among others, output the entire state.

Solution- The Hash-based Message Authentication Code (HMAC)
HMAC securely addresses the MAC length-extension attack. It does so by taking the hash function twice in a deterministic manner, first with an inner key, and again with an outer key. The algorithm is shown here:

  1. Prepare the key:
    1. If the key is less than the hashing algorithm block size, then append zeros to the key until it is the same size as the hashing algorithm block size.
    2. If the key is greater than the hashing algorithm block size, then take the hash of the key, using the digest as the new key.
  2. Prepare the ipad and the opad:
    1. To create the ipad, repeat the byte 0x36 the hashing algorithm block size times.
    2. To create the opad, repeat the byte 0x5C the hashing algorithm block size times.
  3. Take the XOR of the key and the ipad. Call this the ipad_key.
  4. Take the XOR of the key and the opad. Call this the opad_key.
  5. Take H(ipad_key || message). Call this digest1.
  6. Take H(opad_key || digest1). This is your MAC.

HMAC has the flexibility that ANY hashing function can work that is not vulnerable to preimage attacks. This includes MD5 and SHA1, as well as the other functions listed at the beginning of this post. The security of HMAC rests on the fact that Eve can take as many MAC messages and digets as she wants, but she'll never be able to determine the digest of the a message she hasn't seen yet.

In reality, HMAC can be used to store passwords. It can be used as data integrity. It can be used to pseudonymize usernames. In fact, a good rule of thumb on when to use HMAC and when not to, is asking yourself the following question: Is what I am storing or sending need to be secret? If so, HMAC.

Cryptographically Secure Pseudorandom Locally Administered Unicast MAC Addresses

Recently, Apple released the ability for iPhone 5c and newer hardware to create a spoofed software MAC address for 2.4 GHz and 5 GHz wireless access points. The MAC address is locally administered, and a unicast address. This has sparked a small discussion in various forums about how to generate valid locally administered unicast MAC addresses. It is necessary that the MAC address is unicast, as many Cisco switches and routers will block non-unicast addresses. It's not necessary that the address be locally administered, however. This just takes the address out of the globally administered range, and the possibility of conflict with other devices connected to the same switch.

According to the IEEE, in order to create a unicast address, the least significant bit of the most significant byte must be 0. This means the most significant byte must be an even number, eliminating half of the 256 valid possibilities for the first byte. This leaves us with only 128 numbers. In order to be locally administered, the second least significant bit in the most significant byte of the address must be 1. This eliminates another 64 addresses from the total space for this byte. As such, there are only 64 possible values this address can start with. The rest of the bytes can be as random as you wish, giving you a total space of 70,368,744,177,664 addresses to choose from.

In the Unix shell, you can execute the following code, which should be fairly platform agnostic:

$ random_mac() {printf '%02x' $((0x$(od /dev/urandom -N1 -t x1 -An | cut -c 2-) & 0xFE | 0x02)); od /dev/urandom -N5 -t x1 -An | sed 's/ /:/g'}
$ random_mac

Here are 16 valid locally administered unicast MAC addresses generated from the shell:

$ for i in {1..16}; do random_mac; done

If you wanted to have a random mac address assigned to your wireless NIC every time you brought up your network interfaces on Debian or Ubuntu, you could write the following shell script, and place it in the "/etc/network/if-pre-up.d/" directory:

LLADDR=$(printf '%02x' $((0x$(od /dev/urandom -N1 -t x1 -An | cut -c 2-) & 0xFE | 0x02)); od /dev/urandom -N5 -t x1 -An | sed 's/ /:/g')
ip link set address $LLADDR wlan0

Make sure it's executable:

$ sudo chmod a+x /etc/network/if-pre-up.d/

Playing Card Ciphers

For the past couple of weeks, I've been focused heavily on hand ciphers for field agents. Although I'm certainly no expert on cryptography, aside from the One Time Pad (OTP), I've had a hard time finding any hand cipher that would be considered secure in the computer age. It's certainly no doubt that field agents are very likely using computing with SSL and GPG, among other crypto tools, to communicate with each other. The romantic days of "Spy versus Spy" encrypting and decrypting notes by hand, doing dead drops in a tree stump in a park, and speaking 15 foreign languages fluently, are probably over. Regardless, I could not bring myself to believe that there were absolutely no secure hand ciphers to study.

I had already been familiar with Bruce Schneier's "Solitaire" card cipher, and always considered that a fairly creative use for a hand cipher. Despite its bias, it's still very secure, although slow to execute by hand, and very error prone. But this got me thinking- has anyone else attempted to create hand ciphers with playing cards, and if so, how far have they taken it?

After searching the Internet, I found five additional playing card ciphers, and implemented one mechanical cipher into a playing card cipher, Each comes with their own varying levels of security. While I was at it, I created my own playing card cipher, and I'm still currently evaluating its security. That brings to total list to eight playing card ciphers:

As you can see, I've spent a great deal of time learning each of the algorithms, and have typed them up by hand on my own personal wiki. Quadibloc is the only card cipher, at the time of this writing, that I am still learning and working on. I'm hoping this can be a centralized repository for all things playing card ciphers. My goals for this personal project are:

  • Publish software implementations of each card cipher.
  • Publish online videos giving a tutorial of each card cipher
  • Learn the strengths and weaknesses of each card cipher.

The big advantage of using playing cards, in my opinion, is the ability for the deck to maintain state while working through the algorithm. A standard deck of cards can have a maximum key space of 52! which is about 238-bits of entropy. This is likely larger than many SSL keys on the Internet protecting your bank login. So, provided the algorithm is non-linear, mixes the deck thoroughly on each round, and is not biased, it is possible that the algorithm could resist attacks from even the most well funded adversaries.

There is still work to be done, and I doubt this will be of any value to the general cryptographic community. After all, the OTP can be mathematically proven to be unbreakable, and computer algorithms are fast and error-free. So, who in their right mind would want to learn hand ciphers with playing cards, when they won't have a mathematical proof of unbreakability, they're slow, and error-prone?



I've been obsessing over the past couple weeks trying to improve Bruce Schneier's solitaire cipher, aka "Pontifex". The more I think about it, the more I realize that there just isn't a lot that can be done about the bias of Pontifex without severely slowing down the already slow algorithm. So, instead of trying to improve on his algorithm, I came up with my own - "Talon".

This cipher discards the two jokers. You only need the standard 52-cards from a poker or bridge set (4 suits, Ace through King). As with Pontifex, this is a output feedback mode stream cipher. Also, when determining the value of the output card, the same suit order is used:

  • Clubs - face value + 0
  • Diamonds - face value + 13
  • Hearts - face value + 26
  • Spades - face value + 39

An unkeyed deck would have Ace through King of Clubs, followed by Ace through King of Diamonds, followed by Ace through King of Hearts, followed by Ace through King of Spades.

This algorithm is executed in 4 steps. You will be making four discard piles, or "talons", labeled 1, 2, 3, & 4 from left to right:

  1. Create four discard piles. With the deck face-up in your hand, place the top card in discard pile #1, the 2nd card in discard pile #2, the 3rd card is discard pile #3, and the 4th card in discard pile #4. For example, if the deck was unkeyed, then the Ace of Clubs would be in discard pile #1, the Two of Clubs in #2, the Three of Clubs in #3, and the Four of Clubs in #4.
  2. Note the face value of discard pile #1, ignoring suit, and count that many cards minus 1 from the top of the deck, and place them on top of discard pile #1. If the card was a Jack, then count 10 cards from the face up deck in your hard, and place them on top of the Jack. Do the same for the other three piles, in order (#2, then #3, then #4). In other words, the first card placed down in the discard pile, or "talon", will determine the total number of cards in that stack.
  3. Collect the piles by placing discard pile #1 on top of pile #2 on top of pile #3 on top of pile #4, and place the stack behind the face up deck in your hand. If all 52 cards were in discard piles (13 cards in each pile), then place the newly collected stack in your hand, face up.
  4. Find the output card by summing the deck value of the top card, with the deck value of the bottom card, modulo 52. Count down that many cards into the deck, and record the value of the next card. If the top card is a Queen of Hearts, then the value would be 12 + 26 = 38. And if the bottom card is the 3 of Diamonds, then the value would be 3 + 13 = 19. (38 + 19) mod 52 = 5. Count 5 cards from the top of the deck, and record the face value of the 6th card, including suit. This ensures that each card is equally likely to be the output card as the rest in the deck. To make sure you are doing the counting correctly, if the sum value mod 52 is 0, you would record the value of the top card. If the sum value mod 52 is 51, you would record the value of the bottom card.

The key lies in the initial order of the deck, as with other designs. It can be keyed with random shuffling, bridge puzzles, or passphrases. If a passphrase is used, step 4 is replaced:

  1. Pass cut. Get the numerical value of the first character in your passphrase. As with the original step #4, count that many cards minus 1 from the top of the deck, and cut them below the rest of the cards. A = 1, B = 2, ..., Y = 25, Z = 26. This step will break the reversibility of Talon, while keying the deck.


Suppose we start with an unkeyed deck. Our top card will be the Ace of Clubs, with the deck face up in our hands, and the bottom card will be the King of Spades.

After step 1, we would have the following 4 discard piles:

#1  #2  #3  #4
AC  1C  2C  3C

After step 2, our discard piles would look like:

#1  #2  #3  #4
AC  2C  3C  4C  <-- Bottom of discard piles
    5C  6C  8C
        7C  9C

Remaining in my hand would be:

After step 3, the order of my hand would now be:


For step 4, the Jack of Clubs is the top card. Thus, its numerical value is 11 + 0 = 11. Counting 11 cards gives me the Nine of Diamonds as my output card. I would write down 22 as my output number (9 + 13 = 22).

After another round, my deck would be ordered as:


Because my top card is the Nine of Spades, it's numerical value is 9 + 39 = 45. Counting 45 cards gives me the Four of Spades as my output card. I would write down 43 as my output number (4 + 39 = 43).

Talon is reversible, does not need an IV like Mirdek, and less error-prone than Pontifex. It greatly reduces the chance of a bias, by fast mixing the internal state through the discard piles with 4 cuts in total in 1 round. Unfortunately, according to Bruce Schneier, the chances of this cipher being secure and efficient are negligible.

I see about two new cipher designs from amateur cryptographers every week. The odds of any of these ciphers being secure are slim. The odds of any of them being both secure and efficient are negligible. The odds of any of them being worth actual money are virtually non-existent.

Currently, I do not have an implementation in Python or C. I also do not know the length of the internal PRNG. My testing shows it is sufficient for small messages, such as the length of a tweet, which is practical for field operation. My testing also shows that there is no internal bias in the system. One thing I did catch during my analysis, is that the sum of the discard piles follows the Poisson Distribution. I'm not sure how this will affect the security of Talon, if any.

The Outrageous Fares Of The Utah Transit Authority

I hope my readers don't mind, but I'm going to break from the standard geekery for a second, and type up a personal post about something that has been bugging me for the past few years.

For those that don't know, I live in Utah, and work in Salt Lake City for the best local ISP on the planet. I also teach a Linux certification course at the University of Utah 1 night per week as a faculty adjunct instructor. As a benefit of teaching at the University, I get a monthly Utah Transit Authority (UTA) premium transit pass that allows me to ride FrontRunner, TRAX, and standard bus services, without paying out of my personal pocket.

I'll spare you my personal philosophy that all public transit should be paid for in gas taxes at the pump. However, according to a report by KSL, the UTA is one of the most expensive transit systems in the United States. The Salt Lake Tribune also ran a similar story. Just in case the link goes dead, here is a paragraph that I found very telling:

Of 169 U.S. transit agencies that participated in the APTA survey, only five will now have higher base fares for buses than UTA. They are in New York City, $2.75; San Francisco, $4; Nashville, $4; Sugar Land, Texas, $3.50; and Eden Prairie, Minn., $3.

Another four systems match the $2.50 to be charged for buses by UTA. They are in Atlanta, Pittsburgh, Portland and Sacramento. The 2012 APTA survey said the median base fare for buses among U.S. transit agencies was $1.50 per trip.

UTA's TRAX also will have among the highest fares nationally for light rail.

No agency included in the 2012 APTA survey now has fares higher than the $2.50 to be charged by UTA — but four match it. They are in Sacramento, San Diego, Pittsburgh and Dallas. Of 21 light-rail agencies surveyed by APTA in 2012, the median light rail base fare was $2 a trip.

That means among transit systems nationally that have lower bus or train fares than UTA are such places as Chicago, Los Angeles, Phoenix, Denver, Philadelphia, Cleveland and Dallas.

Now, I take advantage of my university pass on the UTA, when it's valid. However, it's only valid during trimesters. Between trimesters, because I am adjunct faculty, my card is deactivated on the last pay period of the trimester, and activated on the first pay period of the trimester. So, there is about a full month where I cannot use my university pass, and must pay full fare ride transit.

However, the UTA is offering a 20% discount if you use their tracking FAREPAY card. It's a pre-paid card that you can use as often as you need, provided a balance remains on the card to pay the fare. Because I am currently using the FAREPAY system for about 3 months of the year, I was curious how I could optimize my costs between trimesters. Currently, I drive to a FrontRunner station (a diesel commuter train) which takes me to a TRAX stop (a light rail train). The stop is in the "free fare" zone, so there is no charge to take TRAX to the stop near my office.

So, is it cheaper in both gas and UTA fares to drive to the closest FrontRunner station? Or should I drive a station or two away? Where will I get the cheapest per month cost for getting into work from home? I only live 30.7 miles away from the office by the most direct route.

Turns out, it doesn't involve the UTA at all. Not only is driving directly into the office considerably faster (45 minutes one way commute versus 65 minutes on transit), but it's cheaper. Considerably cheaper.

Here is a table showing my work. There are a total of 6 FrontRunner stations between my home and my office. In order, they are Clearfield, Layton, Farmington, Woods Cross, North Temple, and Salt Lake Central. I work a standard 9-5 M-F day job, and once per week, I must take TRAX up to the university.

First, I have a fuel efficient car. I've averaged about 32-35 miles to the gallon with mixed city and freeway driving from home to work and back. I've actually logged 45 mpg many times, but I'm usually doing almost straight freeway driving with hypermiling strategies, and it isn't that difficult to get 40 mpg in my car. Regardless, I'll use 32 miles per gallon as the basis for my commuter costs. With an average Utah state gas rate of $3.61 per gallon, this puts my monthly costs for gasoline at $150 per month to commute round trip to work from home. Looking over my payment history with my bank, this value is almost exactly spot-on.

Here are the distances from my house to each of the FrontRunner stations:

To Station One Way Gallons PPG Gas Round Trip
Clearfield 4.2 mi 0.13125 $3.61 $0.47 $0.95
Layton 6.6 mi 0.20625 $3.61 $0.74 $1.49
Farmington 13.4 mi 0.41875 $3.61 $1.51 $3.02
Woods Cross 20.6 mi 0.64375 $3.61 $2.32 $4.65
North Temple 29 mi 0.90625 $3.61 $3.27 $6.54
Salt Lake 30.3 mi 0.946875 $3.61 $3.42 $6.84

Looking strictly at UTA fares, here is what it would cost me to get into the office from each FrontRunner station:

From Station Fare 80% Fare Round Trip 80% Round Trip
Clearfield $4.30 $3.50 $8.60 $7.00
Layton $3.70 $3.00 $7.40 $6.00
Farmington $3.10 $2.50 $6.20 $5.00
Woods Cross $2.50 $2.00 $5.00 $4.00
North Temple $2.50 $2.00 $5.00 $4.00
Salt Lake $0.00 $0.00 $0.00 $0.00

So, combining gas prices and UTA fares, here would be the total cost per stop to get to my office:

From Station One-way 80% One-way Round Trip 80% Round Trip
Clearfield $4.77 $3.97 $9.55 $7.95
Layton $4.44 $3.74 $8.89 $7.49
Farmington $4.61 $4.01 $9.22 $8.02
Woods Cross $4.82 $4.32 $9.65 $8.65
North Temple $5.77 $5.27 $11.54 $10.54
Salt Lake $3.42 $3.42 $6.84 $6.84

It's clearly cheaper to drive all the way into the Salt Lake Central station, and take TRAX to the office, as that rail fare is in the "Free Zone", and no extra cost to me. However, if I don't want to drive all the way into Salt Lake, then driving to the Layton station 6.6 miles away is the most optimal.

I can't forget that once per week, I need to take TRAX up to the university to teach. This is a $5 round trip ticket, which is good for all destinations on TRAX, all day long, or $4 with FAREPAY. Including that into my final cost, what would my gas and UTA fare costs look like in 5-day work week? Per month? Per year? (Note: the "80%" headers show the cost of gas plus the 20% FAREPAY discount. It is not 80% of each total.)

From Station Weekly 80% Weekly Yearly 80% Yearly Monthly 80% Monthly
Clearfield $52.74 $43.74 $2,742.38 $2,274.38 $228.53 $189.53
Layton $49.45 $41.45 $2,571.17 $2,155.17 $214.26 $179.60
Farmington $51.12 $44.12 $2,658.08 $2,294.08 $221.51 $191.17
Woods Cross $53.24 $47.24 $2,768.45 $2,456.45 $230.70 $204.70
North Temple $62.72 $56.72 $3,261.21 $2,949.21 $271.77 $245.77
Salt Lake $39.18 $38.18 $2,037.47 $1,985.47 $169.79 $165.46

If I drove to the Salt Lake Central station 5 times per week, and paid for a round trip TRAX ticket to the university once per week, I'm looking at about $170 per month, or $165 with FAREPAY. If I take FrontRunner from my most cost-efficient station, it's $214 per month, or $180 per month with FAREPAY. From the closest station, I'm looking at $230 per month or $190 per month with FAREPAY. UTA charges $198 per month for a premium pass (which is required to use FrontRunner and express buses), which doesn't include my gas costs driving to each station.

$214 is over a 60% increase over commuting to work in my car (the FAREPAY discounts ends Dec 31, 2014), and $198 is almost a 50% increase. I will spend an extra $700 per year to the UTA for using their services. Over 10 years, that's almost half the cost of a base model Toyota Corolla, brand new. So, when purchasing a new car, if there are plans on using the UTA, realize that the cost of your new car went from $15,000 to $22,000.

As a frequent rider of the UTA system, watching people use their passes, gauging who is riding by dress, demeanor, and conversation, I would dare say that 80-85% of ridership is only riding FrontRunner and TRAX, because they have an education pass or their pass is paid for or reimbursed by their employer. This means that most people are riding the UTA, because they don't have to drive, and they're not paying for it personally, which is probably due to the astronomical fare prices. Of course, I don't have access to their ridership data, so this is just speculation based on observation while riding. I doubt I'm far off though.

I would like to say that the fare goes to quality services, buses, and trains, but it doesn't. The FrontRunner and TRAX stops have very little overhead protection from the sun and weather, and even fewer seats. The FrontRunner has three state-of-the-art Bombardier double-decker commuter cars, and 1 refurbished 1970s Comet car. Some TRAX trains are new, many are old and tattered. Service reliability is not great across the rail systems (I don't have any numbers, just personal experiences), and some stations are in need of repair (the Clearfield station was getting some tile replaced, but it's only half-finished, and has remained so for a few months, with no evidence of progress).

It is true that I don't have to deal with traffic jams on the freeway, but it has rarely taken me more than an hour to reach the office, even in the most deadlocked situation. I don't have to deal with bad weather driving, which is a bonus, and I can get work done while on the train, provided I can find a table to work on in the commuter car. Despite these benefits, I hardly think they are worth the extra $700 per year you will give the UTA for these luxuries, provided the setbacks of their infrastructure.

As a former resident of Toronto, Canada, the UTA fares are outrageous. I came from a city where fares are currently less than $135 per month for access to all TTC buses, street cars, light rail, and subways. The TTC rail stations are vastly superior to UTA stations, providing adequate overhead protection, generous seating, and are well-maintained. The system is much more reliable (I didn't experience a single delay in over two years) and the train cars are more up-to-date in a wider fashion. I wish I was familiar with other transit systems to compare, but the TTC is the only other transit system I'm deeply familiar with. Notice also that while an individual ride on the TTC might be $3, the monthly pass is $133. Compare to the UTA where a single ride is $2.50 and the monthly premium pass is $198 (it's true that the TTC does not include Go Transit, the long-distance commuter rail, and Go Transit does not offer discounts for TTC card holders, so this may not be exactly apples-to-apples).

I hope the UTA comes down considerably in price (about half), because when the University of Utah decides it doesn't want to pay the UTA for my transit pass, I'll be back on the road commuting in my car.

As a partial solution to reducing fares, aside from increasing the state gas tax, maybe the executive officers, directors, and managers of the Utah Transit Authority could take a pay cut. I'd wager that $511,000 puts the COO of the UTA in the top 1 or 2 percent of Utah salaries.

What's The Matter With PGP? has been making its way around crypto circles recently. Enough so, that I figured I would comment by posting a reply on my own blog.

This post is written by cryptographer Matthew D. Green. As a result, even though I've never heard of Dr. Green, when I read the post, I was expecting something of high caliber. After all, this isn't some random dude on the Internet blowing his horn. THIS IS A GENIUNE CRYPTOGRAPHER. Turns out, the post is complete junk. It's about as thoughtful and informative as the Nigerian 411 scam. Let's analyze the post, top-to-bottom.

First off, as a TL;DR, Dr. Green thinks PGP is Teh Fail. After reading the post, it almost seems to me that he just barely installed PGP on his desktop, couldn't figure most of it out, got confused by a great deal of the terminology (isn't he a cryptographer?), and threw his hands up in defeat. When you read it a 2nd time, you notice almost immediately that he is confusing a few things:

  • The OpenPGP specification.
  • The GnuPG implementation of the OpenPGP specification.
  • Mail client frontends to GnuPG.

Are they all Teh Fail, or just the OpenPGP specification? After reading it a few times, I'm convinced he is rolling up anything and everything to do with PGP, past, present, and future, as total failure. It's just unfortunate he can't set things straight between the OpenPGP specification, the GnuPG implementation, or a mail client.

Meh. Color me unimpressed, Dr. Green.

1. PGP Keys Suck
My favorite line out of his post is this:

"For historical reasons [PGP keys] tend to be large and contain lots of extraneous information, which it difficult to print them a business card or manually compare."

Yes, they are definitely large. But printing the entire key on a business card? Really?? I've been using GnuPG since 2004, and I've never heard of anyone wanting to even compare public asymmetric keys character-by-character, let alone print a full asymmetric key on a business card. This is what the SHA-1 fingerprint is for. It's been generally accepted that comparing keys using the 40-character hexadecimal string is sufficient enough to know that each person has the same key. Key signing parties are based on this principle. Fingerprints as a result have been printed on business cards. Example 1, example 2, example 3.

Regardless, asymmetric keys in general are nasty to deal with by hand. This isn't just a problem with GnuPG, but OpenSSH, OpenSSL, or any other front end implementation that uses public key cryptography. This shouldn't come as a surprise. This is why we have tools to work with them. Things like Gnu Privacy Assistant, KGPG, Enigmail, GnuPG, and others. There is nothing in the OpenPGP specification that defines how you should implement such a front end. As a result, this is not the fault of OpenPGP. If you don't like the way GnuPG handles transferring public keys from one destination to another, propose a patch?

Without taking a breath, he then complains that your request for a GnuPG key is over plaintext HTTP. Well, okay. Sure. HTTPS/HKPS would be optimal to transfer the key, and he rightfully identifies a bug with GnuPG that doesn't verify if the correct key was downloaded. But, again, this is not a problem with OpenPGP. OpenPGP does not define anything about infrastructure. These servers are poorly designed, with OpenPGP front ends like GnuPG full of bugs. Hardly a problem with OpenPGP. If you don't like the key server implementations, maybe submit a patch?

Then his rant turns to OpenPGP infrastructure, by bitching about public key servers (it actually seems like he's comparing everything to miniLock, and setting miniLock as the Gold Standard). His first gripe is getting keys via Twitter:

"If we must exchange things via Twitter, why not simply exchange keys?"

Really? Twitter? Do we want to rely on Twitter for getting public keys? We can't get them from an email, or in person, or from a website, or from a business card, or some other form of informational transfer? Just, Twitter?

Finally, he opens up ECC keys, with the following:

"Modern EC public keys are tiny."

Seems Dr. Green has some sort of obsession with ECC keys. Earlier he mentions that you can fit ECC keys on business cards. He continues comparing a lot of OpenPGP to ECC keys. I'll address that later.

2. PGP key management sucks
There really isn't much to say here. Every point I argued previously, in that OpenPGP provides no infrastructure, applies here. However, he does hit one topic which strikes me as odd: the Web of Trust. He says the following:

"Most OpenPGP implementations do a lousy job of presenting any of this data to their users anyway."

OpenPGP implementations should never define trust for the user. That should always be at the discretion of the user. OpenPGP implementations should only define accuracy, such as correctly decrypting an encrypted piece of text, or successfully verifying a digital signature, because that's all the OpenPGP specification defines. I decide whether or not to trust the data given to me by someone else. And, most of the tools I have dealt with properly work this way. A trust database exists on the computer, allowing me to say "I trust this key", "I don't trust this key", and so forth. I define how those trust relationships are built, and I can put them into the database, so OpenPGP front ends can notify to me whether or not I am dealing with trustworthy data. But notifications should be the extent of the Web of Trust in OpenPGP implementations. And just because I signed a key, doesn't necessarily mean I trust that data from that key owner, so the tool should never make that assumption.

3. No forward secrecy
Okay. I get it. People like perfect forward secrecy (PFS), even to the point of obsession. I'm not one of those people. I use OTR, and I implement PFS on SSL where appropriate, but I'm not going to lose any sleep over the fact that PFS isn't defined in OpenPGP. OpenPGP has its problems. PFS isn't one of them, for a couple of reasons.

First, email is generally looked upon as long-term communication. Most people want to read their email, not destroy it. Many use long-term archival to go back to email replies in a thread, or to look up purchase orders, or recall sensitive employee information. Email has value in multiple views. PFS just doesn't fit well here. If you want PFS in your communication, look at something where there is not high value in retrieving old communication, such as instant chat. Off-the-record (OTR) messaging works well here, but even then, it has it's problems. As someone who logs every chat message that comes in, I want to have the ability to decrypt old communication messages sent to me, for a number of reasons. OTR makes this impossible. OpenPGP makes this doable.

Second, GnuPG cryptographically signs encrypted data by default. Thus, if you were to use an OpenPGP implementation for end-to-end communication, all messages have been cryptographically signed. Enabling PFS is perpendicular to signed data. Prior messages should not only be un-decryptable, but they should also not identify who sent the message. Because OpenPGP implementations build everything around a Web of Trust, I find it difficult to see where PFS would fit at all.

However, this sentence down right scares me:

"If the NSA is your adversary just forget about PGP."

What does that mean? The NSA has beaten OpenPGP? I doubt it. Strongly doubt it, actually. So much so, I'm really beginning to wonder about Dr. Green's competence as a cryptographer. Dude, look at the specification, and tell me the NSA has beaten every cryptographic cipher and hash in that RFC. I don't buy into the conspiracy theories that the NSA has some super-duper-uber-mega-cracking-datacenter, and AES-256 is child's play to the NSA. I trust the mathematics, and I trust academia. Maybe the NSA has their own microprocessor manufacturing plant? That has outpaced Intel? Yeah, not buying it.

However, maybe Dr. Green means to say that if you're targeted by the NSA, then before you even use your OpenPGP implementation to encrypt some communication, the NSA has a copy. But then, how is this a problem of OpenPGP and its implementations?

As a cryptographer, to wave your hand like a Jedi, claiming thet PGP is worthless in front of the NSA is frightening. Not because there might be some truth to it (there isn't), but because that's just amateur FUD slinging. For someone with a Ph.D in the field, I would expect better.

4. The OpenPGP format and defaults suck
Hahahahahahahahahahahah. Uh, no. Not even close. I literally laughed out loud when I read that section. So much so, I showed a couple of mathematicians and amateur cryptographers, and we had a good time.

Chapter 9 of RFC 4880 (the OpenPGP RFC) defines the following constants:

9.1. Public-Key Algorithms

ID Algorithm
-- ---------
1 - RSA (Encrypt or Sign) [HAC]
2 - RSA Encrypt-Only [HAC]
3 - RSA Sign-Only [HAC]
16 - Elgamal (Encrypt-Only) [ELGAMAL] [HAC]
17 - DSA (Digital Signature Algorithm) [FIPS186] [HAC]
18 - Reserved for Elliptic Curve
19 - Reserved for ECDSA
20 - Reserved (formerly Elgamal Encrypt or Sign)
21 - Reserved for Diffie-Hellman (X9.42,
as defined for IETF-S/MIME)
100 to 110 - Private/Experimental algorithm

Implementations MUST implement DSA for signatures, and Elgamal for
encryption. Implementations SHOULD implement RSA keys (1). RSA
Encrypt-Only (2) and RSA Sign-Only are deprecated and SHOULD NOT be
generated, but may be interpreted. See Section 13.5. See Section
13.8 for notes on Elliptic Curve (18), ECDSA (19), Elgamal Encrypt or
Sign (20), and X9.42 (21). Implementations MAY implement any other

9.2. Symmetric-Key Algorithms

ID Algorithm
-- ---------
0 - Plaintext or unencrypted data
2 - TripleDES (DES-EDE, [SCHNEIER] [HAC] -
168 bit key derived from 192)
3 - CAST5 (128 bit key, as per [RFC2144])
4 - Blowfish (128 bit key, 16 rounds) [BLOWFISH]
5 - Reserved
6 - Reserved
7 - AES with 128-bit key [AES]
8 - AES with 192-bit key
9 - AES with 256-bit key
10 - Twofish with 256-bit key [TWOFISH]
100 to 110 - Private/Experimental algorithm

Implementations MUST implement TripleDES. Implementations SHOULD
implement AES-128 and CAST5. Implementations that interoperate with
PGP 2.6 or earlier need to support IDEA, as that is the only
symmetric cipher those versions use. Implementations MAY implement
any other algorithm.

9.3. Compression Algorithms

ID Algorithm
-- ---------
0 - Uncompressed
1 - ZIP [RFC1951]
2 - ZLIB [RFC1950]
3 - BZip2 [BZ2]
100 to 110 - Private/Experimental algorithm

Implementations MUST implement uncompressed data. Implementations
SHOULD implement ZIP. Implementations MAY implement any other

9.4. Hash Algorithms

ID Algorithm Text Name
-- --------- ---------
1 - MD5 [HAC] "MD5"
2 - SHA-1 [FIPS180] "SHA1"
3 - RIPE-MD/160 [HAC] "RIPEMD160"
4 - Reserved
5 - Reserved
6 - Reserved
7 - Reserved
8 - SHA256 [FIPS180] "SHA256"
9 - SHA384 [FIPS180] "SHA384"
10 - SHA512 [FIPS180] "SHA512"
11 - SHA224 [FIPS180] "SHA224"
100 to 110 - Private/Experimental algorithm

Implementations MUST implement SHA-1. Implementations MAY implement
other algorithms. MD5 is deprecated.

Please identify what is wrong with this specification. GnuPG implements IDEA, CAST5, 3DES, AES, AES-192, and AES-256 as ciphers, and RIPEMD-160, SHA-1, SHA-224, SHA-256, SHA-384 and SHA-512 as cryptographic hashes. What are wrong with these defaults? Please, enlighten us Dr. Green.

However, Dr. Green is mixing what the OpenPGP specification requires and what implementations use. The RFC exists as a core minimum set of standards, defining what is required to use as a base, with things optional and deprecated. However, GnuPG (as well as PGP) have both gone well beyond the defaults that the RFC requires.

But then there's this little gem:

"Elliptic curve crypto is (still!) barely supported."

He's back at ECC. Of course, this shouldn't come as much of a surprise, but there is a good reason most academic and open source cryptographers have steered clear of ECC- software patents:

According to Bruce Schneier as of May 31, 2007, "Certicom certainly can claim ownership of ECC. The algorithm was developed and patented by the company's founders, and the patents are well written and strong. I don't like it, but they can claim ownership."

Dr. Thomas Pornin (one cryptographer on Stack Exchange I have great respect for) answers the ECC patents this way:

"To my knowledge (but I am not entitled, in any way, to give law-related advice), parts of ECC cryptography which may still be patented are:

  • implementation of curves over binary fields using normal bases;
  • point compression;
  • acceleration of Koblitz curves using the Frobenius endomorphism;
  • various optimization tricks on dedicated hardware architectures (FPGA, ASIC)."

However, there is finally an RFC for ECC in OpenPGP. It defines three curves, which are also already standardized by NIST (pdf)- Curve P-256, Curve P-384, and Curve P-521. This document was standardized only two years ago, June 2012. Unfortunately, these three curves do not meet the "safe curves" definition as defined by Dr. Daniel J. Bernstien ("djb").

GnuPG will likely use EdDSA and Ed25519 as default for signatures (the same algorithms supported by OpenSSH actually). Werner did implement Brainpool P256r1, P384r1, and P512r1, even though it is not defined in the RFC.

5. Terrible mail client implementations
Again, mixing the OpenPGP standard, which does nothing to define how an MUA should behave, with a front end to a specific OpenPGP implementation. Regardless, there are some hidden gems in this section.

First, his claim that we're storing the passphrase to the private key in memory:

"[Mail clients] demand you encrypt your key with a passphrase, but routinely bug you to enter that passphrase in order to sign outgoing mail -- exposing your decryption keys in memory even when you're not reading secure email."

Hahahahahaha. Oh, boy. I guess we haven't heard of "GnuPG Agent"? Yeah, there's this thing that exists setting up a "token" based system. If you can successfully decrypt your private key, GnuPG will setup a token which is cached in RAM. Then, the MUA just needs to check if such a token exists. If so, it can use the full features of decryption and signing, without ever asking the user for the passphrase.

Regardless, if an attacker has access to your RAM to read it's contents, then you have bigger fish to fry, and there is nothing on the OpenPGP specification to help you.

Now this little gem:

"Most of these problems stem from the fact that PGP was designed to retain compatibility with standard (non-encrypted) email. If there's one lesson from the past ten years, it's that people are comfortable moving past email."

Really, I'm fairly confident that I don't need to say much here, but I'll just end this section with my own personal thought:

"If there's one lesson from the past 50 years, it's that people are not comfortable moving past email."

What's cute, is his examples to vendor controlled messaging systems provided by Apple, Facebook, and WhatsApp as a way to do secure communication. All proprietary, all vendor controlled lockin, all reliant on 3rd-party infrastructures, all client-server encrypted/decrypted, and not end-to-end, which OpenPGP is not. OpenPGP is end-to-end encryption, decentralized, and ad hoc. More importantly, the largest implementation (GnuPG) is Free and Open Source Software.

6. So what should we be doing?
Well for one, not listening to Dr. Green. OpenPGP has legitimate concerns, no doubt. But they aren't what are described in the article. He followed a white rabbit down a hole, when he should have just been studying the specification, and identifying it's real problems.

For example, if using OpenPGP in email, all of the email headers remain in plaintext. Only the body is encrypted. This metadata can reveal a great deal of information, such as the sender and recipient, the SMTP servers used in the exchange, the subject content, additional recipients on the carbon copy, etc.

Further, because OpenPGP encrypts data in packets, "header packets" exist in every OpenPGP encrypted file. These header packets can also reveal a great deal of information, such as who encrypted the data, who the recipient of the encrypted data is for, when the encrypted data was created, and other things.

In other words, one problem with OpenPGP is plausible deniability. If you want your encrypted data to be indistinguishable from true random, you're going to fail miserably using OpenPGP. OpenPGP concerns itself with message confidentiality and integrity- deniability just isn't in its scope.

For me, throwing away PGP, because of a number of misguided personal opinions, seems like a Big Mistake. GnuPG has been in development since 1999. OpenPGP itself has undergone revisions, improving the specification. PGP, which OpenPGP built itself upon, is even older, entering the arena at 1991. Throwing 23 years of development, bug fixing, and revisions out the window, just because one troubled cryptographer doesn't like it, seems to be akin to the very tardy, and incomplete Netscape 6.0 release.

Entropy As A Service

Back in October 2012, I announced It's an "entropy server" that delivers encrypted random bits that are indistinguishable from true random, directly to your GNU/Linux input entropy pool. Per the Linux CSPRNG source code, the bits in the input entropy pool are then cryptographically hashed with SHA1 before sending the bits to the blocking and non-blocking pools. The bits generated by my entropy server come only from 5 Entropy Keys, which appear to no longer be in production. I'm working on a "version 2.0" of the server, where more true random sources will be mixed into the pool, such as the onboard Broadcom HWRNG, atmospheric noise, haveged(8), and 2 TrueRNG HWRNGs. Keep an eye out both on this blog, and the entropy server web page for updates.

However, it appears that I am not the only one who has thought of this idea.

In September 2013, the U.S. National Institude of Standards and Technology introduced the NIST Randomness Beacon. Every 60 seconds, a random number is "broadcast", which is cryptographically signed, timestamped, and contains the value of the random number from the previous minute's broadcast. It comes complete with a REST API, which you can take advantage of for your internal application. They warn, very strictly, that these bits should not be used for cryptographic keys. Instead, it would be more fitting to use these random bits as for things such as lottery drawings, dice rolls, and other non-sensitive randomness implementations. However, you could write a REST application that took this data, and reseeded the Linux CSPRNG every 60 seconds. Because SHA1 is only theoretically broken, and not practically broken, and further because the Linux kernel is receiving other sources of input to reseed the CSPRNG, it seems difficult for an attacker to recover the internal state of your CSPRNG without compromising the computer, to determine the sequence of the CSPRNG. A simple shell script could be something like this (assuming you have rhash(1) installed):

# Reseed the Linux kernel CSPRNG with data from NIST
SALT=$(dd if=/dev/urandom bs=64 count=1 2> /dev/null)
EPOCH=$(date --date="$(date +%H:%M:00)" +%s)
DATA=$(echo "${SALT}${RESULTS}" | rhash --sha3-512 - | cut -d ' ' -f 1)
echo "$DATA" > /dev/random

Further, in February 2014, Canonical announced Canonical is providing their own application, which currently seems to be only packaged for Ubuntu, called pollinate. The idea is for newly created Ubuntu VMs or containers to immediately have a source of entropy to rely on, to quickly feed the Linux CSPRNG before it is needed in generating cryptographic keys, such as SSH or SSL keys. Whlie some criticize its security, it seems more like FUD from those who don't fully understand the internal workings of the Linux kernel CSRNG, than actual security concerns. Reseeding the CSPRNG is less of an issue for me, as much as Canonical tracking which IP requested bits from their server, and logging them (as should be the case for any service from any provider). It may be true that on first boot, not enough entropy may have been generated to seed the CSPRNG. As such, initiating a TLS session to a server may open the network connection up for an attack. While the theory might be sound, I would like to see a proof of concept or even an implementation of the attack, before I start FUD slinging.

So, as it sits, there are to my knowledge, three Entropy-as-a-Service (EaaS) servers online, each with different implementations, that require different software, that you could use for a variety of applications. Of course, there is also which has been on the Internet for a good long while, long before "as a service" was a buzz term. However, I don't see them as an entropy server as much as an implementation server. They are great for getting dice rolls, lottery picks, card shuffles, etc, but not so much "raw data", even though it does have a "random number sequence" generator, and it does provide an API.

With the revelations about the NSA and the increasing weight put on cryptography as a result, I'm guessing more of these types of servers will pop up. Also, only my service, so far, actually increases your entropy estimate in your kernel. The other services mentioned here only reseed the Linux kernel CSPRNG, without increasing the entropy estimate.

The Linux Random Number Generator

There is a lot of misinformation out there about /dev/random and /dev/urandom with regards to the Linux kernel. Unfortunately, not even the random(4) manpage seems to get it right. So, rather than argue with everyone on the Internet, I'll post the inner workings of the Linux kernel with respect to randomness here.

If you're not interested in reading the full post, then call this a good summary:

  • There are 3 separate and distinct entropy pools in the Linux kernel.
  • /dev/random and /dev/urandom both use the same CSPRNG.
  • Use /dev/urandom.

The Entropy Pools
There are three arrays which store unsigned integers called entropy pools in the Linux kernel, each which maintain their own state. They are the input pool, the blocking pool, and the non-blocking pool. The input pool maintains a maximum internal state of 4096 bits. When looking at the value of "/proc/sys/kernel/random/entropy_avail", you are looking at the current estimate of the input pool state. The blocking and non-blocking pools each keep an internal state of just 1024 bits.

Below is an image showing the architecture of the Linux kernel random number generators. Each of the three pools are green. You'll notice that the input pool is fed with data from external events, such as timing interrupt requests, disk and network I/O, as well as human input on the keyboard and from the mouse. The data is hashed with SHA1 as part of the mixing functions in the cryptographically secure pseudorandom number generator (CSPRNG).


The blocking and non-blocking pools both feed from the same CSPRNG to produce their random output. There are three interfaces which user space can access to get this data from the CSPRNG. The /dev/random character device will block when the blocking entropy pool is exhausted, whereas both the /dev/urandom character device, and the get_random_bytes() system call will not block, even when entropy has been exhausted.

What's important to understand here, is that both the blocking pool, from which /dev/random draws its data, and the non-blocking pool, from which /dev/urandom draws its data, get their data from the same CSPRNG.

Theoretic versus Computational Security
In fact, there are really just two types of security that you should be aware of: theoretic security and computational, or practical security.

Theoretic Security: This is security which can be proven through a logical mathematical proof. Two types of proofs come to mind (there are probably others): The One Time Pad, and Shamir's Secret Sharing. Both of these proposals can be implemented in software algorithms, even though there may be some difficulty in the implementation.

Computational Security: This is security where algorithms are build from proofs that demonstrate a weakness in classical computing, such as the discrete logarithm problem and the quadratic residuosity problem. It is known that it takes considerable energy, time, and effort to work through these problems when certain criteria is met, such as factoring out primes from a quadratic with hundreds or thousands of digits in the number. Because it's computationally difficult to solve these problems, we consider algorithms that are built from them, computationally secure.

Probably every cryptographic algorithm that you are aware of, such as AES, RSA, SHA, and so forth, are computationally secure, not theoretically secure. This means, that an adversary with unlimited time, resources, and processing power could break computationally secure algorithms. This is why bit sizes are ever increasing- as processing power and computing "tricks" improve, the size of the problem needs to increase.

Random Numbers
There are basically two ways which you can generate random numbers- in hardware or in software (if you are getting random numbers from a Geiger Counter, even though radioactive decay is found in nature, you are using a hardware device to feed it into the computer). In both cases, however, the sequences of numbers could be "true random" or "pseudorandom".

First, I'm not going to jump into the long argument that is "true randomness". This quickly turns into a philosophical debate. Some will say that "true random" only exists in quantum nature, while others will explain that even quantum mechanics lives by mathematical equations and predictable output that we just haven't discovered yet. For me, I'm content with calling the output of some quantum events "true random". While all mathematical algorithms in software can only produce pseudorandom numbers, some mathematical algorithms can have such a large cycle, and their algorithm computationally secure, that their output is indistinguishable from true random.

Further, even though you have a "true random" number from nature, how do you say a number by itself is random? It's usually more practical to call a sequence of numbers unpredictable, rather than "random". For cryptography, that's really the only thing that matters. Consider the following sequences of text- one was encrypted with AES, one is "true random" data fed from a hardware random number generator, and the other sequence was pseudorandomly generated:


Looking at those three sequences, one has a very rigid structure. It is legitimate encrypted data with AES, while the other two strings are 16 bytes of data read from either /dev/urandom or /dev/hwrng. While examining the sequences of bits, is it computationally feasible to determine the underlying structure that was used to build the encrypted ciphertext, and get back to the original plaintext, without the key? From a theoretic perspective, the sequence of encrypted bits should appear indistinguishable from a truly unpredictable sequence of bits, such as those grabbed from the hardware random number generator.

I'll give you the solution:

$ echo "Random numberS" | aespipe | xxd -ps
Password: 12345678901234567890

However, suppose you have "true random" numbers; maybe you got them from radioactive decay, diode breakdown, or atmospheric noise. Regardless, you have a sequence of bits that are as "true random" as you could get. So, now what are you going to do with them? If you're going to study some informational theoretic security algorithm, you might be doing the world some good. However, if you're going to subject these numbers to a computationally secure, man-made, software algorithm, running on a deterministic computer, what was gained by having those "true random" numbers? The weakness will then lie in the software algorithm, the unpredictable internal state of the computer, and human factors; not necessarily the numbers from which it was drawn. Subjecting "true random" numbers to a theoretic algorithm is one thing; subjecting them to a man-made computer software algorithm that is at best computationally secure, is another. In all honesty, you are not much better off with a CSPRNG giving you the numbers than the you are the Universe giving them to you, if all you need to do is generate a long-term GPG or SSL key.

Don't get me wrong. I'm not suggesting that having "true random" numbers is not a good thing. It is. Injecting "true random" numbers into the input entropy pool of the kernel reseeds the CSPRNG, putting it in a "true random" state, from which it can build your pseudorandom numbers. Having "true random" numbers is a good thing, and increasing the input pool entropy is a great thing. In fact, due to the nature of "reseeding" the CSPRNG, the Linux random number generator "self heals". In other words, if the internal state of the system was known at a fixed point in time, reseeding the CSPRNG with entropy feed by "true random" numbers, changes the internal state of the random number generator. This means that although all future numbers could be determined, because the state of the system was compromised, future output has been changed, because the internal state of the system has changed.

A self healing CSPRNG is a great thing to have. And guess what? If you're constantly feeding the input pool with "true random" noise, then the output of /dev/urandom will be completely indistinguishable from "true random" data sequences, due to this reseeding.

/dev/random versus /dev/urandom
And now we get to the meat of the issue- which output interface to use for "the most demanding cryptographic computational long-term security"? First thing is first- both /dev/random and /dev/urandom are fed data from the same CSPRNG. Let me repeat that, with a bit more emphasis: Both /dev/random and /dev/urandom are fed data from the same CSPRNG. In fact, most who understand this, will explain that you should use /dev/urandom, and NOT /dev/random. The problem is, that /dev/random has the horrible bug of blocking when the input entropy pool is exhausted. So, if your application relies on /dev/random to get random numbers, there are pauses and delays until there is enough entropy in the input pool to reseed the CSPRNG, so you can get more data.

The point? Stop using /dev/random and start using /dev/urandom. To back up my claim, here are some people who back me up:

Here, cryptographer Dr. Daniel Bernstein puts me, personally, in my place on a mailing list (I've learned the error of that line of logic since then):

Cryptographers are certainly not responsible for this superstitious
nonsense. Think about this for a moment: whoever wrote the /dev/random
manual page seems to simultaneously believe that

(1) we can't figure out how to deterministically expand one 256-bit
/dev/random output into an endless stream of unpredictable keys
(this is what we need from urandom), but

(2) we _can_ figure out how to use a single key to safely encrypt
many messages (this is what we need from SSL, PGP, etc.).

For a cryptographer this doesn't even pass the laugh test.

I'm not saying that /dev/urandom has a perfect API. It's disappointingly
common for vendors to deploy devices where the randomness pool has never
been initialized; BSD /dev/urandom catches this configuration bug by
blocking, but Linux /dev/urandom (unlike Linux /dev/random) spews
predictable data, causing (e.g.) the widespread RSA security failures
documented on But fixing this configuration bug
has nothing to do with the /dev/random superstitions.

Dr. Bernstein continues in a paper about the problems of having too much entropy, and why that could be a bad thing. However, he repeats the same thing as above: use /dev/urandom. Dr. Bernstein also created a cryptographic library called "NaCL", which relies on /dev/urandom for its randomness.

Cryptographer Dr. Thomas Pornin also explains the nature of /dev/urandom:

/dev/urandom yields data which is indistinguishable from true randomness, given existing technology. Getting "better" randomness than what /dev/urandom provides is meaningless, unless you are using one of the few "information theoretic" cryptographic algorithm, which is not your case (you would know it).

In an additional post, he mentions:

There are many applications which read /dev/random as a kind of ritual, as if it was "better" than /dev/urandom, probably on a karmic level. This is plain wrong, especially when the alea is to be used with classical cryptographic algorithms (e.g. to generate a SSH server public key). Do not do that. Instead, use /dev/urandom and you will live longer and happier. Even for one-time pad.

Even Bram Cohen agrees.

In terms of software packages, the Go language uses /dev/urandom as its source for a CSPRNG, where available. OpenSSL uses /dev/urandom when available. OpenSSH will use OpenSSL's random number source, if compiled with OpenSSL support, which uses /dev/urandom. OpenVPN will also use OpenSSL's random number source, unless compiled with PolarSSL. Guess what? /dev/urandom.

The only software package I can think of that does not use /dev/urandom, but instead deliberately uses /dev/random, is when generating keys with GnuPG. However, this is only to guarantee that there has been sufficient entropy in the system to guarantee "randomness". When the input entropy pool is exhausted, /dev/urandom will happily give away a non-blocking stream of random data, where /dev/random will block. Adding non-malicious entropy to your system is not a bad thing. As we mentioned earlier, reseeding the CSPRNG with entropy from random sources introduces a "self-healing" RNG. Thus, GnuPG ensures that there is enough entropy to reseed the CSPRNG, such that the state of your system has "healed", and is in a completely different state than before you started. However, if you just linked /dev/random to /dev/urandom, and generated your GPG key, it would be no less secure, provided that the system you generated the key on is not compromised in any fashion.

The point? Use /dev/urandom.

Cryptographically Secure Pseudorandom Number Generators
I know what you're thinking- a pseudorandom number generator could not possibly be as "good" as a true random number generator. In many cases, this is true. There are pseudorandom number generators which are not cryptographically (which means "computationally") secure. Mersenne Twister comes to mind. It's "fast enough" and will provide pseudorandom numbers as long as you wish. It's the pseudorandom number generator for a great deal of programming languages, such as Python, and rightfully so. It's a good choice. But, it's not computationally strong. After observing a the sequence for a space of time, enough information can be gathered that will allow the attacker to predict the remaining sequence.

So, the question remains- can you have a computationally strong pseudorandom number generator? The answer is a big resounding YES! Wikipedia to the rescue:

  • The Yarrow algorithm (used in FreeBSD and Mac OS X for both /dev/random and /dev/urandom).
  • The Fortuna algorithm.
  • CryptGenRandom as used in Microsoft Windows.
  • ISAAC which is based on the RC4 cipher.
  • arc4random used in OpenBSD until 5.5, which is also based on the RC4 cipher.
  • AES-CTR DRBG (Deterministic Random Bit Generator) is often used as a random number generator. AES in CTR mode involves incrementing a counter with a random key, and using the encrypted output as the sequence of random numbers. Provided the initial counter and key are kept secret, this is cryptographically secure. In reality, any block cipher in CTR mode will work here.
  • The ANSI X9.17 standard, also adopted by FIPS, explains the use of Triple Data Encryption Algorithm (TDEA) keying option #2 as a CSPRNG.

There are a couple of theoretic designs which are computationally strong, but due to the nature of the algorithm, slow and impractical:

  • Blum Blum Shub relies on the hard quadratic residuosity problem to get its security.
  • Blum Micali relies on the hard discrete logarithm problem to get its security.
  • Dual_EC_DRBG, which has later been revealed to contain a backdoor by the NSA, is based on the decisional Diffie–Hellman assumption.

The CSPRNG in the Linux kernel uses SHA1 on the input pool for handing out our random numbers. The result is also put back into the input entropy pool for further evaluation. Even though theoretical attacks have been made on SHA1, the design is flexible enough that a different cryptographically secure hashing algorithm could be put in its place, such as SHA2, SHA3, WHIRLPOOL, etc.

Non-blocking /dev/random
If you absolutely must have a /dev/random device that doesn't block, either because some software applications are relying on it, such as GnuPG, or because you are working on some informational theroetic algorithm, then there are some options available to you.

The first is the HAVEGE software daemon. This queries a lot more timing events on your motherboard than the standard kernel, and can keep your input entropy pool at full. In my testing, this can produce around 1 MBps of pseudorandom data out of /dev/random. The only drawback, is that it hits the CPU fairly frequently, and can cause the load on the system to jump. However, I have found in practice that if you set the watermark to 512, which is typically considered enough bits of entropy for even the most stringent cryptographic keys, you'll never notice it running.

Other options include purchasing USB devices that use avalanche noise in electronic circuits. Using this USB device, along with rngd(8), can keep /dev/random from blocking, provided that the USB device remains operational. One drawback with these devices, is the capability that the circuit breaks in some fashion, producing biased and highly predictable data. Some implementations, such as the Entropy Key (which is no longer in production) solve this by doing internal bias testing and whitening, at the expense of throughput. These USB devices can be purchased for around $50, and could give you up to 40 KBps of random data per device.

Lastly, some chip makers are introducing a HWRNG (hardware random number generator) on the chip, such as Intel and Broadcom. All it takes to use these generators is by loading a kernel module. Typically, a /dev/hwrng device will then show up, which you again can use rngd(8) to feed its output into the input entropy pool for /dev/random. Unfortunately, it is unknown if the NSA has compromised the integrity of these hardware generators, if the specifications are not available, or an audit has not been performed. Just passing the randomness tests, such as DIEHARD and FIPS are not good enough to discover whether or not the random data is predictable by an attacker. However, throughput can be upwards of 500 MBps for these generators. If you need that sort of throughput, these are the generators for you.

Use /dev/urandom.

Tor Versus Road Warrior

Lately, I have been doing some research regarding Tor, and the technology behind it. Further, I wanted to compare it to other products such as Freenet and I2P. In the process, I stumbled upon this post regarding comparing Tor to a proprietary product called "Road Warrior" from a company called "Cryptohippie". Initially, I tried commenting on the post, but it seems that the WordPress installation at the Daily Anarchist is broken, as every attempt to supply a comment led to a 404 error page. So, instead, I'll post the comment here:

Interesting read. I know the post is more than 3 years old, however I stumbled upon it looking up some alternatives to Tor. Not because I think Tor is bad, but because I'm interested in what the world is doing with the Edward Snowden revelations, and what's out there.

You make some interesting points, mostly with are standing on shaky ground.

1. "Tor is slow". Well, yes and no. Sure, going through 3 random hops to get to your destination is going to add latencies to your round trip connection. That's somewhat expected though. However, with that said, I usually don't have any issue watching YouTube videos, downloading updates from my operating system vendor, or downloading PDFs and other multimedia from the Internet. There have also been updates since this post was published, where the algorithms Tor use have been greatly optimized, lowering latencies, and increasing bandwidth. Personally, I don't generally see any undue latency burdens compared to standard clear net connections. For me, the bandwidth and performance is acceptible.

2. "When someone owns something and generates income from it, they almost always take care of it, and usually work hard to improve it". Sure, that is, when it's financially convenient, which isn't always the case. Many times, companies shift focus, lose investments, or just plain go under, and the product goes with it. We could list example, after example, where commercial products failed, where Free and Open Source products not only survived, but thrived. Especially when it comes to crypto, proprietary and commercial products usually don't fare well. OpenVPN rules the VPN world. OpenSSH rules the SSH world. OpenSSL rules the secure web and email worlds. The standard crypto libraries are all Free and Open Source Software, not proprietary. Even the standard crypto protocols, which Road Warrior relies on, are open not encumbered by patents, such as AES, SHA1/SHA2, etc. Because this is a criticism of "free" software, however, you are aware that there are financial contributions by the EFF to develop the Tor product, just like there are financial contributions from IBM, Oracle, Red Hat, and even Microsoft, to develop the Linux kernel- another "free" software.

3. "Tor may include malicious nodes". Sure. This will be a problem when you use ANY service, and is not unique to Tor. When using any Internet service, there is a level of trust that must be maintained between the provider, and the user. As an example, HTTPS connections could be decrypted on the server, logged, and shared. However, with the 3-random relay node selection, it is actually pretty difficult for a node operator to compromise an entire Tor client connection. In fact, that's kind of the point with the "onion" layer encryption. Really, the only thing that is practical, is controlling both the entry and exit nodes, at which connection profiles could be built based on traffic patterns, IP addresses, times of connections, and sites visited. This is possible by controlling a large number of the relay nodes in existence. As such, entry guards were introduces to minimize this attack. Even DNS, if not properly tunneled through Tor, could give away details about what your doing with your Tor connection. Regardless, no service is 100% secure, and Tor is no exception.

4. "Tor is only for web browsing". Garbage. This couldn't be further from the truth. I use Tor for my email IMAPS/SMTPS connections, for IRC over SSL, XMPP, DNS, SSH, FTP, and so much more. If your application supports using a SOCKS proxy, or if you can setup a transparent proxy for all connections to go through, it just works. Tor is completely agnostic with regards to the protocol. I use email, IRC, XMPP, and web over Tor, all the time, with very little problems.

5. "Tor requires all the software on your computer that accesses the internet to be cooperative. Many programs, however, (whether created by shady marketers, governments, crooks, or just poorly written) are not cooperative, but bypass Tor and give away your network identity". Sure. Many software programs will not allow the ability to change network connections and take advantage of a SOCKS proxy, like your browser or chat clients. However, it's not difficult to build a transparent proxy, to force all software programs over Tor. This can be done at the router, a separate server, or on the local client.

6. "For most people, Tor is to hard to use regularly. This makes security errors and leaks much more likely." This hits the nail squarely on the head, but it's not Tor's fault, and the same thing could be said for Road Warrior. Crypto is just hard. There is a reason for the mantra "don't roll your own crypto". There is a reason many cryptographers have doctorate degrees in mathematics and/or computer science. Crytpo is just exceptionally difficult to grasp. Very little about it is actually intuitive. This is why PGP is not more widely implemented. This is why people don't run encrypted filesystems on their computer. This is why people share their entire lives on Facebook and Instagram. People want things to Just Work, with very little to no work on their part. Unless Tor is 100% transparent, the only people deploying Tor will be nerds and system administrators. Same will be said for PGP, I2P, Freenet, Bitmessage, etc., and even your own product, Road Warrior. Unless cryptography and perfect forward secrecy, among other technologies, are fully 100% baked into the product, people won't deploy it. It has to be just part of the normal way things are done, transparently, such as HTTPS, DNSsec, IPsec, etc. The only thing expected out of people, is the ability to use a browser or an email client. IE- a dumbed down client application. If you expect them to hook in an additional utility, it probably won't happen.

Now, some of my own personal criticisms of Road Warrior.

1. First and foremost, Road Warrior is closed source, proprietary software. This isn't looked upon favorably in the cryptographic community. Especially for a cryptographic tool. I might believe that you're not intentionally being malicious with your Road Warrior product, or have implemented back doors for the NSA, but how do I know you've used encryption standards and best practices? How can the community evaluate that there are no serious bugs, or security vulnerabilities, without access to the source code? How do I know you're addressing CVEs? Just trusting some company, because they say so, is a big leap. Has Road Warrior been independently audited for security? If so, I'm not finding the results anywhere on the site. If the product is to remain proprietary, closed source, then at least provide your customers with the full results of a security audit.

2. All traffic goes through Cryptohippie's servers. With Tor, traffic is split out among a fully distributed, decentralized, voluntary network. Every relay is donated. My traffic might go through Denver on one connection, and Amsterdam on the other. Not so with Road Warrior. While the servers might be geographically separated, customers can't run their own server. As such, the IP list is very static, and makes Cryptohippie a very large single point of failure. Even further, how do I know you're not examining my packets? It's one thing to trust that some random Tor server on the network is not malicious. It's another to trust a company, who is in control of all of the anonymizing servers. Cryptohippie may not be malicious, but what about a disgruntled Cryptohippie employee? Cryptohippie even has an email service. This is a lot of eggs in one basket for a crypto product.

3. Because Cryptohippie is in control of the full stack, Cryptohippie would be aware of a single client and the destination servers they visit. Child porn? Illegal drug trade? Whistle blowing to journalists? Posting to a forum about overthrowing a government? Cryptohippie would know about each of this. I'm sure you're aware, as I am administrating a Tor exit node, that the FBI will make requests regarding illegal content moving across a public IP address. With Tor, the administrator of the exit node is not in control of the content entering and leaving the server. While exit nodes have been seized by governments in the past, no system administrator has been charged with the data communicating over that IP address in question. That is why the hardware is seized, for the possibility of logs or stored data. Requesting client data, such as the originating IP address requesting the data, is not possible, as it's not known. So, with Tor, it's ineffective for the FBI to request an exit node operator to hand over the Tor client data. With Cryptohippie, because that company is fully in charge of the entire network, including accounting details, warrants to release customer information is not only effective, it's easy. If Cryptohippie wishes to remain in business, it must comply with local laws, which includes turning over data to law enforcement.

4. Finally, this may seem petty, but where do your clients go for support? No bug tracker or community forum is present, unless it's behind the login. Looking around your site, I'm seeing copyrights haven't been updated since 2007-2008, which doesn't instill a lot of confidence for me as a client of yours. If your basic website doesn't have an updated copyright, how do I know your Road Warrior software does as well? And if the copyright hasn't been updated in your Road Warrior product, then how do I know you've been addressing bugs and vulnerabilities, introducing new features, or even general maintenance of the software?

I'm sure Cryptohippie is a fine company, and Road Warrior is a fine piece of software. But this blog post glosses over the issues, and is a bit sensationalist, if not a little bit incorrect.

Cryptographically Secure Passphrases In d-note

A couple nights ago, while coming home from work, I started thinking about the button you press on the d-note web application (an instance running at for generating passphrases used to encrypt your note. Each passphrase is a 22-character base 64 passphrase. Initially, I was using the following code in JavaScript:

function make_key() {
    var text = "";
    var possible =
    for(i=22; i--;) {
        text += possible.charAt(Math.floor(Math.random() * possible.length));
    return text;

Simple, functional, works. However, using Math.random() for each character generation isn't cryptographically strong. The d-note web application is known for going over the top on the security engineering, and I know I can do better. So, I did a little digging, and learned about the "Web Crypto API" that many modern browsers support, despite the specification being a "working draft". So, I figured I could use that for my code. As such, the code morphed into the following:

function make_key() {
    var text = "";
    var possible =
    var random_array = new Uint32Array(22);

    // Make some attempt at preferring a strong CSPRNG first
    if (window.crypto && window.crypto.getRandomValues) {
        // Desktop Chrome 11.0, Firefox 21.0, Opera 15.0, Safari 3.1
        // Mobile Chrome 23, Firefox 21.0, iOS 6
    else if (window.msCrypto && window.msCrypto.getRandomValues) {
        // IE 11
    else {
        // Android browser, IE Mobile, Opera Mobile, older desktop browsers
        for(i=22; i--;) {
            random_array[i] = Math.floor(Math.random() * Math.pow(2, 32));

    for(i=22; i--;) {
        text += possible.charAt(Math.floor(random_array[i] % possible.length));

    return text;

The Web Crypto API ensures that browser is using a cryptographically secure non-blocking PRNG from the operating system, such as /dev/urandom that ships with the Linux kernel. While this works, it means that browsers that don't support the Web Crypto API are stuck with non-cryptographic passphrases. This certainly wouldn't do, so I went out to fix it.

Enter Blum Blum Shub (this sounds like something out of a Dr. Seuss book). Blum Blum Shub is a cryptographically secure PRNG, provided a few criteria:

  • The primes 'p' and 'q' can be static or chosen pseudorandomly.
  • The primes 'p' and 'q' should be congruent to 3 modulo 4.
  • The seed 'x0' should be chosen pseudorandomly to start the process.
  • The seed 'x0' should not be '0' or '1'.
  • The seed 'x0' should be coprime to the product of the primes 'p' and 'q'.
  • The GCD of phi(p-1) and phi(q-2) should be small (subjective?), where phi is the Euler's totient function.

Unfortunately, as the product of primes 'p' and 'q' grow, the Blum Blum Shub algorithm slows to a crawl. Fortunately and unfortunately, the maximum integer JavaScript can store is 53-bits, or 2^53. This means the product of 'p' and 'q' cannot exceed that value. That leaves us with an integer space of '2^26' and '2^27' for the primes. However, this is okay, as even 'p = 11' and 'q = 19' generate a cycle that is long enough for our 22-character passphrase.

Thus, the code had changed to:

    else {
        // Android browser, IE Mobile, Opera Mobile, other browsers
        random_array = bbs(22);

This new "bbs(n)" function is the bread and butter for generating cryptographically secure pseudorandom numbers. The function takes an integer as an argument, to know how many numbers need to be generated, and returns an unsigned 32-bit integer array with that many random numbers. The code is as follows:

function gcd(x, y) {
    if(!y) return x;
    return gcd(y, x%y);

function seed() {
    var s = 2*Math.floor(Math.random() * Math.pow(2,31))-1; //odd
    if(s < 2) {
        return seed();
    } else {
        return s;

function bbs(n) {
    // Blum Blum Shub cryptographically secure PRNG
    // See
    var a = new Uint32Array(n);
    // Max int = 2^53 == (2^26)*(2^27) -> (2^p1)*(2^p2)
    var p1 = Math.floor(Math.random()*2)+25; // first power, 25 or 26
    var p2 = 51-p1; // second power
    var p = random_prime(2*Math.floor(Math.random() * Math.pow(2,p1))-1);
    var q = random_prime(2*Math.floor(Math.random() * Math.pow(2,p2))-1);
    var s = seed();

    // Ensure each quadratic residue has one square root which is also quadratic
    // residue. Also, gcd(totient(p-1),totient(q-1)) should be small to ensure a
    // large cycle length.
    while(p%4 != 3 || q%4 != 3 || gcd(totient(p-1),totient(q-1)) >= 5) {
        p = random_prime(2*Math.floor(Math.random() * Math.pow(2,p1))-1);
        q = random_prime(2*Math.floor(Math.random() * Math.pow(2,p2))-1);

    // s should be coprime to p*q
    while(gcd(p*q, s) != 1) {
        s = seed();

    for(i=n; i--;) {
        s = Math.pow(s,2)%(p*q);
        a[i] = s;

    return a;

The key to this function is generating primes 'p' and 'q' that meet the requirements outlined earlier in the post. The primes can be static, with the seed pseudorandomly generated, or the primes can also be pseudorandomly genenerated. I went with the latter. So, the only trick to getting these values, is testing for primality.

The standard way to test for primality is through trial division. Unfortunately, it's slow. Sure, there are a few optimization techniques that you can do to speed things up, but by and large, it's not an efficient way of attacking the problem. Instead, I applied the Miller-Rabin test for composites. Miller-Rabin uses "witnesses" that "testify" about the compositeness of a positive integer. The more witness there are to testify that your integer is composite, the more likely it is. Thus, we would say that we have a "strong composite" integer. However, if there are zero witnesses for a positive integer we can say that the integer "may or may not be prime". However, after enough witnesses, we can actually say with deterministic fact if a number is prime.

I'll stop going over the algorithm there, and let you read up on the Wikipedia article about it. Suffice it to say, the Miller-Rabin primality test runs in O(k*log(n)^3), where 'k' is the accuracy that is sought. This is good news for Blum Blum Shub, as it's slow enough already. Code below:

function modProd(a,b,n){
    if(b==0) return 0;
    if(b==1) return a%n;
    return (modProd(a,(b-b%10)/10,n)*10+(b%10)*a)%n;

function modPow(a,b,n){
    if(b==0) return 1;
    if(b==1) return a%n;
        var c=modPow(a,b/2,n);
        return modProd(c,c,n);
    return modProd(a,modPow(a,b-1,n),n);

function isPrime(n){
    // Miller-Rabin primality test taken from
    // O(k*log(n)^3) worst case, given k-accuracy
    if(n==2||n==3||n==5) return true;
    if(n%2==0||n%3==0||n%5==0) return false;
    if(n<25) return true;
    for(var a=[2,3,5,7,11,13,17,19],b=n-1,d,t,i,x;b%2==0;b/=2);
    for(i=0;i<a.length;i++) {
        if(x==1||x==n-1) continue;
              x=modProd(x,x,n); if(x==n-1) t=false;
        if(t) return false;
    return true;

function random_prime(n) {
    while(!isPrime(n)) n -= 2;
    return n;

The primes 'p' and 'q' must then go through some rigid tests to ensure that the algorithm has a long cycle, and the seed doesn't fall into a trap, where it falls to 0, and the algorithm cannot recover. Part of these tests is using Euler's totient function. The totient function does nothing more than just count what are called "totatives". A "totative" is a number that is coprime to a given number 'n'. In other words, the totative and 'n' do not share any common divisors other than '1'. Every positive integer from 1 through 'n' must be tested. Unfortunately, this is slow. However, there is a proof using a product formula, which shows that we can arrive at the same result in O(sqrt(n)/3) time. Again, because Blum Blum Shub is slow, we need to keep our processor time down.

function totient(n) {
    // compute Euler's totient function
    // O(sqrt(n)/3) worst case
    // Taken from:
    if(n < 2) return n;
    var phi = n;
    if (n % 2 == 0) {
        phi /= 2;
        n /= 2;
        while(n % 2 == 0) n /= 2;
    if (n % 3 == 0) {
        phi -= phi/3;
        n /= 3;
        while(n % 3 == 0) n /= 3;
    for(p = 5; p * p <= n;) {
        if(n % p == 0) {
            phi -= phi/p;
            n /= p;
            while(n % p == 0) n /= p;
        p += 2;
        if(p * p > n) break;
        if(n % p == 0) {
            phi -= phi/p;
            n /= p;
            while(n % p == 0) n /= p;
    if(n > 1) phi -= phi/n;
    return phi;

Putting everything together, the Blum Blum Shub algorithm on my workstation can produce approximately 1 million random numbers per second for a 53-bit space. We only need 22 numbers, so that should be sufficient, even for weaker devices. I was able to test the code successfully on a number of browsers that do not support the Web Crypto API, and the key generation is near instantaneous, even on my phone. Interacting with the console, here is the output of one call to the algorithm:

> bbs(22);
[3143943020, 475278844, 386457630, 124718623, 280175014, 2600881459,
127152064, 749398749, 2269393658, 692609408, 1218408987, 1523732228,
1265360812, 1641372390, 2500929554, 2223592103, 2462017186, 310616491,
426752821, 2973180471, 2248877527, 574751875]

Those numbers are cryptographically strong, because they are the result of the product of two primes, which means using the sequence to get back to the primes is at least as difficult as the factoring problem. It also turns out that it may also be at least as difficult as computing modular square roots, which is at least as difficult as the factoring problem.

So, even though most users visiting a site hosting d-note will likely be using the Web Crypto API, there may be some other browsers that are not. For them, should they choose to have the browser generate the passphrase for encrypting their note server-side, they can rest assured that the result is cryptographically strong.