**Introduction**

For those who follow my blog know I have blogged about password security in the past. No matter how you spin it, no matter how you argue it, no matter what your opinions are on password security. If you don't think entropy matters, think again. Entropy is everything. Now, I've blogged about entropy from information theory in one way or another. Just search my blog for the word "entropy". It should keep you occupied for a while.

I'm not going to cover entropy here. Instead, I wish to put my focus on attacking the passwords, rather than building them. Entropy is certainly important, but let's discuss it from the perspective of actually using it against the owner of that password. For simplicity, we'll assume best-case scenarios with entropy. That is, we'll assume that even though the human has influence to weaken the entropy of their password, we'll assume it hasn't been weakened. Let's first start with the brute force attack, which is the most simplistic, although most inefficient attack on passwords.

**Brute Force**

Brute force is a dumb search. Literally. It means we know absolutely nothing about the search space. We don't know the character restrictions. We don't know length restrictions. We don't know anything about the password. All we have is a file of hashed passwords. We have enough data that if I use a utility like John the Ripper or Hashcat, I can start incrementing through a search space. My search may look something like this:

a, b, c, ..., x, y, z, aa, ab, ac, ..., ax, ay, az, ba, bb, bc, ..., bx, by, bz, ...

This is only encompassing lowercase letters in the English alphabet, but it illustrates the point. I start with a common denominator, and increment its value by one until I've exhausted the search space for that single character. Then I concatenate an additional character, and increment until all possibilities in the two-character space are exhausted. Then I move to the three-character space, and so forth, as fast as my hardware will allow.

On the United States English keyboard, there are 94 printable case sensitive characters. Because this is a brute force search, we can make very little assumptions about our search space. We can assume a lot of non-printable characters will not be valid, such as the BELL, the TAB and the ENTER. In fact, looking at the ASCII table, we can safely assume that ASCII decimal values 0 through 31 won't be used. However, it is safe to assume that decimal values 32 through 126 will be used, where decimal value 32 is the SPACE. This is 95 total characters for my search space.

**Hashing Algorithms**

Before we get into speed, we must also address hashing. Many password databases will hash their passwords, and we're pretending that I have access to one of these databases offline. Most of these databases will also store the salt, if the password is salted, and the algorithm of the hashing type, if it supports more than one hash. So, let's assume that I know the hashing algorithm. This is important, because it will play into how fast I can brute force these passwords.

A designer of a password database store should keep the following things in mind:

- Slower algorithms, such as Blowfish and MD5, will hinder brute force attacks. However, it can also be slow for busy server farms.
- Algorithms such as SHA256 and SHA512 should be used when possible due to their lack of practical crypanalysist attacks.
- Rotating the password thousands of times before storing the resulting hash on disk makes brute force attacks very, very ineffective.

What do I mean by "rotating passwords"? Let's assume the password is the text "pthree.org". Let's further assume that the password is being hashed with SHA1. The SHA1 of "pthree.org" is "bea133441136e7a7c992aa2e961002c170e54c93". Now, let's take that hash, and use it as the input to SHA1 again. The SHA1 of "bea133441136e7a7c992aa2e961002c170e54c93" is "cec33ee7686fc2a91b167e096b1f3182ae7828b7". Now let's take that hash, and use it as the input to SHA1 again, continuing until we have called the SHA1 function 1,000 times. The resulting hash should be "4f019154eaefe82946d833f1b05aa1968f1f202a".

A basic Unix shell script could look like this:

1 2 3 4 5 6 | #!/bin/sh PASSWORD="pthree.org" for i in `seq 1000`; do PASSWORD=`echo -n $PASSWORD | sha1sum - | awk '{print $1}'` done echo $PASSWORD |

MD5 is a slow(er) algorithm, but it is also horribly broken, and should be avoided as a result. SHA1 is faster than MD5, but still slower than others. Practical cryptanalysis is approaching, and should probably be avoided. However, Blowfish is exceptionally slow, even slower than MD5, and has not shown any practical attack against the algorithm. So, it makes for a perfect candidate for storing passwords. If the password is hashed and rotated 1,000 times before storing on disk, this cripples brute force attacks to an absolute crawl. Other candidates, such as SHA256, SHA512 and the newly NIST approved SHA3 algorithms are much faster, so their passwords should definitely be rotated 1,000 or even 10,000 times before storing to disk to slow the brute force search.

**Raw Speed**

Now the question remains- how fast can I search through passwords using a brute force method? Well, this will depend largely on hardware, and as just discussed, the chosen algorithm. Let us suppose that I have a rack of 25 of the latest AMD Radeon graphics cards. According to Ars Technica, I can achieve a bind blowing speed of 350 BILLION passwords per second if attacking the NTLM hash used to store Windows passwords. If every password in the database was 8 characters in length, then my search space is:

95 * 95 * 95 * 95 * 95 * 95 * 95 * 95 = 95^8 = 6,634,204,312,890,625 passwords

Six-and-a-half QUADRILLION passwords. That should give me a search space large enough to not worry about, right? Wrong. At a crazy pace of 350 billion passwords per second:

6634204312890625 / 350000000000 = 18955 seconds = 316 minutes = 5.3 hours

In just under 6 hours, I can exhaust the 8-character search space using this crazy machine. But what about 9 characters? What about 10 characters? What about 15 characters? These are a legitimate question, and why entropy is EVERYTHING with regards to your password and the search space to which it belongs. Think of your password as a needle, and entropy as the haystack. The larger your entropy is (total number of possible passwords), the harder it will be to find that needle in the haystack.

**The Exponential Wall**

Let's do some math, and explain what is going on. I'll give you a table of values, showing passwords that start with 8 characters through 15 characters in the first column. In the second column will be the search space, and in the 3rd column will be the time it takes to fully exhaust that search space. Before continuing, remember that it may not be necessary to exhaust the search space. Whenever I find the password, I should stop. There is no need to continue. So, I may find that password very early in the search, and I may find it very late in the search, or anywhere between. So, the 3rd column is merely a maximum time.

Length | Search Space | Max at 350 bpps |
---|---|---|

8 | 6634204312890625 | 5.3 hours |

9 | 630249409724609375 | 20.8 days |

10 | 59873693923837890625 | 5.4 years |

11 | 5688000922764599609375 | 5.1 centuries |

12 | 540360087662636962890625 | 48.9 millenia |

13 | 51334208327950511474609375 | 4,650.1 millenia |

14 | 4876749791155298590087890625 | 441,830.6 millenia |

15 | 463291230159753366058349609375 | 41,973,910.1 millenia |

What's going on here? By nearly doubling the length of my password, my search space went from 5 hours to 41 million millenia. We've hit what's called "the exponential wall". What is happening is we are traveling the road of 95^x, where 95 is our character set, and "x" is our length. To put this visually, it looks exactly like this;

As you can see, it gets exceptionally expensive to continue searching past 9 characters, and absolutely prohibitive to search past 10, let alone 15. We've hit the exponential wall. Now, I've heard the argument from all of my conspiracy theorist friends that faceless nameless government agencies with endlessly deep pockets (nevermind most countries are practically bankrupt), who have access to quantum supercomputers (nevermind quantum computing has yet to be implemented) can squash these numbers like the gnat you are. Let's put that to the test.

Just recently, the NSA built a super spy datacenter in Utah. Let's assume that the NSA has one of these spy datacenters in each of the 50 states in the USA. Further, let's assume that they have not only 1 of these GPU clusters, but there are 30 of them in each datacenter. That means we have 750 GPUs in one datacenter, which means they have 37,500 total across the country. That's 13 quadrillion NTLM hashes per second (assuming the communication between all 37,500 GPUs across all 50 states does not add any latencies)! Surely we can exhaust the 15 length passwords with ease!

I'll do the math for you. It would only take 5 days to exhaust the 11 character length passwords, 1.3 years for 12 character passwords, 1.2 centuries for 13 character passwords, 11.8 millenia for 14 character passwords and 1,119.3 millenia for 15 character passwords. Again, we've hit that exponential wall, where 11 characters is doable (although exceptionally expensive), 12 characters is challenging, and 13 characters is impractical. All this horsepower only bought us 2 extra characters in the brute force search using only 25 GPUs. For an initial pricetag that is in the billions, if not trillions for this sort of setup (not to mention the power bill to keep it running as well as the cooling bill), it just isn't practical for a nameless faceless government agency. The math just isn't on their side. It just doesn't pencil in financially, nor does it pencil in theoretically. Such much for that conspiracy.

**Conclusion**

As is clearly demonstrated, truly random 12-character passwords would be sufficient to thwart even the most wealthy and dedicated attackers. After all, you hit that exponential wall. Surely, all bets are off! Not so. There are plenty of additional attacks that the attacker can use, which are much more effective than the brute force, and which makes finding a 12-character password more feasible. So, we'll be discovering each of those in this series, how effective they are, how they work, and software utilities that take advantage of them.

Stay tuned!

## { 13 } Comments

Thanks Aaron, for the first time, it's easy to see why longer passwords make the most sense.

A few questions came to mind while reading this.

I don't think I've heard anyone describe MD5 as slow in the general sense.

A ran a simple PHP script to confirm this, I was able to do roughly 1,000,000 md5 runs in 0.5 seconds on my Macbook Air. Using Blowfish with a cost of 04 (the lowest value supported) I was only able to do roughly 450 runs in 0.5 seconds.

Raising the cost to 08, which is the default for things like phpass and WordPress slowed things down dramatically. I was only able to do roughly 30 runs in 0.5 seconds.

Is there a difference between what you are describing and key stretching?

The numbers you described were specifically for NTLM. It would be interesting to see how that compares to other algorithms.

Good point about MD5. I meant it in comparison to SHA1, SHA2 and the newly born SHA3. I have been interested in doing a hashing algorithm benchmark post, but the more I dive into it, the more involved it becomes. There are so many factors on both hardware and software, interpreted vs compiled vs assembly, configuration options, and output size, it's a beast to tackle. If I'm going to post it, I want to be right, and I just don't have the confidence right now to do that.

Regarding key stretching, that is precisely what it is. I avoided using the term, because I wanted people to understand exactly what was happening. For the newcomer, "key stretching" seems like you're physically lengthening the password, which of course you aren't. Rotating the password, however, makes a bit more intuitive sense. However, there are some subtle differences- key stretching can include concatenating the result with the original password at each SHA1. It could also involve concatenating a salt with that operation as well. Here, with the term "rotating", I'm specifically referring to using the previous output as the new input.

Just checking whether pthree.org recognizes IE10 on Windows Phone 8.

Why?

Forum thread on 1Password is some what related to the topic at hand - http://hashcat.net/forum/thread-2238.html

An SHA1 sum hashed 1000 times is the same SHA1 sum?

I'm not sure I'm following. If I think I understand what you're asking, a SHA1 output is a 160 bit string. So, the initial password may be less than this, but when hashed with SHA1, it will be that exact length. SHA1 certainly operates faster on smaller inputs than larger inputs, but we're talking maybe a couple dozen bits at most. The difference in execution speed won't be noticeable. Now, if you fed it a 10 GB password, the initial hash would be very slow, but after that, the rest of the execution would be operating on 160-bit strings, and would finish quickly.

Good writeup.

I usually use two or three common words linked with dots or asterisk as passwords. Those simple words combined create very long strings.

Your explanation here gives me one more reason to continue using them.

Hi,

Aaron, I think I find a ,mistake (propably during copy&paste) ;

"The SHA1 of “pthree.org” is “4f019154eaefe82946d833f1b05aa1968f1f202a”"

It actually is bea133441136e7a7c992aa2e961002c170e54c93 .

This is the same value as you stated after running sha 1000 times, that's why I decided to check it. 🙂

Fixed! Thanks!

A strong encryption depends on:

1. Large keyspace (keysize).

e.g. 128-bit or 256-bit.

2. The encryption should be able to actually provide such keyspace, shortcuts are not allowed.

e.g. AES (best cryptanalysis reduced 128-bit AES to 126-bit but still good enough), or TwoFish

3. The chosen encryption key should have as much entropy as the keysize.

e.g. 123456 for 256-bit AES is almost 0-bit encryption, not a 256-bit encryption at all.

However, in the real world, a passphrase instead of a random key is often used (e.g. full disk encryption), because humans are not able to memorize long random bytes.

So, in my understanding, no matter what type of key-derivation algorithm, or hash algorithm you use, to slowdown the brute-force attack, even if you can make it impossible, a 128-bit full disk encryption, protected by a passphrase with 70-bit of entropy, is ultimately still 70-bit encryption, am I correct?

Aaron, you use the term "search space" to describe the number calculated by:

95 * 95 * 95 * 95 * 95 * 95 * 95 * 95 = 95^8 = 6,634,204,312,890,625 passwords

I understand that to be "keyspace," the set of all possible permutations at a given length: 95^8. Whereas "search space" would be the total number of all possible permutations up to and including the given length: 95^1 + 95^2 + 95^3 + 95^4 + 95^5 + 95^6 + 95^7 + 95^8.

Please forgive my ignorance, and correct my understanding if I'm wrong.

## Post a Comment