I just finished reading an article on Ars Technica titled "Ask Ars: Where should I store my passwords?". There was a specific paragraph that I took issue with, which in turn prompted me to write this post. It is:

"Still, it would take thousands of years to crack an 8-character password when checking both small and capital letters, spaces, and numbers. That's on a low-power computer, but the time it takes to crack a string of characters goes up exponentially the more characters you use. So again, use a long password and you can foil even the Watsons of today for long enough that you would probably decide on a whim to change your password before the password is solved."

I guess I should expect more out of them, but I was disappointed, and I want to use this blog post to explain why. If you wish to write an article about password security, regardless of the angle from which you approach it, and you don't mention entropy to your readers, you are doing them a GREAT disservice. Let me rephrase:

__If you don't mention entropy in your article about passwords, you did it wrong.__

If you've taken physics in secondary or undergraduate school, chances are good that you've heard about entropy. If not, you'll learn about it now. Entropy in computer science is very similar to entropy in physics. To put it straight, entropy is defined as the total combination of states that a system can be in. For example, if you have 3 cups, and 4 ping pong balls, how many ways can you arrange all 4 ping pong balls in the 3 cups? Assuming that 4 ping pong balls will fit in 1 cup, and order is not important, you can arrange the ping pong balls 15 different ways: {4,0,0},{3,1,0},{3,0,1},{2,2,0},{2,0,2},{2,1,1},{1,1,2},{1,2,1},{1,3,0},{1,0,3},{0,4,0},{0,0,4},{0,3,1},{0,1,3},{0,2,2}. So, this system has an entropy of 15. Entropy in physics is useful to show the Ideal Gas Law, among other things, showing the possible number of states various gases can be in, and it's useful for explaining why some systems behave one way, but not in the reverse (such as heat "flowing" from hot to cold, and not the reverse).
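As a quick sanity check on the cup-and-ball count, the arrangements can be enumerated by brute force. This is only an illustrative sketch; the function name `arrangements` is made up for this example.

```python
from itertools import product

def arrangements(balls, cups):
    """Return every way to distribute identical balls among distinct cups."""
    # Try every per-cup count from 0..balls and keep those that use all the balls.
    return [combo for combo in product(range(balls + 1), repeat=cups)
            if sum(combo) == balls]

states = arrangements(balls=4, cups=3)
for state in states:
    print(state)
print(len(states), "possible states")
```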

It turns out that entropy has a great deal of use in computer science. For example, on most Unix-like operating systems, there is a /dev/random and a /dev/urandom device. These devices are useful for extracting random bits to build encryption keys, one-time session keys, seeds for probability outcomes, etc. These devices hold entropy. In /dev/random, for example, environmental "noise" is gathered from the user, such as mouse movements, disk usage, etc. and thrown into an entropy pool. This pool is then hashed with the SHA1 hashing algorithm to provide a total of 160 bits of entropy. Thus, when generating a 160-bit key, the data in /dev/random can be used. If more bits are needed, entropy is gathered by hashing the already hashed bits, as well as gathering additional noise from the environment, and appending to the output until the number of bits is satisfied. The point is, /dev/random and /dev/urandom are sources of entropy.
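In practice, programs rarely read these devices by hand. As a small sketch, Python's standard library exposes the operating system's random source through `os.urandom` and the `secrets` module. The 20-byte size below is chosen only to match the 160-bit example above, and the variable names are my own.

```python
import os
import secrets

# 20 bytes = 160 bits drawn from the operating system's random source
# (on Unix-like systems, the same kernel source that backs /dev/urandom).
key = os.urandom(20)
print(key.hex())

# The secrets module wraps the same source for security-sensitive uses.
session_token = secrets.token_hex(16)  # 16 random bytes, shown as 32 hex digits
print(session_token)
```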

So, what does this have to do with passwords? Your password has a certain amount of entropy. This means that it belongs to a pool of passwords that have the same amount of entropy. The question is, though: "how do you calculate the amount of entropy in a password?" Thankfully, we don't have to think too terribly hard about this one. If you've taken college algebra, the math is pretty straightforward. Entropy in information comes from a branch of probability called "information theory". Any message contains some amount of entropy, and we can measure that entropy in binary bits. The formula for calculating this entropy is:

H = L * log_2(N)

H is the size of the message measured in binary bits. L is the length of the message, in our case the length of your password. log_2() is the log function, base 2, and N is the number of possible symbols in the password (lowercase letters provide 26 possible characters, uppercase letters provide an additional 26, the digits provide 10, and punctuation provides 32 on a United States English keyboard). I rewrote the equation so you can compute it on any calculator:

H = L * log(N) / log(2)

Having this formula makes calculating the entropy of passwords straightforward. Here are some examples:

- **password**: 38 bits (8 * log_2(26))
- **RedSox**: 34 bits (6 * log_2(52))
- **B1gbRother|$alw4ysriGHt!?**: 164 bits (25 * log_2(94))
- **deer2010**: 41 bits (8 * log_2(36))
- **l33th4x0r**: 46 bits (9 * log_2(36))
- **!Aaron08071999Keri|**: 131 bits (20 * log_2(94))
- **PassWord**: 46 bits (8 * log_2(52))
- **4pRte!aii@3**: 78 bits (12 * log_2(94))
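The formula is easy to turn into a few lines of Python. The sketch below guesses N from which character classes appear in the password (26 lowercase, 26 uppercase, 10 digits, 32 punctuation), which is my assumption about how the pool is chosen. The helper names `pool_size` and `entropy_bits` are made up for illustration, and the result is only meaningful if every character was picked at random.

```python
import math
import string

def pool_size(password):
    """Count the symbols available, based on which character classes appear."""
    size = 0
    if any(c in string.ascii_lowercase for c in password):
        size += 26
    if any(c in string.ascii_uppercase for c in password):
        size += 26
    if any(c in string.digits for c in password):
        size += 10
    if any(c in string.punctuation for c in password):
        size += 32
    return size

def entropy_bits(password):
    """H = L * log2(N); returns 0.0 when no known character class is present."""
    symbols = pool_size(password)
    if symbols == 0:
        return 0.0
    return len(password) * math.log2(symbols)

for candidate in ["password", "RedSox", "deer2010", "PassWord"]:
    print(f"{candidate}: {entropy_bits(candidate):.1f} bits")
```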

__Question__: what gives you more entropy per bit- length or possible characters? If you passed college algebra, you would know that the answer is length, not total possible characters (if you need to think about this, graph the log function on your calculator, then graph a multiple of it). Of course, you shouldn't ignore using lowercase, uppercase, numbers and punctuation in your password, but they won't buy you as much entropy as length will. Thus, I prefer the term "passphrase" over "password", as it implies this concept to the user.
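To make the comparison concrete, here is a rough sketch that starts from an 8-character lowercase password and then either widens the character pool or adds four more characters. The specific lengths and pool sizes are just examples I picked.

```python
import math

def bits(length, symbols):
    """H = L * log2(N)."""
    return length * math.log2(symbols)

# Start from 8 random lowercase letters, then change one variable at a time.
print(f"8 chars, 26 symbols:  {bits(8, 26):.1f} bits")
print(f"8 chars, 94 symbols:  {bits(8, 94):.1f} bits")   # same length, bigger pool
print(f"12 chars, 26 symbols: {bits(12, 26):.1f} bits")  # same pool, 4 more characters
```

Under these assumptions, four extra lowercase characters buy more bits than switching the whole password to the full 94-symbol set.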

Here's a table showing the length your password must be, given the possible character combinations in your password, if you want a certain entropy. Say you want an entropy of 64 bits using only numbers: the password would need to be 20 characters long. If you wanted an entropy of 80 bits using characters from the entire ASCII set, it would only need to be 13 characters long.

Entropy (H, bits) | Numbers (10) | Alphabet (52) | Alphanumeric (62) | All ASCII characters (94) |
---|---|---|---|---|
32 | 10 | 6 | 6 | 5 |
40 | 13 | 8 | 7 | 7 |
64 | 20 | 12 | 11 | 10 |
80 | 25 | 15 | 14 | 13 |
96 | 29 | 17 | 17 | 15 |
128 | 39 | 23 | 22 | 20 |
160 | 49 | 29 | 27 | 25 |
192 | 58 | 34 | 33 | 30 |
224 | 68 | 40 | 38 | 35 |
256 | 78 | 45 | 43 | 40 |
384 | 116 | 68 | 65 | 59 |
512 | 155 | 90 | 86 | 79 |
1024 | 309 | 180 | 172 | 157 |
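The table values follow from solving the formula for L and rounding up. As a sketch, assuming the four columns correspond to pools of 10, 52, 62, and 94 symbols (my reading of the column names, which does reproduce the numbers above), the required length can be computed like this:

```python
import math

# Assumed symbol counts for each column of the table.
POOLS = {"numbers": 10, "alphabet": 52, "alphanumeric": 62, "all ASCII": 94}

def required_length(target_bits, symbols):
    """Smallest L such that L * log2(symbols) >= target_bits."""
    return math.ceil(target_bits / math.log2(symbols))

for target in (64, 72, 80, 128):
    row = {name: required_length(target, n) for name, n in POOLS.items()}
    print(target, row)
```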

So, how much entropy should you have in your password? What is considered "strong"? Well, let us look at Distributed.net. They are working on two projects: Optimal Golomb Rulers and cracking an RSA 72-bit message. Let's look at the RSA project. In January 1997, RSA Laboratories issued a secret key challenge. They generated random keys ranging from 40 bits to 128 bits. They provided the ciphertext for each message, along with a $1,000 prize for the person who finds the secret key that generated it. In order to know whether or not you found the key, they gave you the first two words of the message.

Currently, as already mentioned, Distributed.net is working on the 72-bit key from that challenge. If you check the stats page, you can see that it has been running for 3,017 days as of the writing of this post, and at the current pace, it would take roughly 200,000 days to search the entire key space. Now, of course it is probable that they will find the key before exhausting the entire space, but when that will be is anyone's guess. Needless to say, 200,000 days, or about 550 years at the current pace, is a very long time. If they kept that pace up for the 80-bit key, it would take them roughly 140,000 years to search the entire space. However, it only took them 1,726 days, or about 4.7 years, to find the 64-bit key, and only 193 days, or about 6 months, to find the 56-bit key.
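The jump from the 72-bit estimate to the 80-bit one is simple doubling: each extra bit of key doubles the search space. A minimal sketch, assuming the search rate stays constant and taking the 200,000-day figure above as the reference point (the function and constant names are mine):

```python
DAYS_FOR_72_BITS = 200_000   # rough figure quoted above for the full 72-bit space
DAYS_PER_YEAR = 365.25

def days_to_search(bits, reference_bits=72, reference_days=DAYS_FOR_72_BITS):
    """Scale the search time at a constant rate: each extra bit doubles the work."""
    return reference_days * 2 ** (bits - reference_bits)

for bits in (72, 80):
    years = days_to_search(bits) / DAYS_PER_YEAR
    print(f"{bits}-bit space: about {years:,.0f} years")
```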

So, I think that should give you a good rule of thumb to go by. 72 bits of entropy for your password seems strong enough for the short term, but it probably wouldn't hurt to increase your passwords to 80 bits of entropy for the long term. Of course, I don't think I need to mention not using your names, birthdates, or other silliness in your password. That's been beaten to death plenty online. Search Google for generating strong passwords, and you'll find plenty of articles (none of which mention entropy either, I would be willing to bet).

Now, to be fair, the Ars Technica article mostly mentioned STORING your passwords, not how to create strong ones. I've already written about this before, and I think it's the perfect solution. http://passwordcard.org is the exact solution for storing strong passwords that have a good amount of entropy in them. The great thing about it too, is it does not require any software once generated. It stays with you in your wallet, so if you visit a computer that doesn't have your encrypted database, or you are not allowed to install software on the machine so you can restore your password database, the password card is the perfect fit. It's entirely platform-independent. Just pull it out of your wallet or purse, type in your password, and move on with your life. All you need to remember on the card is three things:

- The starting column and row of your password.
- The length of the password.
- The path your password takes on the card.
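To illustrate how those three pieces of information select a password, here is a toy sketch. It does NOT reproduce how passwordcard.org actually generates or lays out its cards: the 8 by 29 grid size, the alphanumeric alphabet, the wrap-around at the edges, and the function names are all assumptions made for this example.

```python
import secrets
import string

# Hypothetical card layout: a grid of random letters and digits.
ROWS, COLS = 8, 29
ALPHABET = string.ascii_letters + string.digits

def make_card(rows=ROWS, cols=COLS):
    """Build a grid of characters drawn from the OS random source."""
    return [[secrets.choice(ALPHABET) for _ in range(cols)] for _ in range(rows)]

def read_password(card, row, col, length, step=(0, 1)):
    """Read `length` characters from (row, col), moving by `step`, wrapping at edges."""
    d_row, d_col = step
    chars = []
    for i in range(length):
        r = (row + i * d_row) % len(card)
        c = (col + i * d_col) % len(card[0])
        chars.append(card[r][c])
    return "".join(chars)

card = make_card()
print(read_password(card, row=2, col=5, length=12))               # left to right
print(read_password(card, row=2, col=5, length=12, step=(1, 1)))  # diagonal path
```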

I've been using it since it was first "released", and I use it for all my passwords on all my accounts, web-based, key-based or account-based. Every password is unique, they all contain more than 100 bits of entropy, and every password follows the "best rules" for creating strong passwords. I've typed many of them enough to have them memorized; I rarely pull out my card (there is also an Android and Apple application if you want).

So, next time you read an article about password strength, do a quick search for the word "entropy". If it's not mentioned, take the article with a grain of salt, or at least point its author to this post and suggest that they discuss entropy with their readers. Entropy is your friend.

## { 26 } Comments

Very very nice article. Do you know a way to calculate lack of entropy when people use easily guessable terms? I know people who use 17 character passwords but they end up being something like BobGeorgeSmith123 where you can just take the words, capitalize them, and then add a number to the end. So it looks good, but if you knew the person's father was Bob George Smith, its entropy is a lot smaller (though not 1). Or is this not really easy to calculate?

The entropy has nothing to do with the simplicity of the password, only the length and the characters that it uses. If a password is easy to guess, that just means that there is a good starting point in the search space. But, in your example, the search space would still be 101-bits.

I thought that passwords (particularly in *nix scenarios) were hashed in such a way as to give them all (on any given system) the same entropy...

But now I come to think about it, that just mitigates against decryption of the stored password, not the time it would take to brute force guess your password. But if we are talking about the time to brute force guess your password, then we must take into account how long it takes to process all the possible passwords (i.e. how long the computer/system being attacked takes, and what delay is introduced to deliberately stop or slow down brute force attacks).

Basically, so long as your password is not easily guessable, I don't see that the entropy of the password matters that much; it's the entropy of the encrypted stored form that matters, and for the most part that is outside the hands of the person using the password.

Hashing your password doesn't add entropy to the password. If your password is 80-bits worth of entropy, and you hash it with SHA1, it's still only worth 80-bits of entropy. If you're talking about breaking the salted hash in /etc/shadow directly, then you're breaking a message with 160-bits of entropy. BUT, you're attacking the hash, not the password. This is why password cracking utilities, such as John the Ripper, use dictionaries to get to the password. It's much easier to search a total space of 80-bits than 160. Take a password, hash it, and compare it to the stored hash. You're attacking the password, not its hash.
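A stripped-down sketch of that "hash it and compare" loop is below. It uses a plain salted SHA-1 from Python's `hashlib` purely for illustration; real /etc/shadow entries use dedicated crypt schemes, and the salt, word list, and function names here are invented for the example.

```python
import hashlib

def sha1_hex(password, salt=""):
    """Return the hex SHA-1 digest of salt + password."""
    return hashlib.sha1((salt + password).encode("utf-8")).hexdigest()

def dictionary_attack(stored_hash, candidates, salt=""):
    """Hash each candidate and compare; return the match, or None."""
    for guess in candidates:
        if sha1_hex(guess, salt) == stored_hash:
            return guess
    return None

# Hypothetical stolen record: a salt and the salted SHA-1 of the password.
salt = "x9"
stored = sha1_hex("l33th4x0r", salt)

wordlist = ["password", "letmein", "RedSox", "l33th4x0r", "deer2010"]
print(dictionary_attack(stored, wordlist, salt))
```

The attacker never has to search the 160-bit space of the hash output, only the much smaller space of plausible passwords.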

Also, I made very clear in the post how long it takes to break 72-bits worth of entropy, as well as 64 and 56 given a sufficient attack. Entropy matters a great deal. There are more algorithms than brute force for getting at a password, you can count on that. So, take the post for what you will, but if your password doesn't have a sufficient amount of entropy, don't come crying to me when your account gets compromised.

yes, I meant salting.

I don't get it.

You can attack via 2 vectors, either get all (or a substantial amount) of the stored passwords and attempt to decrypt them, or the 2nd vector, brute force a single password.

With the first vector (decrypting a whole host of passwords), the entropy of the individual passwords is immaterial, since decryption is set by the entropy of the salting.

With the second vector (brute force), the entropy is nowhere near as important as being hard to guess. The reason is that most systems introduce, at the very least, delays into a brute force attack, so much so that the artificial delays far outweigh the time taken to guess the next possible password.

So, in the real world, other factors seem to be far more important than the entropy of the password. The only time I see the entropy of the password being the important factor is when the system you use the password on is extremely poor with its security anyway.

My password system until today:

- commit everything to memory;

- low entropy;

- a lot of different passwords, but I still re-used some of the passwords;

- I sometimes mistakenly try to log into website X using password Y... if X keeps logs of passwords used during failed login attempts, they have a wealth of information that can be used to log into Y etc;

- almost never change passwords;

- plain text list of passwords stored on my computer;

- etc...

Reading about the recent HBGary scandal, today's Ars piece, your post here, and your earlier post on the password cards, and a lot of comments, I just completely changed how I do passwords:

- KeePass;

- high entropy (every possible character except for spaces and up to 12 characters long).

I still need to work on a better master password, but at least I now have the infrastructure in place.

In a comment to your earlier post, you wrote:

"The problem with KeePass is its lack of perfect portability. While it can install on any of the major operating systems, it requires that you have access to the software. Inevitably, you are going to access your parent’s computer, the computer at the library, school, or some other public establishment, and you might not have permission to install KeePass to get access to your DB (provided you can). If you NEVER touch another computer, this issue is moot. If you do, it can be a headache, if your passwords are actually strong (read: at least 72bits of entropy)."

I don't think portability is much of a problem if I store my database on dropbox, and if I always have my KeePassDroid enabled phone with me... I often forget to take my wallet (which, I admit, is stupid), but I never forget to take my phone!

(I will have to read your post again to understand all of the math... Yes, IAAL.)

wow, this is seriously interesting. Thanks Aaron for the info

Fantastic article. Thank you for writing it.

@Jason- Now I'm not sure if we're both arguing the same point, or not. So, let's start over.

If I don't have access to the shadow file, which means I don't have access to the hash (and the salt, if any), then I have a few options to get at the password: brute force, SQL injection, buffer overflows, keyboard loggers, and other means of attacking a server to reveal the information I'm after. However, because we're discussing the strength of a password based on the entropy, let's look just at brute forcing. I have at least three clear methods of attack:

1. I can increment through the letters, numbers and punctuation, starting with the shortest word and working my way to the longest.

2. I can use a dictionary attack.

3. I can exhaust entropy bit spaces.

Let's look at option 2. This seems to be an appropriate means of attack, as passwords are generally based on some sort of dictionary word, whether they just append numbers at the end (or prepend at the beginning, or both), they use "leet speak", or they just plain don't care that much. At any event, exhausting massive dictionaries seems like a good use of time.

But what happens when all of our dictionaries are exhausted, and we still haven't gotten to the password? We could take the route of option 1, and increment until the cows come home, or use option 3, and start exhausting entropies.

I think it would be clear that option 1 would take far too long. Maybe I'll luck out, and the password is 'aaaa', but probably not. It could be something like "l33th4x0r", which has 46 bits of entropy. Of course, on a single-purpose computer, it's reasonable for me to exhaust everything up through 40 or maybe even 48 bits of entropy. Because that password is in that search space, I can get at it quickly. But, what if the password had 80 bits of entropy? Even with a distributed attack on the password, it would be infeasible to find the password in such a large search space. We haven't even talked about hashing the password. We are just brute forcing the account, hoping to find a match.

So, let's look at the hash then. It could be hashed with something weak like MD5 or it could be a salted and iterated hash with SHA512. Either way, if I have access to the hash, I likely also have access to the salt, so I've just reduced my search to the length of the hash itself. BUT! I am NOT breaking SHA1 or whatever hashed the password! I'm breaking the password itself. I'm not interested in the entropy SHA1 provides to any message. I'm only interested in the message that produced that hash.

Again, I'm going to start using a dictionary attack or exhaust entropies until I find a word that hashes to the same value as what I have found. Of course, if I have the hash, I don't need brute force. I could use Rainbow Tables, birthday attacks, and a number of other methods at getting at the password. Brute force should be my last option after all previous, and more effective attacks, have failed.

Regardless of the attack method used, the more entropy you have in the password, the longer it is going to take to break. Point blank. Whether it's iteration, dictionary attacks or exhausting entropy. The more entropy you have in your password, the stronger your password is. There's just no way around it.

@Blimundus- I still contend that software-based password storage solutions are not optimal. In your case, you have your DB on your phone. What happens if you lose your phone, only to find that you don't have a backup? What happens when the developer of the software stops pushing updates? What happens when you are at your friend's computer, and your phone battery is dead?

Too many "what ifs" prevent me from using anything that relies on power, synchronization or other forms of computing technology. The card is perfect, because it's stored in plain text right on the card, yet if an attacker gets your card, he has to know where the password starts, its length, and the path it takes. The combination of these variables produces an infinite number of combinations, making it infeasible to attack your account.

And, if you lose your card, hopefully you wrote down the number that generated your card, so you can go back to the page, type in your number, and reprint your card. To me, it's a no-brainer. No Dropbox. No phone. No synchronization mess. Just me and my wallet, which is probably the most tracked possession I own, outside of my kids.

Don't take me wrong. I'm glad KeePass and Dropbox work for you. I personally just want to be as platform-independent as possible, and the password card allows me to do this.

Further, I'm glad you evaluated your current password situation, and decided to do something about it. Too many people, I fear, don't bother reading the article, or if they do read it, don't do anything about it. As a result, we will continue to see hacked Gmail and Facebook accounts. So, congratulations on actually doing something about your security. Well done.

Quote: "Assuming that 4 ping pong balls will fit in 1 cup, and order is not important, you can arrange the ping pong balls 12 different ways: {4,0,0},{3,1,0},{3,0,1},{2,2,0},{2,0,2},{2,1,1},{1,1,2},{1,2,1},{1,3,0},{1,0,3},{0,4,0},{0,0,4} So, this system has an entropy of 12."

The entropy of a system in which all states, of number N, are equally likely, is given by S = kB log(N), where kB is the Boltzmann constant (see classical statistical physics). Most importantly, it grows logarithmically in N, not linearly.

Yes. You'll notice that in Information Theory, message entropy grows at the same rate, except where Boltzmann's constant is found, we find the variable L for the length of the message. But yes, entropy does indeed grow logarithmically.

Your explanation of entropy is seriously lacking. In particular, the formula L log N only applies if all characters are chosen independently and uniformly at random. Clearly, the claim that the password "password" has 38 bits of entropy is highly misleading, and when it comes to security, misleading equals dangerous. (Technically, the claim that any fixed sequence has interesting entropy is problematic.)

The point is that good passwords are chosen randomly. Great suggestion with the password card, by the way!

@Nicolai Hähnle: I'm not misleading anyone. Entropy is merely a measurement of size, not security. It's a definition, and that's it. I only mention the security of the sheer size of various entropies. However, what I've mentioned here isn't the End All of entropy. Not by a long shot. Sure, you can measure the security of a password based on the seed used to build the random arrangement of characters, but that has nothing to do with entropy, other than after you have your string, it belongs to a certain entropy pool.

Calling my post dangerous is just draconian.

Time to rethink my password strategy!

I see the passwordcard site also has an encrypted site. Maybe a good idea to change the link to the https-version?

Yes, there is an encrypted version of the site, but I won't link to it, mainly due to the fact that I don't want to add additional stress to his server. If you are that concerned about the image going over the wire in "plain text", then as you noticed, you can clearly go secure yourself. If the developer wasn't concerned about HTTPS adding strain on his server, then I would gather that it would be default.

Isn't the entropy of the password card significantly less than the entropy of the passwords on it? I mean, you have a finite number of starting positions and a finite number of lengths. Aren't you essentially defeating the purpose of using a high entropy password by having the data on a card?

Who said you have a finite number of lengths? I look at the card, I can find a password with 1,000,000 characters. Practical, no, but finite? Heh. You have an infinite number of characters in your password from the card, even if they repeat. You also have an infinite number of paths to take from an infinite combination of choices.

"What happens if you lose your phone, only to find that you don’t have a backup?" > Dropbox syncs the passwords file with my computer, and I have daily/weekly/monthly incremental backups from my computer to an external harddisk.

"What happens when the developer of the software stops pushing updates?" > KeePassX is in the Debian repositories, and as long as it is in there, I can read my password file. If development on KeePassDroid stops, I may not be able to continue to use this password management system on my phone, and I will look for alternatives. If Dropbox stops, I can easily find another file syncing solution.

"What happens when you are at your friend’s computer, and your phone battery is dead?" > friend's computer = power source. Now I just need to find a standard mini USB adapter and I can power my phone.

"So, congratulations on actually doing something about your security. Well done." > Thanks! I'm glad I finally took action. Almost as good as when I finally started doing backups... (I admit I still have to test recovering data from the backup... it worked on day one though).

By the way, quite the jump from Chrome 9 to Chrome 11. I know, I am sending this from Internet Explorer 8.0... Our office PCs all run Citrix, and the applications run in some off-site datacenter. We actually switched from IE 6.0 to IE 8.0 only a couple of months ago. That was a great day: I could finally use tabs! :-s

I have various computers that I access. My personal laptop is running Debian Sid, which I'm using now. I have a Fedora 14 virtual machine, which I use at work that has Chrome 11.

If you're going to fault them for not mentioning the word "entropy" [while still hitting most of the other key points], you ought to use it correctly yourself. What you're describing isn't entropy, it's a simpler keyspace-size concept [which is probably the upper bound of entropy]. In other words, you don't really address why "password" has less entropy just because it only contains lowercase letters. I mean, lowercase letters are letters, they are alphanumeric, and they are ASCII. Isn't that just "a good starting point in the search space"? On the other hand, it only _actually_ contains seven different characters, so why isn't it 8*log_2(7)? And log_2(`wc -l /usr/share/dict/words`) is probably a better answer anyway.

Entropy is basically nothing _but_ an approximate measurement of how much [well, how little] a password is "a good starting point in the search space", the "search space" here consisting of all possible strings from the empty string to a string 9999 characters long (or whatever your system limit is). And "password" probably doesn't have 38 bits to begin with, even ignoring it being a common first guess: English text only averages a little under three bits per letter, not the almost five implied by log_2(26). Which means that passphrases comprising intelligible sentences have to be a LOT longer than randomly generated passwords to have the same entropy.

And the keyspace size for your card gets to be a lot less if someone gets your card (I hope that's not your real card) - your method is basically a dictionary-word password relying on a dictionary no-one else has.

@Random832- I am using entropy correctly. It's a definition, and nothing more. Check out the Wikipedia article. It's the maximum number of states that a system can be in. It is an upper bound. This is discussed in the post. And yes, passwords are just a subset of the ASCII set of characters, however, if you know a site restricts length or type of characters, then you have your starting point. Again, entropy is nothing more than a definition.

Also, the password card has an infinite searchable keyspace. I don't understand why people don't get this. Your password can be of infinite length and take any infinite amount of turns or directions. Sure, it has a subset of the full ASCII set, so the possible number of characters is smaller, but as discussed in the post, it's length that gets you entropy, not the total number of characters. Length is key, and length is in the card.

Thanks for the article. I know it's old but I found it interesting. At the same time I have to disagree. You said it yourself:

"what gives you more entropy per bit- length or possible characters? If you passed college algebra, you would know that the answer is length, not total possible characters"

The only factor that really matters is the length of the password/phrase. This is because the cracker has no idea how much entropy is actually in the password, only the bit length. In order to crack it they have to assume maximum entropy and still try every possible combination within the bit length.

So, yes, entropy is important mathematically, but since the cracker is dealing with an unknown variable they have to assume the worst.

So, what are you disagreeing on? I'm not following. As mentioned in the article, length will give you more entropy for the cost than fancy-pants uppercase, lowercase, numbers, symbols stuff. It's not to say that they aren't important. They are. Very much. But, when it comes right down to it, attackers are looking for a needle in a haystack. They don't know the length of your password, and they don't know what sort of character sets you're using. All they likely have is a SHA1 hash of your password and maybe the salt. They likely don't know anything else. Your password could be 1200 characters, it could be 12. It could be all lowercase, it could be some random leetspeak. Regardless, your needle is in a haystack. So the question remains: how large is the haystack they are looking through?

Hi, I'm quite a noob in all this stuff, but I found your article very interesting and relevant compared to other articles that say all the things you say are not right.

But I wonder, how do you know that a given password is x binary bits?

I enjoy doing some petty programming and I'd like it very much if I could write my own password checker...

Would you enlighten me, please?

Best regards,

lem.

Read the post. In there is an equation that calculates the binary bits a message contains, if the message is truly random.

## { 5 } Trackbacks

[...] post about password safety and [...]

[...] of the following rules to achieve the desired level of entropy, which at current standards is a minimum of 72 bits. That’s assuming that everything in the phrase is truly random, though. In reality there are [...]

[...] a password with 40 bits of entropy will create a hash with 40 bits of entropy if it is unsalted. This is the only source I could find where it was stated as fact. Is this correct? Thank [...]

[...] If you don't know what I'm talking about but the topic interests you, try reading something about entropy in passwords. Something like entropy as a measure of password strength, or strong passwords NEED entropy. [...]

[...] you find the needle? Of course, the larger the haystack, the harder it will be to find the needle. I have also blogged about this in the past. Thankfully, Gibson Research Corporation has put together a web application that uses this analogy. [...]
