
OpenPGP Key Random Art, Now With ANSI Color Support

I recently committed ANSI color support to my OpenPGP key random art Python script. The idea is to create a "heat map" of which squares the drunken bishop has traversed during his dizzying travels. So not only can you see what your key "looks" like, but now you can sense what your key "feels" like.

The idea is simple: squares that have not been walked get no color applied to them. Squares on the floor with the fewest visits will be "cold", colored blue, while squares that have been visited many times will be "hot", colored red, approaching white hot at even higher frequencies. There is a direct one-to-one relationship between the coin count on a square and its color. The point isn't to introduce additional entropy or to reduce collisions, but merely to add entertainment value. For PGP key signing parties, standard plain text output should remain the default, preventing anyone from being pigeonholed into specific software (browsers, PDF readers, etc).
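As a rough illustration, the coin-count-to-color mapping could look something like this in Python. This is only a sketch; the actual thresholds and palette in the script may differ, though the escape sequences are the standard ANSI ones:

# Map a square's coin count (visit frequency) to an ANSI-colored character.
# The thresholds here are illustrative assumptions, not the script's values.
COLD = "\033[34m"          # blue: few visits
HOT = "\033[31m"           # red: many visits
WHITE_HOT = "\033[37;1m"   # bright white: highest frequencies
RESET = "\033[0m"

def colorize(char, coins):
    if coins == 0:
        return char        # unwalked squares stay uncolored
    if coins <= 2:
        return COLD + char + RESET
    if coins <= 5:
        return HOT + char + RESET
    return WHITE_HOT + char + RESET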

Right now, the Python script is under very active development, so there are many things missing, such as switches to turn on foreground and background colors, or enabling and disabling ANSI entirely. For the time being, ANSI is the default, and the script will fall back to plain text if the TTY or STDIN does not support ANSI escape sequences. The next step is to start working on command line switches. I haven't decided if ANSI color should remain the default or not.
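The fallback check can be as simple as the following sketch (not necessarily the exact test the script uses):

import os
import sys

def ansi_ok():
    # Only emit ANSI color when stdout is an interactive, capable terminal.
    return sys.stdout.isatty() and os.environ.get("TERM", "dumb") != "dumb"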

Obligatory screenshots are below, showing the heat map for a key chosen at random from my keyring, in both foreground and background colors. Note: the switches to differentiate between foreground and background display are missing from the commands; I changed the code manually between the screenshots. Also, I might adjust the heat map, adding one or two extra pink-to-white colors and bringing red closer to blue, so a larger variety of color can be displayed in the output.

Screenshot showing ANSI foreground color for an OpenPGP key.

Screenshot showing ANSI background color for an OpenPGP key.

Gunnar Glasses Review

I've owned a pair of GUNNAR Optiks for a couple of months now, and figured it would probably be a good time for a review. For those unaware, GUNNAR Optiks are specially designed eyewear for long exposure computer use. The claims on the main site are:

  • Eyes are more moist.
  • Headaches are reduced.
  • Blurry vision is reduced.
  • Eyes are no longer fatigued.

They claim these points are met by the curvature of the eyewear and the unique color of the yellow lens. I'll review each individually.

The curvature of the glasses brings the lens in more closely to the face. The GUNNAR claim is that your eye produces natural moisture that humidifies the immediate air surrounding the eye. By bringing the lens in more closely to the face, that humidity is trapped in the area surrounding the eye, preventing the eye from drying out. As someone who does not wear glasses regularly, I was skeptical of this claim. However, after wearing them for 8 hours per day, every day, I can most certainly feel that my eyes are much more moist than without. People who regularly wear glasses are probably already aware of this. In fact, I've been fighting allergies this season, and my eyes have been abnormally moist. Wearing the GUNNARS has actually complicated the problem, to where I am constantly wiping my eyes to remove the excessive tearing.

The yellow tint of the lens claims to neutralize the high-intensity blues that artificial light provides, such as the fluorescent bulbs and LED backlights common in computer monitors. This actually didn't surprise me all that much, as the lens of your eye yellows as you get older due to UV radiation from the sun. On the color wheel, yellow is complementary to blue. Complementary colors of light, when combined correctly, produce white light. So, using a yellow tint to neutralize blue light, providing a more balanced color spectrum, is something photographers have known about for a long time.

Due to the yellow tinting of the glasses, the claim is that headaches, fatigued eyes and blurry vision are reduced or eliminated. I was highly skeptical of these claims, but after wearing them for a full day, I have noticed how much more "comfortable" my eyes feel at the end of the day. Indeed, I do not have the typical eye fatigue that I normally have when driving home. I've never encountered headaches working in front of a computer all day, but I do get blurry vision, and the GUNNARS have most certainly reduced it. Even my wife has noticed how much less grumpy and fatigued I am when coming home after a full day of work.

After wearing the GUNNARS for the past couple of months, I've concluded that they do indeed hold up to the claims they make. There is something to the science behind the eyewear. In fact, I've come to prefer wearing them when watching movies and television as well. Even sitting in front of this 17" monitor writing this post without the glasses, I am beginning to notice eye strain. I do feel like I look a bit kooky walking around the office with them on my face, but they work, and they work well.

Not everything is positive, however. The GUNNARS I purchased were from a friend who was getting migraines as a result of wearing them. The pair is non-prescription, so beyond what typical sunglasses or ski goggles would do, I don't understand how changing the tint of what his eyes see would cause a migraine, but it was certainly the case for him. For me, my life has been improved, not degraded, by wearing the GUNNARS.

If someone were to ask me if I recommend wearing GUNNARS for gaming or general computing, it would certainly be in the affirmative.

Analysis of RIPEMD-160

Recently on Hacker News, I noticed a table showing the "Life cycles of popular cryptographic hashes" by Valerie Aurora (in this post, I've greatly compressed her HTML for faster page delivery).

Life cycles of popular cryptographic hashes (the "Breakout" chart)
[Color-coded table: one row per function (Snefru, MD4, MD5, MD2, RIPEMD, HAVAL-128, SHA-0, SHA-1, RIPEMD-128 [1], RIPEMD-160, SHA-2 family [2], and SHA-3 (Keccak)), one column per year from 1990 through 2012; each cell is colored with the function's status that year. Key: Unbroken, Weakened, Broken, Deprecated. The cell colors do not survive in this plain text rendering.]
[1] Note that 128-bit hashes are at best 2^64 complexity to break; using a 128-bit hash is irresponsible based on sheer digest length.
[2] In 2007, the NIST launched the SHA-3 competition because "Although there is no specific reason to believe that a practical attack on any of the SHA-2 family of hash functions is imminent, a successful collision attack on an algorithm in the SHA-2 family could have catastrophic effects for digital signatures." One year later the first strength reduction was published.
The Hash Function Lounge has an excellent list of references for most of the dates. Wikipedia now has references to the rest.

I find this table a great resource, and I'm glad she put it online. However, I have one small issue with it (other than that it's out of date and light on functions), and that's her calling RIPEMD-160 "deprecated". My first question would be: deprecated by whom, exactly? RIPEMD-160 isn't a FIPS standardized cryptographic hash function, so it couldn't be deprecated by NIST. RIPEMD-160 was actually developed in Belgium, and as far as I can tell, the Belgian NBN - Bureau for Standardisation hasn't deprecated it either. It is standardized by CRYPTREC in Japan, and has not been officially deprecated there either, as far as I can tell, although it is on their "monitored list".

It could be considered deprecated if there were known attacks that greatly weaken the algorithm, or if known collisions existed, as with MD5. However, RIPEMD-160 has no known weaknesses nor collisions. The simplified versions of RIPEMD do have problems, however, and should be avoided. But as it stands, RIPEMD-160 is still considered "strong" and "cryptographically secure". Given that it was first published in 1996, almost twenty years ago, that's impressive, in my opinion. Compare that to SHA-1, another 160-bit digest first published in 1995: the first attack against SHA-1 appeared just 8 years later, in 2003, and attacks have been pouring out since.

In fact, maybe Valerie is calling RIPEMD-160 "deprecated" because it's old, and there are plenty of other functions with larger digests, such as the newly proposed FIPS standard SHA-3/Keccak. The only problem with that is that while we may be able to call those functions secure now, they too could fall victim to various attacks and be severely weakened, while RIPEMD-160 remains strong and cryptographically secure.

Granted, RIPEMD-160 has not received the attention that SHA-1 has, and with it, the same level of mathematical scrutiny from the cryptographic community. I understand that. However, RIPEMD-160 is part of the OpenPGP standard, and available in many cryptographic libraries for many different programming languages. While it's not the running favorite of cryptographers and developers, it also hasn't been overlooked.

The only concern I would have with RIPEMD-160 is the 160-bit output digest size. Is that large enough to withstand a sophisticated brute force search? To answer that question, we can look at the Bitcoin distributed network, likely the largest distributed computing network in the world. The Bitcoin network, at the time of this post, is calculating approximately 60 million billion SHA-256 hashes per second, largely using specialized hardware called "ASICs". 60 million billion is 60 quadrillion, or 6x10^16. A 160-bit digest space holds approximately 1.5x10^48 values. This means it would take the Bitcoin network approximately 2.4x10^31 seconds to completely exhaust the RIPEMD-160 digest space, or about 7.7x10^23 years. So even at the amazing brute force pace of 60 million billion hashes per second, finding a preimage in a 160-bit digest space is unreasonable. Even using the birthday attack, reaching 50% odds of a collision requires on the order of 2^80, or about 1.2x10^24, hashes, which would occupy that entire network, built for SHA-256 rather than RIPEMD-160, for the better part of a year. For any realistic attacker, RIPEMD-160 still stands strong.
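The arithmetic is easy to check. A quick sketch in Python:

rate = 6e16                        # ~60 quadrillion hashes per second
space = 2 ** 160                   # ~1.46e48 possible 160-bit digests
year = 60 * 60 * 24 * 365

print(space / rate)                # ~2.4e31 seconds to exhaust the space
print(space / rate / year)         # ~7.7e23 years

# Birthday bound: ~50% odds of a collision after about 1.18 * sqrt(space) hashes.
birthday = 1.1774 * space ** 0.5   # ~1.4e24 hashes
print(birthday / rate / year)      # roughly three quarters of a year at full pace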

Now, I'm an amateur cryptographer, and a lousy one at that. But my Internet searching has left me empty handed in trying to find any resource that successfully attacks, weakens, or criticizes RIPEMD-160, let alone calling it "deprecated". From everything I can tell, it's withstood the test of time, and it's still going very, very strong.

SHA3 (Keccak) in Linux

For a long time, I've been waiting to use the newly accepted SHA3 in Linux for file integrity and other uses. Like md5sum(1), sha1sum(1), sha224sum(1), sha256sum(1), sha384sum(1), and sha512sum(1), I was hoping that a similar "sha3-224sum(1)", etc. would be developed, and make its way into GNU/Linux distributions. Unfortunately, I kept waiting and waiting, until eventually, I just stopped worrying about it. Well, to my surprise, it appears that there is a package that ships SHA3 as accepted by NIST: the rhash package (it does a number of other hashes as well).

Keccak was chosen by NIST as the SHA3 winner due to its performance, security and construction. Keccak uses a sponge construction for creating the cryptographic hash, which truly sets it apart from SHA1 and SHA2. This means any successful attack against SHA1 or SHA2 will likely be ineffective against SHA3. SHA3 claims 12.5 cycles per byte on an Intel Core 2 CPU in a software implementation. Unfortunately, it appears that SHA3 as it appears in the rhash package still needs some optimization, as SHA2, which requires more cycles per byte due to its construction, can calculate a SHA2-256 hash faster than a SHA3-256 hash. SHA3 support was added to rhash in September 2013.

The implementation of SHA3 in rhash uses the original Keccak function as accepted by NIST. This means that it does not contain the 2 bits "01" which the draft standard appends to the message for padding. It should be noted that SHA3 is only a FIPS draft as of the time of this blog post. As such, outputs could change until the standard is formalized.

Below are examples of hashing:

$ echo -n "" | rhash --sha3-224 -
f71837502ba8e10837bdd8d365adb85591895602fc552b48b7390abd  (stdin)
$ echo -n "" | rhash --sha3-256 -
c5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470  (stdin)
$ echo -n "" | rhash --sha3-384 -
2c23146a63a29acf99e73b88f8c24eaa7dc60aa771780ccc006afbfa8fe2479b2dd2b21362337441ac12b515911957ff  (stdin)
$ echo -n "" | rhash --sha3-512 -
0eab42de4c3ceb9235fc91acffe746b29c29a8c366b7c60e4e67c466f36a4304c00fa9caf9d87976ba469bcbe06713b435f091ef2769fb160cdab33d3670680e  (stdin)
$ echo -n "The quick brown fox jumps over the lazy dog." | rhash --sha3-224 -
c59d4eaeac728671c635ff645014e2afa935bebffdb5fbd207ffdeab  (stdin)
$ echo -n "The quick brown fox jumps over the lazy dog." | rhash --sha3-256 -
578951e24efd62a3d63a86f7cd19aaa53c898fe287d2552133220370240b572d  (stdin)
$ echo -n "The quick brown fox jumps over the lazy dog." | rhash --sha3-384 -
9ad8e17325408eddb6edee6147f13856ad819bb7532668b605a24a2d958f88bd5c169e56dc4b2f89ffd325f6006d820b  (stdin)
$ echo -n "The quick brown fox jumps over the lazy dog." | rhash --sha3-512 -
ab7192d2b11f51c7dd744e7b3441febf397ca07bf812cceae122ca4ded6387889064f8db9230f173f6d1ab6e24b6e50f065b039f799f5592360a6558eb52d760  (stdin)

In my limited testing, it appears that the SHA3 implementation in rhash(1) is not quite up to par, and could use some additional performance improvements. I'm sure these will be committed over time. However, it's hardly a poor performer. I've been very happy with the performance results so far.
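As an aside, newer versions of Python's hashlib ship SHA-3 with the final FIPS 202 padding, so its digests intentionally differ from the draft Keccak outputs above:

import hashlib

# Final FIPS 202 SHA-3, with the "01" domain-separation bits appended, so
# this does NOT match the Keccak digest rhash prints for the empty string.
print(hashlib.sha3_256(b"").hexdigest())
# a7ffc6f8bf1ed76651c14756a061d662f580ff4de43b49fa82d80a4b80f8434a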

The Core Problem With Ubuntu Releases - Little QA

There's a bug behind this post, one that was introduced just over a week ago, ON THE DAY BEFORE ITS RELEASE. The bug allows bypassing the lock screen by just holding down your <Enter> key, letting Unity crash, then restarting without locking the desktop. That's a pretty big bug. What's interesting about this bug, though, isn't the bug itself, but a bigger problem within the Ubuntu and Canonical development teams. That is, that the release date is more important than the quality of the release.

I blogged about this six years ago when I was an Ubuntu Member, and my blog was on the Ubuntu Planet (it's weird to think that I've had this blog now for almost 10 years. Heh). I mentioned that I was moving my server away from Ubuntu 8.04 LTS to Debian 5.0. My rationale was clear: Debian is much more interested in making sure that "Debian stable" is actually rock-solid stable. If that means that it takes three years before the next release is ready, then so be it. Stable is stable. Ubuntu could learn a lot from this.

So, what does Debian do so differently than Ubuntu, and how can Ubuntu learn from this? Let's take a look at how packages transition from the "unstable" release to the "testing" release in Debian, and how "testing" becomes "stable".

  1. After the package has been in unstable for a given length of time, it can qualify for migration to testing. This depends on each package, and the urgency of the migration.
  2. The package can only enter testing if no new release critical bugs exist. This means that the package must have fewer release critical bugs than the current package in testing.
  3. All dependencies needed for the package must be satisfiable in the current testing repository. If not, those packages must be brought in at the time the current package is, and they must meet the same criteria related to time in unstable and the number of release critical bugs.
  4. The package migrating to testing must not break any other packages currently in testing.
  5. It must be compiled for all release architectures it claims to support, and all architecture specific packages must be brought in as well, meeting the same criteria as mentioned.

Point #1 gives the package enough time for people in the community to report any immediate or obvious bugs. If nothing is reported, then it's possible that the new package version is an improvement over what is currently in testing, and as such, the package becomes a migration candidate. This doesn't mean the package migrates. Instead, it must now be evaluated in terms of quality, and this is handled by evaluating its release bug count.

Point #2 is all about release bug count. If the release bug count is equal to or higher than what is currently in testing, then the package is not an improvement, and is not going to get migrated into testing. If the release bug count is fewer than what is in testing, then migrating the package from unstable into testing improves the quality of the testing release.

Point #3 makes sure that dependencies are met with the package migration from unstable into testing. All dependencies must meet the criteria of points #1 and #2 to be candidates for migration.

Point #4 ensures that current packages in testing do not break when the new package is migrated. Not only the current package dependencies, but also other packages that might interact with it.

Point #5 brings in CPU architecture support. Debian is a universal operating system, handling many different CPU architectures. Not all CPU architectures have the same packages. But, if the package was previously compiled for a certain architecture, then it must continue to be compiled for that architecture. I.e., no orphaned packages. Points 1-4 must be met for every architecture it claims to support.

If points 1-5 are successfully met, then the package can be migrated from unstable into testing, improving the quality of the testing release. You can track the release critical bug status in the current testing release at https://bugs.debian.org/release-critical/. As of this writing, there are currently 384 release critical bugs that will prevent testing from becoming the new stable release (the green line in this graph). When the release critical bugs near zero, then and only then is a new release of Debian ready for general consumption.

Debian release critical bugs graph.

I know I've highlighted how the Debian release works, but how about other operating system vendors?

  • Windows releases when it's ready.
  • Mac OS X releases when it's ready.
  • RHEL releases when it's ready.
  • CentOS releases when it's ready (following RHEL).
  • SLES releases when it's ready.
  • Slackware releases when it's ready.
  • FreeBSD releases when it's ready.
  • NetBSD releases when it's ready.
  • Solaris releases when it's ready.

In fact, of all of the operating system vendors I can find, only Ubuntu and its derivatives (Linux Mint, *buntu, etc.) and OpenBSD stick to a rigid schedule of releasing every six months. And other than Ubuntu, none of those rigid-schedule vendors see large scale mission critical server deployment. Debian does. RHEL and CentOS do. Windows certainly does. Even FreeBSD sees enormous success in the datacenter. A large part of that, I would bet, is the fact that stable means stable. "Mission critical" just is not synonymous with Ubuntu servers.

According to https://launchpad.net/ubuntu/trusty/+bugs, as of the time of this writing, almost 2 weeks after 14.04 LTS released, there are:

  • 88 New bugs
  • 301 Open bugs
  • 19 In-progress bugs
  • 10 Critical bugs
  • 78 High importance bugs
  • 4 Incomplete bugs (can expire)

Here's a beauty, reported in December 2013, against 13.10: Reinstallation wipes out all/other partitions. People are losing their Windows partitions. That's lost data. This isn't a laughing matter. And 14.04 STILL released? No way in hell would any other vendor release their OS with a bug of that size. Let's hope it's fixed before 14.10 releases.

Even though I'm critical of Ubuntu's spyware on the desktop, and their CLA and other business practices and development, I would love to see them do things right. Unfortunately, releasing every 6 months regardless is not doing things right. It doesn't give system administrators and executives much confidence to run it in their production environments.

The Drunken Bishop For OpenPGP Keys

Almost a year ago, I blogged about the drunken bishop algorithm for OpenSSH key random art. Towards the end of the post, I mentioned that I would be building an OpenPGP implementation. I started doing so in Python, but eventually got sidetracked with other things. Then I hosted the Scale 12x PGP keysigning party, and after the fact wished that I had finished the OpenPGP drunken bishop implementation. Well, I'll be hosting another PGP keysigning party at OpenWest 2014 next month, and couldn't risk missing another opportunity to introduce the drunken bishop to attendees.

The OpenPGP implementation follows ALMOST the exact algorithm. The only differences are these:

  1. OpenSSH key fingerprints are MD5 checksums while OpenPGP key fingerprints are SHA1 checksums. As such, a larger board is introduced.
  2. The algorithm for OpenSSH key fingerprints reads the fingerprint in little endian. I felt that this unnecessarily complicates the codebase, so I read the fingerprint in big endian. It turns out this wasn't that big of a deal code-wise; the reason for the little-endian reading is to "mix" the key, preventing fingerprints that start the same from looking the same.
  3. For the ASCII art output, the PDF paper mentions that as the bishop visits a square more often, the ASCII character should increase in weight gradually. Unfortunately, it appears this wasn't strictly observed. I've changed the characters that are shown in the art to more accurately reflect the frequency of visits to each square. The starting 'S' and the ending 'E' remain the same.
  4. I added an extra label at the bottom of the random art to depict the 8-bit OpenPGP key ID for that art.

Because it's written in Python, it is slower than its C counterpart for OpenSSH. If I were motivated enough to care, I'd write it in C. But, it's plenty fast for what it's doing, and I'm sure there are performance optimizations I can make to speed things up in the Python script.
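To make the walk concrete, here is a minimal sketch in Python, reading the fingerprint big-endian as described in point 2 above. The 19x11 board matches the handout further down; the direction encoding follows the OpenSSH convention, and the details should be treated as assumptions rather than a faithful copy of keyart:

def drunken_bishop(fingerprint_hex, width=19, height=11):
    # Walk the board and return per-square visit (coin) counts.
    board = [[0] * width for _ in range(height)]
    x, y = width // 2, height // 2        # the bishop starts at the center 'S'
    for byte in bytes.fromhex(fingerprint_hex):
        for shift in (6, 4, 2, 0):        # big endian: high bit-pair first
            pair = (byte >> shift) & 0b11
            x += 1 if pair & 0b01 else -1 # low bit: step east or west
            y += 1 if pair & 0b10 else -1 # high bit: step south or north
            x = min(max(x, 0), width - 1) # the walls stop the bishop
            y = min(max(y, 0), height - 1)
            board[y][x] += 1              # drop a coin on the square
    return board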

I took 9 random keys from my GnuPG public keyring, and am showing them here, as a screenshot, rather than the horrible rendering that is web browsers:

Terminal screenshot showing 9 OpenPGP random art images

As with OpenSSH, these random art images exist to make verifying OpenPGP keys a bit easier. No doubt collisions exist, where more than one key produces the same artwork. When in doubt, verify the hex fingerprint. However, this may be a good alternate method for speeding up keysigning parties, by having each individual verify their own key ahead of time, then only speaking up if they feel their key is in error. This way, parties can go straight to identification checking, thus eliminating 1/2 to 2/3 of the party. During identification checking, each individual could verify the key random art, fingerprint, etc. while checking photo identification.

A possible handout format for the random art images could look something like this:

+----[DSA 1024]-----+	
|l^  . ^^?^.        |	1) ID: 22EEE0488086060F
|l. . .l:.?         |	UID: Aaron Toponce 
|^ E ...ll          |	Fingerprint: E041 3539 273A 6534
|^. .  ^:.          |	             A3E1 9259 22EE E048
|^ . .  ..          |	Size/Type: 1024/DSA
| . .   . S         |	
|  . .   . .        |	Matches?
|   .     .         |	Fingerprint: ____ ID: ____ Signed: ____
|                   |	
|                   |	
|                   |	
+----[8086060F]-----+

Of course, this is less efficient on paper usage than the traditional table, but it could be improved.

The source code for generating the OpenPGP random art is at https://github.com/atoponce/keyart. It requires an OpenPGP public key file as an argument when executing the script, like so:

$ ./keyart 8086060F.pgp

Hopefully, PGP keysigning party organizers find this useful in streamlining the keysigning process. If not, at least it could be entertaining.

Time Based One Time Passwords - How It Works

Introduction

With all the news about Heartbleed, passwords, and two-factor authentication, I figured I would blog about exactly how two-factor authentication can work- in this case, TOTP, or Time based one time passwords, as defined by The Initiative for Open Authentication (OATH). TOTP is defined in RFC 6238, and is an open standard, which means anyone can implement it, with no worries about royalty payments, or copyright infringements. In fact, TOTP is actually just an extension of HOTP (HMAC based one time passwords), which is defined in RFC 4226. I'll describe HOTP a little bit in this post, but focus primarily on TOTP.

What is two-factor authentication?

First, let's describe the problem. The point of two-factor authentication is to prevent attackers from getting access to your account. Two-factor authentication requires that two tokens be provided for proof of ownership of the account. The first token is something that we're all familiar with: a username and a password. The second token is a bit more elusive, however. It should be something you have, and only you. No one else should be able to come into possession of the same token. The same should be true for usernames and passwords, but we've seen how easily broken a single-factor authentication system is.

Think of two-factor authentication as layers on an onion. The first layer is your username and password. After peeling away that layer, the second layer is a secret token. Three-factor authentication is also a thing, where you can provide something that you are, such as a fingerprint or retinal scan. That would be the third layer in our onion. You get the idea.

However, the second token should not be public knowledge. It must be kept secret. As such, a shared secret is generated between the client and the server. For both HOTP and TOTP, this is just a random number encoded in base-32. This random number, along with a message, is run through HMAC-SHA-1 (which is defined in RFC 2104; also described at Wikipedia) to produce a cryptographic hash.
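Generating such a shared secret is trivial. A sketch in Python (10 random bytes are 80 bits, which base-32 encodes to exactly 16 characters, the typical secret length):

import base64
import os

# 80 bits of randomness, base-32 encoded to a 16-character shared secret.
secret = base64.b32encode(os.urandom(10)).decode()
print(secret)  # e.g. a 'YXSNFX37ZUZCKVZM'-style string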

Importance of Time

Unfortunately, a "shared secret" is a fairly lame form of authentication. First, the user could memorize the secret, no longer making it something the user has. Second, man in the middle attacks on shared secrets are extremely effective. So, we need the ability to prevent the user from memorizing the shared secret, and we need to make man in the middle attacks exceptionally difficult. As such, we turn our shared secret into a moving target.

HOTP, as already mentioned, is the base from which TOTP comes. HOTP uses the same algorithm as described below in this post, except that rather than using time as the moving factor, an 8-byte counter is changed. We need a moving target, because if the token were static, it would be no different than just a second password. Instead, we need the attacker to be constantly guessing as to what the token could be. So, with HOTP, a token is valid until used. Once used, the counter is incremented, and the HMAC-SHA-1 string is recalculated.

TOTP uses the UNIX epoch as its time scale, in seconds. This means that for TOTP, time starts at January 1, 1970 at 00:00:00 UTC, and we count the number of seconds that have elapsed since then. By default, we only look at 30 second intervals, aligned to the minute. This means at the top of the minute (zero seconds past the minute), the TOTP code is refreshed, and again 30 seconds after the minute. TOTP uses a shared secret between the client and the server, so it's important that the client and server clocks are synchronized. However, RFC 6238 does allow for some clock skew and drift. As such, when a code is entered, the server will allow for a 30 second window on either side of the code. This means that a code is actually valid for a maximum of 90 seconds. If using NTP on the server, and a mobile phone for the client, then clock drift isn't a concern, as both will be continuously updated throughout the day to maintain accurate time. If NTP or GSM/CDMA broadcasts are not adjusting the clock, then it should be monitored and adjusted as needed.
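In code, the moving factor is just integer division of the epoch time. A sketch:

import time

STEP = 30  # seconds per interval

def totp_counter():
    # Number of complete 30-second intervals since the UNIX epoch.
    return int(time.time()) // STEP

# Accepting the previous, current, and next interval tolerates clock skew,
# which is why a single code can be honored for up to 90 seconds.
def acceptable_counters():
    c = totp_counter()
    return (c - 1, c, c + 1)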

Computing the HMAC-SHA-1

Hash-based message authentication codes (HMAC) require two arguments to calculate the hash. First, they require a secret key, and second, they require a message. For TOTP (and HOTP), the secret key is our shared secret between the client and the server. This secret never changes, and is the foundation from which our HMAC is calculated. Our message is what changes. For TOTP, it is the time in seconds since the UNIX epoch, rounded to the nearest 30 seconds. For HOTP, it is the 8-byte counter. This moving target changes our cryptographic hash. You can see this with OpenSSL on your machine:

$ KEY=$(< /dev/random tr -dc 'A-Z0-9' | head -c 16; echo)
$ echo $KEY
WHDQ9I4W5FZSCCI0
$ echo -n '1397552400' | openssl sha1 -hmac "$KEY"
(stdin)= f7702ad6254a06f33f7dcb952000cbffa8b3c72e
$ echo -n '1397552430' | openssl sha1 -hmac "$KEY" # increment the time by 30 seconds
(stdin)= 70a6492f088785444fc664e1a66189c6f33c2ba4

Suppose that our HMAC-SHA1 string is "0215a7d8c15b492e21116482b6d34fc4e1a9f6ba". We'll use this image of our HMAC-SHA-1 to help us identify a bit more clearly exactly what is happening with our token:

Image clarifying the HMAC-SHA-1 string, by using 20 divisions of 1 byte each (two characters per division).

Dynamic Truncation

Unfortunately, requiring the user to enter a 40-character hexadecimal string in 30 seconds is a bit unrealistic. So we need a way to convert this string into something a bit more manageable, while still remaining secure. As such, we'll do something called dynamic truncation. Each character occupies 4 bits (16 possible values). So, we'll look at the lower 4 bits (the last character of our string) to determine a starting point from which we'll do the truncating. In our example, the last 4 bits are the character 'a':

Same 20-division image, highlighting the last character of the HMAC-SHA-1 string- character 'a'.

The hexadecimal character "a" has the same numerical value as the decimal "10". So, we will read the next 4 bytes (31 bits, once the top bit is masked off), starting at offset 10. If you think of your HMAC-SHA-1 as 20 individual 1-byte strings, we want to start looking at the bytes starting with the tenth offset, of course counting the number "0" as our zeroth offset:


As such, our dynamically truncated string is "6482b6d3" from our original HMAC-SHA-1 hash of "0215a7d8c15b492e21116482b6d34fc4e1a9f6ba":

Same 20-division image, highlighting the first 31-bits of the HMAC-SHA-1 string, starting with the 10th offset.

The last thing left to do is to take our hexadecimal value and convert it to decimal. We can do this easily enough at the command line:

$ echo "ibase=16; 6482B6D3" | bc
1686288083

All we need now are the last 6 digits of the decimal string, zero-padded if necessary. This is easily accomplished by taking the decimal string, modulo 1,000,000. We end up with "288083" as our TOTP code:

TOTP: 288083
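The whole truncation dance fits in a few lines of Python, and reproduces the example above. This is a sketch of RFC 4226's dynamic truncation, not any particular library's code:

def truncate(hmac_sha1_hex):
    # RFC 4226 dynamic truncation: hex HMAC-SHA-1 in, 6-digit code out.
    raw = bytes.fromhex(hmac_sha1_hex)
    offset = raw[-1] & 0x0F                    # low 4 bits of the last byte
    value = int.from_bytes(raw[offset:offset + 4], "big") & 0x7FFFFFFF
    return "{:06d}".format(value % 1000000)    # last 6 digits, zero-padded

print(truncate("0215a7d8c15b492e21116482b6d34fc4e1a9f6ba"))  # 288083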

The user then types this code into the form requesting the token. The server does the exact same calculation, and verifies if the two codes match (actually, the server does this for the previous 30 seconds, the current 30 seconds, and the next 30 seconds for clock drift). If the code provided by the user matches the code calculated by the server, the token is valid, and the user is authenticated.

Conclusion

TOTP, and alternatively HOTP, is a great way to do two-factor authentication. It's based on open standards, and the calculations for the tokens are done entirely in software, not requiring a proprietary hardware or software solution such as those provided by RSA SecurID and Verisign. Also, the calculations are done entirely offline; there is no need to communicate with an external server for authentication handling at all. No calling home to the mothership to report someone has just logged in. It's platform independent, obviously, and is already implemented in a number of programming language libraries. You can take this implementation further by generating QR codes for mobile devices to scan, making it as trivial as installing a mobile application, scanning the code, and using the tokens as needed.

Two Factor Authentication with OpenSSH

With all the news about Heartbleed, passwords and two-factor authentication, I figured I would finally get two-factor authentication working with my SSH servers. I've known about it in the past, but haven't done anything about it. Now is the time.

To get two-factor authentication working with your OpenSSH server, you need to install the "libpam-google-authenticator" PAM module on your system. Don't let the package name fool you, however. It is developed by Google, but it does not "phone home" to Google servers at all. It also does not require a Google account. Further, the PAM module is Free and Open Source software, licensed under the Apache 2.0 license.

To install the module on Debian-based systems, run the following:

$ sudo aptitude install libpam-google-authenticator

Once installed, run the google-authenticator(1) command, and answer the resulting questions. The questions offer a balance between increased security and convenience of use. You have the opportunity to create an HMAC-based One-time Password (HOTP), as specified in RFC 4226, or a Time-based One-time Password (TOTP), as specified in RFC 6238. The only difference is that with HOTP, each code depends on a counter incremented with each use, whereas with TOTP, each code depends on time. If running NTP on your SSH server, and you're using this with a phone, then TOTP would probably be a better bet, as time will be very closely synchronized, and unlikely to fall out of sync.

The command will create an ANSI QR code that you can scan with your phone to set up the codes. Further, it will print 5 backup codes for the event that you lose your phone, or don't have it with you, but need to log in. Print those backup codes, and store them in your wallet. Just in case you need them, the secret and backup codes are stored in your ~/.google_authenticator file:

$ cat ~/.google_authenticator 
YXSNFX37ZUZCKVZM
" RATE_LIMIT 3 30
" WINDOW_SIZE 17
" DISALLOW_REUSE
" TOTP_AUTH
74110971
51742064
84348069
78844952
28772212

Now you just need to configure OpenSSH to use OTP as part of the authentication process. There are two configuration files to edit. First, you need to edit the /etc/pam.d/sshd config file, and put "auth required pam_google_authenticator.so" at the bottom of the file:

$ sudo vim /etc/pam.d/sshd
(move to the bottom of the file)
auth required pam_google_authenticator.so

Now change the /etc/ssh/sshd_config file, to allow a challenge and response as part of the authentication process:

$ sudo vim /etc/ssh/sshd_config
ChallengeResponseAuthentication yes

Restart your OpenSSH server, and you should be good to go:

$ sudo service ssh restart

For an OTP application that you can install on your Android phone, I would personally recommend FreeOTP by Red Hat. It's Free and Open Source software, unlike the Google Authenticator app, and it has increased security in that codes are not displayed by default. Instead, you must tap the refresh button for that site to display the code. Also, I like the display options better with FreeOTP than with Authenticator, but that's personal choice. There currently is not an iOS version of the application in the Apple App Store, but my understanding is that one will make it there soon.

Anyway, happy two-factor authentication on your OpenSSH server!

Heartbleed And Your Passwords

Recently it was discovered that OpenSSL contained a pretty massive security hole that allowed simple TLS clients to retrieve plain text information from a TLS-protected server using the TLS Heartbeat. The advisory is CVE-2014-0160. This has to be one of the most dangerous security vulnerabilities to hit the Internet in a decade. More information can be found at https://heartbleed.com/ (ironically enough, using a self-signed certificate as of this writing).

I don't wish to cover all the extensive details of this vulnerability; they are many. However, I do want to address it from the point of passwords, which is something I can handle on this blog. First, let's review the Heartbleed vulnerability:

  1. FACT: This vulnerability was introduced into OpenSSL on December 31, 2011.
  2. FACT: This vulnerability was fixed on April 5, 2014.
  3. FACT: Without the TLS heartbeat bounds check, data is leaked in 64KB chunks.
  4. FACT: A TLS client can expose the server's SSL private key, to decrypt future communication, without any special hardware or software.
  5. FACT: This is not a man-in-the-middle (MITM) attack.
  6. FACT: So armed, an attacker can reveal usernames and passwords without anyone knowing.

To demonstrate this, Ronald Prins posted about it on Twitter, and demonstrates it on his blog, showing a censored example. What does this mean? This means that if you logged into a server using SSL during the past two years, your username and password could already be compromised. This includes your email account, your bank account, your social media accounts, and any others. Of course, if the service takes advantage of two-factor authentication, then that specific account is likely safe. However, if you share passwords between accounts, additional accounts may not be.

I really wish I was joking.

My advice? Time to change passwords to accounts you think you may have used over the past two years. But, before you do so, you need to know if the service is protected against Heartbleed. You can use http://possible.lv/tools/hb/ and http://filippo.io/Heartbleed/ as online Heartbleed testers before logging in with your account credentials.

Protect Against Bit Rot With Parchive

Introduction

Yes, this post was created on April 1. No, it's not an April Fool's joke.

So, I need to begin this post with a story. In 2007, I adopted my daughter, and my wife decided that she wanted to stay home rather than work. In 2008, she quit her job. She was running a website on my web server, which only had a 100GB IDE PATA drive. I was running low on space, so I asked if I could archive her site, and move it to my ext4 Linux MD RAID-10 array. She was fine with that, so off it went. I archived it using GNU tar(1), and compressed the archive with LZMA. Before taking her site offline, I verified that I could restore her site from my compressed archive.

From 2008 to 2012, it just sat on my ext4 RAID array. In 2012, I built a larger ZFS RAID-10, and moved her compressed archive there. Just recently, my wife has decided that she may want to go back to work. As such, she wants her site back online. No problem, I thought. I prepared Apache, got the filesystem in order, and decompressed the archive:

$ unlzma site-backup.tar.lzma
unlzma: Decoder error

Uhm...

$ tar -xf site-backup.tar.lzma
xz: (stdin): Compressed data is corrupt
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now

Oh no. Not now. Please, not now. My wife has put in hundreds of hours into this site, and many, many years. PDFs, images, other media, etc. Tons, and tons of content. And the archive is corrupt?! Needless to say, my wife is NOT happy, and neither am I.

I'm doing my best to restore the data, but it seems that all hope is lost. From the GNU tar(1) documentation:

Compressed archives are easily corrupted, because compressed files have little redundancy. The adaptive nature of the compression scheme means that the compression tables are implicitly spread all over the archive. If you lose a few blocks, the dynamic construction of the compression tables becomes unsynchronized, and there is little chance that you could recover later in the archive.

I admit to not knowing a lot about the internal workings of compression algorithms. It's something I've been meaning to understand more fully. However, best I can tell, this archive experienced bit rot in the 4 years it was on my ext4 RAID array, and as a result, the archive is corrupt. Had I known about Parchive, and that ext4 doesn't offer any sort of self healing against bit rot, I would have been more careful storing that compressed archive.

Because most general purpose filesystems, unlike Btrfs or ZFS, do not protect against silent data errors, there is no way to fix a file that suffers from bit rot, outside of restoring from a snapshot or backup, unless the file format itself carries some sort of redundancy. Unfortunately, I learned that compression algorithms leave very little to no redundancy in the final compressed file. It makes sense, as compression algorithms are designed to remove redundant data, using either lossy or lossless techniques. However, I would be willing to grow the compressed file, say 15%, if I knew that I could suffer some damage to the compressed file, and still get my data back.

Parchive

Parchive is a Reed-Solomon error correcting utility for general files. It does not handle Unicode, and it does not work on directories. Its initial use was to maintain the integrity of Usenet posts against bit rot. It has since been used by anyone who is interested in maintaining data integrity for any sort of regular file on the filesystem.

It works by creating "parity files" on a source file. Should the source file suffer some damage, up to a certain point, the parity files can rebuild the data, and restore the source file, much like parity in a RAID array when a disk is missing. To see this in action, let's create a 100M file full of random data, and get the SHA256 of that file:

$ dd if=/dev/urandom of=file.img bs=1M count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 7.76976 s, 13.5 MB/s
$ sha256sum file.img > SUM
$ cat SUM
5007501cb7f7c9749a331d1b1eb9334c91950268871ed11e36ea8cdc5a8012a2  file.img

Should this file suffer any sort of corruption, the resulting SHA256 hash will likely change. As such, we can protect the file against some mild corruption with Parchive (removing newlines and cleaning up the output a bit):

$ sudo aptitude install par2
(...snip...)
$ par2 create file.img.par2 file.img
(...snip...)
Block size: 52440
Source file count: 1
Source block count: 2000
Redundancy: 5%
Recovery block count: 100
Recovery file count: 7
Opening: file.img
Computing Reed Solomon matrix.
Constructing: done.
Wrote 5244000 bytes to disk
Writing recovery packets
Writing verification packets
Done

As shown in the output, the redundancy of the file is 5%. This means that we can suffer up to 5% damage to the source file, or the parity files, and still reconstruct the data. This redundancy is the default behavior, and can be changed by passing the "-r" switch.

If we do a listing, we will see the following files:

$ ls -l file.img*
-rw-rw-r-- 1 atoponce atoponce 104857600 Apr  1 08:06 file.img
-rw-rw-r-- 1 atoponce atoponce     40400 Apr  1 08:08 file.img.par2
-rw-rw-r-- 1 atoponce atoponce     92908 Apr  1 08:08 file.img.vol000+01.par2
-rw-rw-r-- 1 atoponce atoponce    185716 Apr  1 08:08 file.img.vol001+02.par2
-rw-rw-r-- 1 atoponce atoponce    331032 Apr  1 08:08 file.img.vol003+04.par2
-rw-rw-r-- 1 atoponce atoponce    581364 Apr  1 08:08 file.img.vol007+08.par2
-rw-rw-r-- 1 atoponce atoponce   1041728 Apr  1 08:08 file.img.vol015+16.par2
-rw-rw-r-- 1 atoponce atoponce   1922156 Apr  1 08:08 file.img.vol031+32.par2
-rw-rw-r-- 1 atoponce atoponce   2184696 Apr  1 08:08 file.img.vol063+37.par2

Let's go ahead and corrupt our "file.img" file, and verify that the SHA256 hash no longer matches:

$ dd seek=5k if=/dev/zero of=file.img bs=1k count=128 conv=notrunc
128+0 records in
128+0 records out
131072 bytes (131 kB) copied, 0.00202105 s, 64.9 MB/s
$ sha256sum -c SUM
file.img: FAILED
sha256sum: WARNING: 1 computed checksum did NOT match

This should be expected. We've corrupted our file through simulated bit rot. Of course, we could have corrupted up to 5% of the file, or 5 MB worth of data. Instead, we only wrote 128 KB of bad data.
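The numbers line up with the repair output below. A quick sketch of the arithmetic shows the 128 KiB of zeros lands across just 4 of the 52,440-byte source blocks, well within what 100 recovery blocks can rebuild:

block = 52440              # source block size reported by par2
start = 5 * 1024 * 1024    # dd seek=5k with bs=1k
end = start + 128 * 1024   # 128 KiB of zeros
print(end // block - start // block + 1)  # 4 damaged blocks ("found 1996 of 2000")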

Now that we know our file is corrupt, let's see if we can recover the data. Parchive to the rescue (again, cleaning up the output):

$ par2 repair -q file.img.par2 
(...snip...)
Loading "file.img.par2".
Loading "file.img.vol031+32.par2".
Loading "file.img.vol063+37.par2".
Loading "file.img.vol000+01.par2".
Loading "file.img.vol001+02.par2".
Loading "file.img.vol003+04.par2".
Loading "file.img.vol007+08.par2".
Loading "file.img.vol015+16.par2".
Target: "file.img" - damaged. Found 1996 of 2000 data blocks.
Repair is required.
Repair is possible.
Verifying repaired files:
Target: "file.img" - found.
Repair complete.

Does the resulting SHA256 hash now match?

$ sha256sum -c SUM 
file.img: OK

Perfect! Parchive was able to fix my "file.img" by loading up the parity files, recalculating the Reed-Solomon error correction, and using the resulting math to fix my data. I could have also done some damage to the parity files to demonstrate that I can still repair the source file, but this should be sufficient to demonstrate my point.

Conclusion

Hindsight is always 20/20. I know that. However, had I known about Parchive in 2008, I would have most certainly used it on all my compressed archives, including my wife's backup (in reality, a backup is not a backup unless you have more than one copy). Thankfully, the rest of the compressed archives did not suffer bit rot like my wife's backup did, and thankfully, all of my backups are now on a 2-node GlusterFS replicated ZFS data store. As such, I have 2 copies of everything, and I have self healing against bit rot with ZFS.

This is the first time bit rot has reared its ugly head and caused me a great deal of pain. Parchive can fix that for those who have sensitive data where data integrity MUST be maintained on filesystems that do not provide protection against bit rot. Of course, I should have also had multiple copies of the archive to have a true backup. Lesson learned. Again.

Parchive is no substitute for a backup. But it is protection against bit rot.

Creating Strong Passwords Without A Computer, Part III - Off The Grid

Previously, I used entropy as a backdrop for creating strong passwords. It's important that you read that article and fully understand it before moving on with the rest of the series.

So far, I've blogged about generating passwords using systems that your grandma could use. In this case, I have less confidence that my grandma would be willing to give this a go. Not because she's stupid, but because she's impatient. The older I get, the less patience I have for things like this, so I can understand where she's coming from. In this post, we'll be discussing Steve Gibson's paper cipher Off The Grid.

Introduction

Off The Grid is a paper-based cipher for encrypting domain names. The concept is built around the idea of using Latin squares as a means for creating the cipher. Off The Grid is a 26x26 Latin square using the English alphabet. In other words, any letter appears only once in any given row and column. As a result of the Latin square, words can be traversed throughout the square, alternating rows and columns. This will be explained further.

Outside of the grid are numbers and non-alphabetic characters. These are used as an additional resource when creating the passwords for your sites. Because the grid itself is a randomized mixture of lowercase and uppercase letters, it has an entropy of approximately 5.7 bits per character. The outer border consists of the following characters:

0123456789!"#$%&'()*+,/:;<=>?@[\]^{|}~

As such, if using the border to build your passwords, then you have access to 90 total unique characters, bringing your entropy to approximately 6.49 bits per character.

Because we determined that 80 bits of entropy should be the minimum when generating passwords, if your password consists of only the alphabetic characters in the grid, then you should aim at building at least 15-character passwords. If you are including the outer border in your password building, then a 13-character password should be the target.
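Those length targets fall straight out of the entropy math:

import math

def chars_needed(alphabet_size, target_bits=80):
    # Characters required for a truly random password to hit the target entropy.
    return math.ceil(target_bits / math.log2(alphabet_size))

print(chars_needed(52))  # grid letters only: ~5.70 bits/char -> 15 characters
print(chars_needed(90))  # letters plus border: ~6.49 bits/char -> 13 characters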

The grid and border are randomized, minus 12 characters in the center of the top border, so the resulting password is a true random password that can be used for your accounts. Thus, the requirements to build a truly random password with at least 80-bits of entropy is achieved.

Off The Grid

Off The Grid is a finite state machine. This is achieved by traversing the grid from a starting location to an ending location, the rules of which will be described here. After reaching your first ending location, a second traversal is made starting from the first ending location, and ending at a new location. There are exactly 26^2 or 676 states in the Off The Grid system.

In Steve's instructions for building passwords using the grid, you take the first 6 characters of the domain name, and build a resulting password of 12 characters: 2 ciphertext characters for each 1 plaintext character. Unfortunately, as we demonstrated, this doesn't provide us with enough entropy to withstand offline database attacks. As such, I would look at what you want for your target password length. If it's 15 characters, then I would take the first 5 characters of the domain name, and use a ratio of 3:1 rather than 2:1 when building the password. If you want a 16 character password, then you could use the first 4 characters of the domain name and a ratio of 4:1, or you could take the first 8 characters of the domain name and a ratio of 2:1. Keep this in mind, because from here on out, it gets challenging, and you'll need your ratio later.

Off The Grid can be described in the following steps:

  1. You are always using the domain name for building your passwords.
  2. Determine how many characters from the domain name you will need to build your password.
  3. Find the first character of the domain name in the first row.
  4. Find the second character of the domain name in the column below the first character.
  5. Find the third character of the domain name in the row of the previous character.
  6. Alternate rows and columns until you reach the last character of the domain name.
  7. Starting at your new position, find the first character of the domain name in that row.
  8. Overshoot in that row by the ratio you determined before. If your ratio is 2:1, overshoot 2 characters. If your ratio is 4:1, overshoot 4 characters.
  9. Write down the characters you overshot with. These will build your password.
  10. Now at your new overshot position, find the second character of the domain name in the column below the overshot character.
  11. Overshoot in that column by the ratio you determined before.
  12. Write down the characters you overshot with.
  13. Continue alternating rows and columns, overshooting by your ratio, and writing down the overshot characters, until you've traversed the domain name.
  14. You now have your password.

PHEW!!!

Building passwords using Off The Grid is tricky. Once you understand the above steps, and practice them, it will become easy. But the initial understanding of how the system works is a bit of a pain in the behind. In the next section, I'll be providing an example.
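First, though, here is the shape of those steps in code. This is a rough sketch under several assumptions (scanning rightward and downward with wraparound, ignoring the border and the same-character exception described later), not Steve's implementation:

def find(seq, ch):
    # Case-insensitive lookup, since the grid mixes upper- and lowercase.
    return [c.lower() for c in seq].index(ch.lower())

def otg_password(grid, domain, ratio=3):
    # Trace `domain` through a 26x26 Latin square `grid` (a list of 26 strings).
    x = y = 0
    in_row = True                          # phase 1 starts by searching the first row
    for ch in domain:                      # phase 1: walk to the starting position
        if in_row:
            x = find(grid[y], ch)
        else:
            y = find([row[x] for row in grid], ch)
        in_row = not in_row
    password = []
    for ch in domain:                      # phase 2: retrace, overshooting each time
        if in_row:
            x = find(grid[y], ch)
            for _ in range(ratio):         # overshoot along the row, wrapping
                x = (x + 1) % 26
                password.append(grid[y][x])
        else:
            y = find([row[x] for row in grid], ch)
            for _ in range(ratio):         # overshoot down the column, wrapping
                y = (y + 1) % 26
                password.append(grid[y][x])
        in_row = not in_row
    return "".join(password)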

A Simple Example

Visiting his website, I created the following sample grid that we'll use for this post:

A Sample Off The Grid key.

Suppose we wish to set up a password for example.com, and suppose that example.com does not restrict passwords for our account in any way. As such, we are fine building passwords with just alphabetic characters. We'll handle how to add non-alphabetic characters to our password in a minute. I need to build a 15-character password. The text "example" is 7 characters, so I'll take the first 5 characters, "examp", and use a ratio of 3:1 for building my password.

First step is to locate the letter 'e' in the first row. You can decide to use any row, or column for your first letter. However, the goal is to be consistent, as using a different row or column for the same domain will produce different results. In these examples, I'll be starting in the first row. In this example, it's in the 25th column from the left edge. Now I locate the 'x' in the 25th column. It's in the 18th row from the top in the 25th column. Now I look along that row for 'a'. I continue in this manner, until I've worked out "examp". On my example card, my path after phase 1 would look like this:

Tracing the text 'examp' on our sample OTG key.

Now I'm in a new location, ready to start creating my password. This is phase 2, the overshoot phase. Because I ended looking for 'p' in the same row as 'a', I'm going to switch to looking for 'e' in the same column as 'p' to start this phase. In other words, I am constantly alternating row, column, row, column, etc., even when switching phases. However, rather than stopping at the letter 'e', I will overshoot by 3 characters, recording the characters I have overshot.

So, I find 'e' then continue moving 3 characters in the column, reaching 'a' in my example grid. The 3 characters that were overshot were 'Mta'. These are the first 3 characters of my domain password for example.com.

I now look for 'x' in the same row as my overshot 'a'. I will again overshoot by 3 characters to the letter 'P'. The overshot characters in that row were 'zhP', and are the second set of characters for my password. Thus, up to this point, my password for example.com has become 'MtazhP'.

I will continue alternating rows and columns, overshooting by 3 characters, writing down the overshot characters as my password, until I've overshot for every character in "examp". My password is "MtazhPfoYgpJvWy", and my path would have looked like this in phase 2 (the domain path is traced in red, with the overshoot characters traced in blue):

Tracing the text 'examp' again, this time overshooting by 3 characters on our sample OTG key.

Notice that my overshoot needed to wrap around the card. This should be expected in phase 2, and wrapping around the card is appropriate.

Make sure that you fully understand these two phases of creating your password before continuing. You want to make sure you get it right every time for every domain, without error.

Handling non-alphabetic characters

Now it's time to get a bit ugly. Up to this point, the OTG system only generates passwords with alphabetic characters. It does not generate passwords with anything else. This can be problematic for sites that require the use of non-alphabetic characters in the password. Further, some domains may have a '.', '-' or digit as part of the domain. These non-alphabetic characters are not part of the inner OTG grid, so creating a password with them is impossible.

This is where the outer border comes into play. First notice that a digit and non-alphanumeric character are always opposite of each other on the border. This is by design. In our case, opposite of '#' on the top border is a '2' on the bottom border. Opposite of the ',' on the bottom border, is the digit '8' on the top border, and so forth. Let's first discuss how to add non-alphabetic characters to our passwords:

Letting the OTG choose

While traversing the grid in phase 2, building your password, you can incorporate the border into your overshoot. In my previous example of "example.com", we traced out the word "examp", and overshot with 3 characters, all of the form <grid><grid><grid>. Rather, we could have used our 3rd overshoot character as our border character. As such, our pattern would have been <grid><grid><border>. In that case, we would stop at the 2nd overshoot character, rather than the border, to find the rest of the characters in our domain. As a result, for every character in the domain, there will be a non-alphabetic character:

Sample OTG showing a possible path for building a password with non-alphabetic characters.

In this case, the password for example.com would be "Mt.bA1oQ7cZ[IA2"

This example also brings up a good point: when the overshoot character is the same character as the next character you need to traverse to, skip one more character. In other words, notice that with "examp", when we travel to "x", our overshoot character is "A", which is the next character we would be traveling to. So, I overshot one more character to "W", then traveled to my "a" in "examp". If you think that exception sucks, you're right. It does. It's just another rule that can introduce human error into the equation.

Further, notice that I needed to wrap around the card, as noted earlier. Yet my border character technically came after "c" and before "Z". However, I still put the border character at the end, for consistency with the rest of the password, so I don't have to keep yet another rule in my head.

Letting the OTG help

Some sites do not allow non-alphanumeric characters in the password at all. In other words, it's letters and digits only. Part of the border pattern is that there is exactly one digit in the border for every row and every column. So, rather than overshooting to a non-alphanumeric character, you can use the digit in that row or column instead. This way, our previous password would be "Mt5bA1oQ7cZ7IA2".

Tack something onto the beginning or end

Of course, you could always just tack a special character or two at the beginning or end of the password. Your logic for how this is accomplished is up to you. You could use the beginning row/column border character in phase-2, or the ending row/column border character. Whatever makes the most sense for you.

Handling domains with non-alphabetic characters

Unfortunately, some domains have non-alphabetic characters in their domain name. Although rare, they do exist, and OTG can't rule out the possibility that they will be encountered. As such, in the center of the top border are 12 characters: the digits 0 through 9, the period and the dash. If one of these characters is encountered while traveling through the OTG with the domain, travel to the character in the first row immediately below it in the same column. For example, if your domain had a "5" in it, you would travel to "f" in the first row of our example OTG card. If there are consecutive numbers in your domain, then unfortunately, I am not sure exactly how that is to be handled.

Criticisms, Thoughts and Conclusion

DOUBLE PHEW!!!

If you have made it to this point, you're determined to learn OTG. In my opinion, Steve's OTG paper system has some pretty serious problems:

  • The rules are technical and many, making the entire system very cumbersome to use.
  • Overall, creating your passwords is slow, slow, slow. Hopefully you store them in an encrypted database for retrieval, because recreating them is very slow, and not something you want to be fighting with when under stress.
  • Due to the cumbersome nature of OTG, creating your passwords can be very error prone.
  • The OTG card is too large to fit into your wallet and carry with you; shrunk to wallet size, you couldn't read the characters on the grid without a magnifying glass.
  • If the card is lost, and the card is attributed to you, then because you are using domains to create the passwords, the attacker could easily figure out the passwords to your accounts with the card. As such, you may want to prepend a 1- or 2-character salt onto the domain before beginning phase 1.

These are probably my biggest criticisms of the OTG system. While slick for a paper-based encryption system, it's a lot of visual scanning and lookups, a lot of rules and exceptions to remember, and a lot of areas that can create errors. Some die-hards may use this for their passwords, but I don't think I can recommend it, even to some of the most nerdy, security conscious friends I have.

In my opinion, the PasswordCard is a much more elegant system that doesn't require the user to keep a lot of rules and exceptions in their head, and if the card is lost, the attacker still has a great amount of work to do figuring out what the password is, and to which account it belongs.

To be fair, the OTG system is marked as "Work in Progress", so there may be adjustments to the system in the future that make it more of an elegant system for creating passwords. But I think the whole thing will need to be reworked, as traversing a Latin Square to create passwords is just too cumbersome for practical use.

Creating Strong Passwords Without A Computer, Part II - The PasswordCard

Previously, I used entropy as a backdrop for creating strong passwords. It's important that you read that article and fully understand it before moving on with the rest of the series.

Continuing our series about creating strong passwords without a computer, we look at a method I've blogged about in the past: The PasswordCard. The idea is simple: carry around a card with you, that has all of your passwords written down in plaintext, obfuscated with additional text surrounding the password. Let's look at it in more detail.

The Passwordcard Idea

The PasswordCard is a GPLv3 web application that generates a small credit-card sized card that you can fit into your wallet for password generation. The way you would generate a password is simple. Suppose you wanted to generate a password for an online account, such as an email provider. You could pull out your PasswordCard, determine a starting location, a direction, and a length, and use the resulting characters for your password. Because the PasswordCard is a two-dimensional table of characters, the direction of your password can take any direction, such as left, right, up, down, diagonally, spiral, or any combination of directions. Because the length of your password can theoretically be infinite, so too would be the search space, if someone were to get access to your card.

The PasswordCard uses the following character sets when generating a card on the site:

  • alphanumeric: "23456789abcdefghjkmnpqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ"
  • alphanumeric plus symbols: "23456789abcdefghjkmnpqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ@#$%&*<>?€+{}[]()/\"
  • numeric: "0123456789"

As such, you can expect the following entropy sizes per character when creating a card:

  • Alphanumeric (55 unique characters): 5.78-bits.
  • Alphanumeric plus symbols (76 unique characters): 6.25-bits.

When checking the box "check this for an area with only digits", it doesn't change the search space for the already existing alphanumeric output that is default when loading the page. It only creates 4 rows of digits for easy PIN generation. Your entropy per character is only increased when checking the box "check this to include symbols".

A Practical PasswordCard

Before beginning, let's use a sample card to work from. This is card 7327e6c258cb3d52, in case you also want to generate it for following along:

Sample PasswordCard

To generate a password, you would need to pick a starting location for your password. You'll notice that there are 29 columns and 8 rows. The columns are identified by symbols, while the rows are identified both by a color and a number. For each account, all you need to remember are 3 things:

  • The starting location (should be different for every account).
  • The direction (should be the same for each account).
  • The length (should be the same for each account).

This should be easier to remember than "NyFScbUJ>X7j?", but your brain might be wired differently than mine.

Because this PasswordCard is using both alphanumeric and non-alphanumeric characters on the card, and our entropy per character is approximately 6.25-bits, and further, because we established that we need a minimum of 80-bits of entropy in our passwords, we should generate at least a 13-character password off of the card.
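As a quick sanity check, here's that arithmetic in Python, using the character counts quoted earlier:

import math

# Entropy per character is log2 of the character set size.
print(math.log2(55))                  # alphanumeric: ~5.78 bits per character
print(math.log2(76))                  # with symbols: ~6.25 bits per character
print(math.ceil(80 / math.log2(76)))  # characters needed for 80 bits: 13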

Some PasswordCard Direction Ideas

Most of those who use the PasswordCard to generate their passwords are probably just reading the passwords left-to-right. I would not recommend this, as it's the lowest-hanging fruit for an attacker who has access to a hashed password database and your password card. Instead, I would recommend doing something non-standard for the password direction. Let's consider a few examples, each of them starting at ⊙4 on the PasswordCard:

Wall Bouncing

When you hit a wall, whether travelling diagonally or otherwise, you could "bounce" off the wall, much like a pool ball would when playing billiards:

Sample PasswordCard showing a path bouncing off of the walls.

In this case, the password would be "f+zwWYB[5\C<u".

Wall Bending

Rather than bouncing off the wall like pool balls, you could "bend" along the wall, much like turning a corner when driving on a street:

Sample PasswordCard showing a path bending along the wall.

In this case, the password would be "f%efcEBnNQk7\".

Pac-Man Walls

Another alternative would be to treat the walls as a topological spheroid, much like Pac-Man. In other words, when you come to the edge of the room, you travel out the other side on the same row, or the same column, continuing in your direction:

Sample PasswordCard showing a path Pac-Man might take when traveling to the edge of our wall.

In this case, the password would be "f+zwrpt9n2B&y".

Spiral paths

My last example shows a spiral path, starting from the same location as the previous examples, this case moving clockwise:

Sample PasswordCard showing a spiral path.

In this case, the password would be "fY+%FqvGYerrz".

As a strong suggestion, whatever direction and length you take for your passwords, you should keep them consistent. If you decide to do a 13-character clockwise spiral for one account, you should do a 13-character clockwise spiral for all of your accounts. The only thing changing is the location of the password. This will greatly simplify identifying each password for each account. If you change up the lengths and directions, as well as the starting location for each account, you run the risk of having a very difficult time finding the correct password for that account. If your brain has that mental capacity, then more power to you. Otherwise, I would keep it consistent.

A Couple Thoughts

The nice thing about the PasswordCard, is that all of your passwords are already written down for you in plaintext. However, if a password cracker were to get access to your card, they would need to know which starting location belongs to which account, the direction the password takes, as well as the length of the password. This is too many variables for an attacker to make efficient use of his time. His time would be better spent taking the character sets off of the card, and building an incremental brute-force search. Provided your password has sufficient entropy, you will likely thwart the attack.

There are a couple disadvantages with the PasswordCard, however. The first is that it is not well suited for the blind. Unlike Diceware, which can be easily adapted for the blind, this is a bit more of a challenge. While I'm not asserting that it's impossible, it certainly seems difficult to reproduce practically. The second disadvantage is the use of the "€" euro symbol. The PasswordCard is developed by a Java developer in the Euro Zone. While it makes sense for him to include the character, it alienates those who don't easily have access to it on their keyboards, such as those using the basic ASCII character set. As such, you may want to refresh your browser, generating random cards until you find one without the "€" character in its output.

Lastly, you will definitely want to print a second card, and keep it somewhere safe as a backup, should you lose the original. Keeping the card in your wallet or purse makes the most sense, as your wallet or purse is likely the most protected object in your possession, next to your phone and keys. But, should you lose the card, you will want to get access to your passwords, which will mean getting access to your backup copy.

Conclusion

I personally like the PasswordCard. It's simple, small, and doesn't require me to carry a lot of items with me, such as dice and a large word list. My only concern is being able to choose a new starting location for each account. I'm not as random as I would like to think when finding a starting location, so I wrote a script to handle that for me. But it's clean, out of the way, and works really well. When I don't have an account password memorized, I can pull out the card, remember where it starts, and just start typing. Generation is quick, and remembering password locations is easy. Highly recommended.
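For what it's worth, a random-start helper can be as small as this (a sketch of the idea, not my actual script):

import secrets

# Pick a uniformly random starting cell on the 29x8 card.
row = secrets.randbelow(8) + 1   # rows are numbered 1-8
col = secrets.randbelow(29) + 1  # columns map to the symbols across the top
print("start at row %d, column %d" % (row, col))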

Creating Strong Passwords Without A Computer, Part I - Diceware

Previously, I used entropy as a backdrop for creating strong passwords. It's important that you read that article and fully understand it before moving on with the rest of the series.

Now let's begin generating passwords. We'll start off first with Diceware.

Diceware

Diceware meets these 2 qualifications that we should use when building our passwords. However, Diceware prefers to call them "passphrases" rather than passwords, as your password will actually be multiple words strung together, rather than just a single word. The passphrase is built by rolling 5 fair 6-sided dice (or 1 fair 6-sided die rolled 5 times), and looking up the results in a word list.

The word list comprises 7,776 words (the total number of combinations from 5 fair 6-sided dice). Each word has a lookup number that corresponds to the dice roll. For example, if you rolled "44311", then your first word from the word list would be "oint". I say "first word", because you now need to make another roll. You need to continue rolling until your passphrase contains at least 80-bits of entropy.

Because there are 7,776 possible words in the word list, then each word contains about 12.95 bits of entropy. This means you will need to roll your 5 dice seven times (six rolls will only produce 77.7-bits of entropy) to achieve the minimum. Starting with my first word, and rolling six more times, here are the results of my dice rolls:

44311 oint
12115 alum
16335 cg
64566 xs
22213 cut
43221 mutt
53143 scar

Or, "ointalumcgxscutmuttscar", which is 23 total characters in length. This is a semi-lengthy password, no doubt, but it meets our criteria to be truly random and contains sufficient entropy. Further, because the word list can be printed, you can generate secure, and strong passwords without the aid of a computer.

Variation 1- Portable Diceware

Carrying around a word list of 7,776 words might not be very practical. After all, if you store it in your wallet, assuming you can hold about 10-by-30 characters on each side of a card, you would need to print close to 175 cards to fit the whole Diceware word list. This just isn't practical. You could store the word list as a PDF and carry it on your phone, but not everyone has a phone capable of installing a PDF reader, and we're trying to achieve this without the aid of any computing device. Let's dig further.

For carrying around only one or two cards in your wallet, we'll need to generate some tables. Thankfully the tables are small, and you can still generate secure passwords. Unfortunately, the passwords will not be as easy to remember as using the original word list. Consider the following table:

If first roll=1 or 2               3 or 4               5 or 6
           Second Roll          Second Roll          Second Roll
         1  2  3  4  5  6     1  2  3  4  5  6     1  2  3  4  5  6

T  1     A  B  C  D  E  F     a  b  c  d  e  f     !  @  #  $  %  ^     
h  2     G  H  I  J  K  L     g  h  i  j  k  l     &  *  (  )  -  =
i  3     M  N  O  P  Q  R     m  n  o  p  q  r     +  [  ]  {  }  \  
r  4     S  T  U  V  W  X     s  t  u  v  w  x     |  `  ;  :  '  "
d  5     Y  Z  0  1  2  3     y  z  ~  _  sp       <  >  /  ?  .  ,
   6     4  5  6  7  8  9

In this case, I will only need 3 fair 6-sided dice (or 1 fair 6-sided die rolled three times), rather than 5. Suppose I roll "614". The "6" means I would use the third table. The "1" means the first column in the third table, and the "4" is the fourth row in the 1st column of the third table, or "|". All 94 printable ASCII characters, plus the space, are represented in these tables. Each character gives about 6.57-bits of entropy, which means you would only need to roll your 3 fair 6-sided dice thirteen times to get enough entropy to meet our requirement for at least 80-bits of entropy.

As an example, consider the following rolls:

614 622 224 461 424 155 565 113 255 322 136 631 544

This would produce:

614 |
622 *
224 T
461 f
424 t
155 2
565 ,
113 M
255 2
322 h
136 6
631 #
544 :

Or "|*Tft2,M2h6#:" as our password. This password contains 85.41-bits of entropy, and was created at random. The characters "sp" represent the ASCII space. If you reach a table blank on any of your rolls, such as rolling "616", or "365", just roll again.

If you only need to create a password that uses just letters and numbers, then you only need to use the first table, and you only need two dice. However, each character only gives about 5.17-bits of entropy. As such, we would need a 16-character password to achieve our 80-bits minimum.

There are other variations on the tables with dice that you can use, such as generating random hexadecimal strings, random decimal numbers, special characters, and other requirements. See the Diceware FAQ for more information.

Variation 2- Dictionaryware

While carrying around a word list in your wallet or purse might not be practical, you may have a dictionary in your bookshelf, or the place you are visiting might have a dictionary you can borrow. The tricky part about dictionaries, however, is determining your search space, so you can accurately calculate entropy. Thankfully, we just need to put on our thinking caps, do a bit of math, and we can arrive at a good number.

My Merriam-Webster Dictionary contains approximately 57,000 defined words, across 820 pages of printed text. This averages to 70 dictionary words per page. Each page is divided into two columns, which gives me about 35 dictionary words per column. I'll use the same 5 fair 6-sided dice I used in my initial Diceware. Because my dictionary's page numbers are 3 digits, the first 3 dice will tell me the page number of the dictionary. The 4th die will tell me which column the word will come from; if the die is odd (1, 3, or 5), the first (left) column is used, if the die is even (2, 4, or 6), then the second (right) column is used. The 5th die will tell me the word in that column, which means only using the first 6 words in each column.

As an example, if my roll was "56351", then I would turn to page "563", use the first column on the page, and the first word, which is "Pullman".

Obviously, there are a great number of pages skipped, and a lot of words skipped. To understand how much entropy each word provides, I need to figure out how many words are available given my limitations with 6-sided dice. First, the following pages in my dictionary are skipped:

  • 1-110 (a-calm)
  • 167-210 (convolution-disgust)
  • 267-310 (festoon-GQ)
  • 367-410 (inhale-litigious)
  • 467-510 (natty-patchwork)
  • 567-610 (QM-rumble)
  • 667-820 (stab-zymurgy)

That's a total of 484 pages eliminated from the book. But the dice actually eliminate more than these contiguous ranges: because each die only shows 1 through 6, any page with a 0, 7, 8, or 9 in any digit (page 117, for example) can never be rolled either. That leaves 6 x 6 x 6 = 216 reachable pages. Because I can only choose the first 6 words from each column, or 12 words per page, that gives me 2,592 total words available to pick from. As such, each word provides about 11.34-bits of entropy, which means I need at least 8 words from my dictionary to reach my 80-bits entropy minimum for my passphrase (7 words give only about 79.4-bits, just shy of the mark).
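The arithmetic, for the skeptical:

import math

pages = 6 ** 3            # only pages with all digits 1-6 are reachable
words = pages * 12        # 2 columns x 6 words per page
print(words)              # 2592
print(math.log2(words))   # ~11.34 bits per word
print(math.ceil(80 / math.log2(words)))  # 8 words needed for 80 bits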

As an example, if I use my rolls that I used at the beginning of this post, then my result would be:

44311 midday
12115 castled
16335 constancy
64566 skew
22213 drag
43221 maunder
53143 plantain

Or "middaycastledconstancyskewdragmaunderplantain". That's 45 characters in length, which is rather lengthy to achieve the minimum amount of entropy as our initial Diceware roll at the start of this post. This is due to the possibility of words in the English language being longer than 7 characters, which doesn't exist in our Diceware list. As such, you will likely get longer passphrases using an English dictionary versus using the Diceware list.

Some points to take into consideration when using "Dictionaryware":

Different dictionaries will need different adjustments to accommodate the number of pages and the number of columns. You just need to make sure that the dice are picking the word, and not you. If your dictionary is smaller than 600 pages, you may need to come up with a system for handling the digits 0, 7, 8, and 9 to get sufficient entropy. Additional dice rolls or a lookup table could work, but it complicates the process.

Some dictionaries might define a word two, three, or more times, based on it being a noun, verb, adjective or abbreviation. This will reduce our total search space, which will reduce our entropy per word. So, my example of 11.34-bits of entropy per word is a maximum. It may require a bit more work to determine a more accurate entropy estimate.

Variation 3- Coinware

Even carrying around dice can be impractical. However, it is much more likely that you are carrying around spare change in your pockets, or have some sitting in a desk drawer at work. Provided that the coin flips fairly between heads and tails, you can flip a coin to build your passphrase.

Preferably, you'll want to use 3 separate coins (penny, nickel, dime), but if you only have a single penny, or 3 pennies, that will work too. The idea is that you toss the three coins, which will identify a single throw of a die. So, 15 coin tosses will determine your 5 dice rolls. Using Coinware requires the following look up table:

    Results of Coin Toss
      Penny Nickel Dime
 
D  1    T     T     T
i  2    T     T     H
e  3    T     H     T
   4    T     H     H
R  5    H     T     T
o  6    H     T     H
l  *    H     H     T
l  *    H     H     H

If your coin tosses produce a "*", re-flip your coins. As such, to get the dice roll of "44311", I would have needed to get the following coin flips:

THH 4
THH 4
THT 3
TTT 1
TTT 1

This would produce my word "oint" from the Diceware word list. I would then need to proceed six more times to get my seven words necessary for reaching my 80-bits of entropy minimum. If you think that flipping 3 coins for 1 die roll is a lot of work, you're right. It is. You would be better off getting some dice.
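The coin-to-die mapping is just a 3-bit number with two values thrown away. Here's a sketch in Python (T=0, H=1, penny as the high bit):

import secrets

def coin_die():
    # Flip penny, nickel, dime; HHT and HHH (the "*" rows) force a re-flip.
    while True:
        penny, nickel, dime = (secrets.randbelow(2) for _ in range(3))
        n = (penny << 2) | (nickel << 1) | dime  # TTT=0 ... HHH=7
        if n < 6:
            return n + 1  # die faces 1-6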

Dice Considerations

I would be remiss if I didn't mention something about the randomness of the dice themselves. No doubt, dice can be loaded, imbalanced, burned, and/or lopsided to favor certain sides. For obvious reasons, you should avoid using "bad dice" when generating your Diceware passphrases. You want random to be on your side as much as possible (you want a "strong password", don't you?). Each side of a die should be just as likely to come up as the other five sides.

Unfortunately, "gaming dice" that you get from your local hobby store, or that come with board games, aren't fair 6-sided dice. But, they're probably "good enough". In other words, one die may favor a "4" due to imperfections in the die material, weighting it on the "3", while another die may favor a "6", because it has had more material drilled out of it than the "1". Finally, one die may have a more chamfered corner or edge than the rest of the corners or edges, slightly favoring a specific number or pair of numbers. Taking these imperfections, and the position the dice fall with respect to the others will probably give you enough randomness as to not be repeatable from throw-to-throw.

If you want to see if your dice favor numbers, get a tall cylinder, such as a large water bottle. Fill it with water, and drop in your die, then seal off the top. The die will sink to the bottom. Turn over the cylinder, let the die drop to the bottom, and record its number. You should do this at least 30 times for a decent sample size. In fact, the larger the sample size, the more accurate the results. Each number should come up 1/6 of the time, for that die. See Playing Fair with the Chi Square Test of Homogeneity for more information about testing fairness in dice.

However, there are "precision dice", or "casino quality dice" which are guaranteed to have each number equally as likely as the other six within a .0001 margin of error between any two numbers (in other words, if you threw the die 10,000 times, a favored number would come up 1 more time than another). If you live close to a casino, you can probably purchase used casino dice on the cheap. Even though the corners and edges will be slightly chamfered, thus beginning to show slight favoring in numbers, they are still likely more "fair" than your store-bought dice, and will probably continue to exhibit more fairness in each throw for a good long while.

If you search for "precision casino dice", you will find some listings on Amazon, eBay, and other locations. Most of these dice have razor edges and corners, meaning they are not rounded, and the corners are sharp. As such, the dice don't "roll" as well as dice purchased from a hobby store. They tend to land with a solid fall on the number. This also means they won't roll off your table when throwing them. Many of these dice will be transparent, so you can see the pip depth, and will also have a serial number printed on the die if purchased in a set. These dice are more expensive, but they will be the best choice in choosing fair 6-sided dice.

Rix Dice are precision machined metal dice, with chamfered corners and edges for better rolling. They'll last longer than acrylic or plastic dice, and the creator, Amber Rix, has paid attention to the dice being fair. They might be a consideration, if you plan on rolling them a lot. They are the most expensive precision dice I've come across, but probably the most durable too.

Conclusion

Diceware, and its variants, can be a secure way of generating strong passwords that would withstand even the most sophisticated offline attacks on a hashed password database (which is the real threat). If you pay attention to entropy and randomization, then you'll create the strong passwords that we qualified at the beginning of this post. As such, even the most powerful brute force searches on a hashed database won't reveal your password. And Diceware makes it easy to remember your password, using only lowercase letters.

Creating Strong Passwords Without A Computer, Part 0 - Understanding Entropy

I've written a new series that investigates the art of creating very strong passwords without the aid of a computer. Sure, there are many software applications that will generate strong passwords for you, each with their own advantages and disadvantages. They're so plentiful, that it would be impossible to outline them all. However, generating passwords without the aid of a computer can be a very challenging, yet very rewarding process. It may even be impractical or impossible for you to install software on the computer you're using at the moment, when you need to generate a password.

Introduction

Before we start diving into passwords, we need to define what a "strong password" is. I've defined this many times on this blog, and as long as I keep blogging about passwords, I'll continue to define it. A "strong password" is one that is defined by entropy. The more the entropy, the stronger the password. Further, a "strong password" is one that is defined by true randomness, which means it was not influenced by a human being during the creation.

We're told that passwords must have the following characteristics when creating them:

  • Passwords should be long.
  • Passwords should contain both uppercase and lowercase letters.
  • Passwords should contain digits.
  • Passwords should contain special punctuation characters.
  • Passwords should be unique for every account.
  • Passwords should be easy to remember.
  • Passwords should not be written down.
  • Passwords should not be based on dictionary words.
  • Passwords should not contain birthdates, anniversaries, or other personally identifiable information.

These rules can be difficult, because following all of them makes passwords difficult to remember, especially if you have a unique password for every account. It's almost like creating a witches' brew, just so you can create the perfect password potion:

Double, double toil and trouble;
Fire burn and cauldron bubble.
Fillet of a fenny snake,
In the cauldron boil and bake;
Eye of newt, and toe of frog,
Wool of bat, and tongue of dog,
Adder’s fork, and blind-worm’s sting,
Lizard’s leg, and howlet’s wing,
For a charm of powerful trouble,
Like a hell-broth boil and bubble.

Personally, I find it all a bit confusing, and even more annoying that security-conscious blogs endorse that garbage. I tend to keep things much more simple:

  • A password must contain great amounts of entropy (we'll quantify this in a minute).
  • A password must be truly random (no human interference).

So, what is entropy, and how do we quantify it? Let's begin.

Defining Entropy

Entropy can be found in many fields of study. Abstractly, entropy is defined as:

1. a thermodynamic quantity representing the unavailability of a system's thermal energy for conversion into mechanical work, often interpreted as the degree of disorder or randomness in the system.
2. lack of order or predictability; gradual decline into disorder. "a marketplace where entropy reigns supreme".

So entropy is a measure of disorder, or unpredictability. For our needs, we'll use entropy from Claude Shannon's Information Theory, a branch of Mathematics, which is defined mathematically as follows:

H = L * log2(N)

where:

  • H = entropy in binary bits.
  • L = the number of symbols chosen for your message (length).
  • N = the total number of unique possibilities for each symbol.

Or, if you would like to enter this on your calculator:

H = L * log(N) / log(2)

This can be proven quite simply. First off, let us define the logarithm. As with standard mathematics, we'll define it as the inverse of the exponential function. In other words:

y = log_b(x) if and only if b^y = x

Suppose we want to find the entropy of a 13-character password taking advantage of all 94 printable symbols on the ASCII keyboard. If all 94 printable symbols are a valid choice for each of the 13 characters in my passphrase, then the total number of combinations my password could be, expressed as a power of 2, is:

94^13 = 2^H, which means H = log2(94^13)

Now, a property of logarithms is the ability to change base. Because entropy is defined in bits, or base-2, I'll change the base of my logarithm, as follows:

H = log(94^13) / log(2)

Rewriting the equation, we get the following result:

H = 13 * log(94) / log(2) ≈ 85.21 bits

And thus, we've arrived at our conclusion.

This assumes that the message was chosen completely at random. However, if there is human intervention in creating the message, then the equation gets much more difficult. As such, for the point of this post, and most of my posts detailing entropy, we'll assume that the password or message was chosen completely at random.

Some Entropy Equation Examples

If you create a message from lowercase letters only, then the number of unique possibilities for each symbol is 26, as there are only 26 letters in the English alphabet. So, each character provides only 4.7 bits of entropy:

log2(26) ≈ 4.70 bits per character

If you created a message from lowercase letters, uppercase letters and digits, then the number of unique possibilities for each symbol is 62, as there are 26 unique lowercase letters, 26 unique uppercase letters, and 10 digits. So, each character would provide only 5.95 bits of entropy:

log2(62) ≈ 5.95 bits per character
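Wrapped up as a function, a trivial sketch of the formula above:

import math

def entropy_bits(length, charset_size):
    # H = L * log2(N)
    return length * math.log2(charset_size)

print(entropy_bits(13, 94))  # ~85.21 bits: 13 characters, all printable ASCII
print(entropy_bits(1, 26))   # ~4.70 bits per lowercase letter
print(entropy_bits(1, 62))   # ~5.95 bits per alphanumeric character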

A Needle In A Haystack

Knowing how much entropy is needed when creating a password can be tricky. Is 50-bits enough? Do I need 60-bits instead? 80-bits? 128? 256? More? In order to get a good firm grasp on quantifying entropy, I want to first create an analogy:

Entropy is to a haystack, as your password is to a needle.

Finding your password means searching through your entropy space. Your password should have a minimum amount of entropy. That means that the password will NOT be found in entropy spaces smaller than what your password is defined in. So if your haystack, or entropy, is 256-bits in size, your needle, or password, should not be found in 255-bit haystacks, or smaller.

To quantify this a bit: in a previous post, I demonstrated that cryptanalysis has reduced SHA1 to an effective search space of 61-bits, and it turns out that is far too small. At the time of this writing, the bitcoin network is processing 2^61 SHA256 hashes every 76 seconds using specialized hardware called ASICs. That computing power is only going to grow, as there is financial gain to be made from mining.

In other words, with these ASICs, 30 quadrillion pieces of hay can be analyzed every second. If your haystack has 2^61 pieces of hay, one of which is actually your needle, this haystack can be fully processed in 76 seconds flat.

How Entropy Relates To Passwords

If we continue using the same bitcoin network model for our speed benchmark, then at 30 quadrillion pieces of hay (passwords) every second, completely exhausting the full output space for the following haystack sizes would take:

  • 64-bits: 10 minutes.
  • 72-bits: 2 days.
  • 80-bits: 15 months.
  • 88-bits: 327 years.
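These figures fall out directly from the definition, assuming a constant 30 quadrillion guesses per second:

rate = 30e15  # password guesses per second

for bits in (64, 72, 80, 88):
    days = 2 ** bits / rate / 86400
    print("%d-bits: %.2f days" % (bits, days))
# 64-bits: 0.01 days (about 10 minutes)
# 72-bits: 1.82 days
# 80-bits: 466.41 days (about 15 months)
# 88-bits: 119400.08 days (about 327 years)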

In the past, I've shown that 72-bits of entropy (your haystack) seems to be sufficient for passwords (your needle), using the RSA 72-bit distributed computing project on distributed.net. After analyzing the bitcoin network mining operations, and how trivial it is to build specialized ASICs for these tasks, I'm beginning to think that you should have at least 80-bits of entropy in your passwords. As computing gets stronger and stronger, that number will only continue to increase.

As a result, to create a "strong password", I would recommend the following:

  • A password must contain at least 80-bits of entropy.
  • A password must be truly random.

Conclusion

Entropy is how we measure unpredictability. If we need any sort of unpredictability, such as finding passwords (needles in a haystack), then the more entropy we have, the better off we are. On computer systems, entropy is stored in what's called an "entropy pool". The larger the pool, the more reliably the operating system can generate true random numbers for security applications, such as GPG and long term SSL keys. The same can be said for passwords. The more entropy we have, the better off our passwords can become.

So, don't starve yourself by selecting weak 8-10 character passwords, and trying to generate random data in your head. Generate real random passwords and use large amounts of entropy.

The Reality of SHA1

Many people don't understand crypto. That's okay. I don't either. But, I do get math, and I want to show you something SIGNIFICANT that affects your everyday habits online.

It's been demonstrated that MD5 is broken. It's now trivial to find what are called "collisions". This is where two completely different inputs hash to the same output. Here is a demonstration: http://www.mscs.dal.ca/~selinger/md5collision/

SHA1 was meant to be a replacement for MD5. MD5 has an output space of only 128-bits, whereas SHA1 has an output space of 160-bits. SHA1 is also designed differently than MD5, and is meant not to suffer the same sorts of weaknesses or attacks that MD5 faces. However, over time, cryptographers have been able to severely attack SHA1, and as a result, they've all been warning us to get off SHA1 and move to SHA2. It should take 2^160 operations to find a collision with SHA1 by brute force; using the Birthday Paradox, we have a 50% probability of finding a SHA1 collision in about 2^80 operations. However, cryptanalysts have torn down SHA1 to a complexity of only 2^61 operations. Even better.

A search space of only 61-bits is small. For comparison's sake, the distributed computing project http://distributed.net cracked a 64-bit RSA key in just over 5 years, at a snail's pace of 102 billion keys per second: http://stats.distributed.net/projects.php?project_id=5. The motivation was a $10,000 award from RSA Laboratories to find the secret key encrypting a message.

Granted, you don't need to search the entire 64-bit keyspace to find the key. It's just as likely you'll find the key immediately at the start of your search as it is to find the key at the end. But it shows how reachable 64-bits is. The reduced 61-bit search space from SHA1's attack vector is 8x smaller than the 64-bit space of that RSA secret key challenge. So, at 102 billion hashes per second, it's reasonable to conclude that you could exhaust the 61-bit search space somewhere between 6 and 9 months.

This is far too slow. Let's look at another comparison. Bitcoin.

First, Bitcoin uses SHA256 rather than SHA1 as part of its mining algorithms. According to http://blockchain.info/charts/hash-rate, the Bitcoin network is processing about 30 million gigahashes per second. That's 30 quadrillion hashes per second. The size of 61-bits is a mere 2.3 quintillion hashes. Let's do some math:

2.3 quintillion hashes divided by 30 quadrillion hashes per second = 76 seconds.
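Or, in Python:

# Checking the arithmetic behind the 76 seconds quoted above:
hashes = 2 ** 61      # ~2.3 quintillion
rate = 30e15          # ~30 quadrillion hashes per second
print(hashes / rate)  # ~76.9 seconds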

What does this mean? The Bitcoin network is currently working over 2^61 SHA256 hashes every minute and 16 seconds. If this were SHA1, we could brute force 1,150 SHA1 collisions every day.

Why should you care? Because when you connect to a server with SSH, SHA1 is likely presented as the hash. When you use your browser to connect to an HTTPS site using SSL, SHA1 is likely presented as the hash. When you encrypt something with OpenPGP, or cryptographically sign a key or document, SHA1 is likely presented as the hash.

It's the case that most of your encrypted communication online is using SHA1 in one way or another. And yet, it's well within our reach to process 1,150 SHA1 collisions EVERY. SINGLE. DAY. It's long overdue that we replace SHA1 with SHA2 or SHA3. Not because some theoretical proof says so, but because we can actually do it.