Image of the glider from the Game of Life by John Conway
Skip to content

Officially Announcing d-note Version 1.0

I've been looking forward to this post. Finally, on my birthday, it's here. My Python Flask web application of encrypted self-destructing notes is stable, and ready for production use.

History
Around 2011, or so, I started thinking about a way that I could send data privately and securely to friends, family and coworkers, without requiring them to spend a great deal of time setting up PGP keys, knowing a great deal about encryption and security, and other things. Basically, I wanted a way to send them an encrypted note, and transparently, it could be decrypted with very little work, or at most, providing a passphrase to decrypt the note. I knew this would need to be a web application, but I wasn't sure of the implementation details. Eventually, that got worked out, and d-note was born early January of this year.

Shortly after that release, I started focusing on a script that would generate random ASCII art with OpenPGP keys. There will be some additional news about that project I'm excited about, but will announce at a later date. As a result of that OpenPGP project, d-note development took the back seat for a bit. However, a recent pull request from Alan Dawson brought focus back to the code.

Changes And Increased Security
The changes that Alan Dawson introduced were switching away from Blowfish in ECB mode to AES-128 in CBC mode with HMAC-SHA1. This is something that I initially wanted to support, but didn't for two reasons:

  • Without thinking about PBKDF2, AES keys must be 16, 24 or 32 bytes in size. This prevented users from entering personal passwords to encrypt the note, rather than the server-side key.
  • Also, without thinking about PBKDF2, Blowfish was intentionally designed to make the key setup slow, so brute forcing the password out of the encrypted file would be more difficult.

However, Blowfish uses a 64-bit block size for its internal operations, while AES uses 128-bit. This might be of little consequence for the standard plaintext encrypted note, but for encrypting files, which I would like to support in the long term, it could mean the difference of repeated encrypted blocks in Blowfish, or none in AES.

Since the changes introduced by Alan, I've increased the security of the application to AES-256 in CTR with HMAC-SHA512. The application relies on the Python Cryptography Toolkit version 2.6. In version 2.7, new authenticated block cipher modes are introduced, such as GCM. With the ability to use GCM, the need to create an HMAC-SHA512 tag will no longer be needed. However, switching modes will break previously encrypted notes that have not yet been decrypted, so this will need to be handled with care. Also, when SHA3 is standardized, I would like to switch to that if it is introduced into PyCrypto, rather than use SHA512, even though there have been no serious security concerns over SHA2.

The Encryption Process Visualized Without A Custom Passphrase
I wanted to at least show what the encryption process looks like from top-to-bottom visually, so you wouldn't need to piece together the code and figure it out.

First, we start out with the application creating 3 static salts at random. Each salt is 16-bytes in size, and should be different from the other two, although this isn't a requirement. The salts will remain static as long as you wish. They are the only random data that is not deleted, but instead saved on the server. Changing the salts will mean any previously encrypted notes still on the server will no longer be able to be decrypted. As such, once generated, they should be left alone. However, if no encrypted notes remain on the server to be decrypted, the web server could be stopped, the salts changed, and the web server started back up. The only reason you would do this, is if you feel the salts have been compromised in some way.

3-static-salts

When a browser renders the main index.html page, we need to create a unique and random URL to post to. As such, a one-way function is used to generate three random keys, from which we'll build everything from. First, we generate a random 16-byte nonce. This nonce is the key to starting the one-way function to build everything for encrypting and decrypting the note.

random-nonce

All of the security of this application rests on the ability to generate a cryptographically secure random nonce on every page view. We've taken the steps necessary to ensure that not only is that choice cryptographically strong, but all the building blocks that build from it are industry best practices, and over engineered at that. Our one-way function is started by using the salts along with our nonce in a PBKDF2 function. PBKDF2 is a password-based key derivation function that can derive a pseudorandom key of any size, given a nonce and a salt. As such, we use our three salts and the nonce to generate a 16-byte file key, a 32-byte AES key, and a 64-byte HMAC key. Notice that we use each salt only once.

pbkdf2

Now that we have a file key, we can base-64 encode that data, which becomes our file name to save our encrypted data to. This is different than initially released, where the URL would also be the file name. The URL can produce the file name, but the reverse is not true.

filename

Finally, our nonce is base-64 encoded, and becomes the URL that we post to, and the URL that we give to the recipient.

url-creation

At this point, the user is now copying their data into the form. The user has the option of either using a personal passphrase to encrypt the note, or to let the server use its key. It's important to note, that due to our nonce above, unless the same passphrase is chosen for encrypting multiple notes, the same AES key is never used to encrypt multiple notes. Because this is symmetric encryption server-side, we don't want to run the risk of deriving the AES key and having multiple encrypted files decrypted, because they were encrypted with the same key. So the random nonce generation at each index.html view is critical.

When the note is posted to the server, it is first compressed using the ZLIB compression library. This reduces data structure out of the plaintext, and increases the overall entropy of the encrypted note before it is finally saved to disk. It should be mentioned that the note is never stored to disk until it is fully encrypted.

plaintext-compression

We are now ready to ship the compressed plaintext off to AES for encryption processing. First, we need to generate a 12-byte random initial value for our counting. We need to do this, because we will be encrypting the note with AES-256 in CTR mode, and I would like to protect the end users from backups. CTR mode uses a counter, that typically starts with "1", for ensuring that the same plaintext block is not encrypted to the same ciphertext block during the encryption process. However, if the same AES key is shared between two plaintexts, then an XOR of the two ciphertexts can reveal the AES key. As such, for every encryption process, we use that random 12-byte initial value to practically guarantee that the starting point of the AES counter is always different, even if the same AES key is used between multiple plaintexts. The initial value is 12-bytes in size, rather than 16-bytes, to ensure that we still have plenty of counting space during the encryption process.

We also need our pseudorandom 32-byte AES key that we generated from our earlier PBKDF2 function. Using this 32-byte key, our 12-byte initial value, and our compressed plaintext, we encrypt the data. This encryption process will give us a preliminary ciphertext. It is preliminary, in that we have additional work to do, before we are ready to store it on disk.

aes-encryption

Because we used an initial value to start our AES-256 in CTR mode, we will require the initial value also for decryption. As such, we need to store the initial value with the preliminary ciphertext. As such, the initial value is prepended at the beginning of the ciphertext. Because the initial value is random data, and the ciphertext should appear as random data, prepending the initial value to the ciphertext should not reveal any data boundaries or leak any information about the stored contents. Prepending the initial value to the ciphertext gives us a file 12-bytes larger as an intermediate ciphertext.

iv-prepend

The final step in the encryption process is to ensure data integrity and to provide authentication by using HMAC-SHA512. The reason for this, is non-authenticated block cipher modes can suffer from a practical malleability attack on the ciphertext to reveal the plaintext. Thus, the intermediate ciphertext is HMAC-SHA512 hashed. This is known as "encrypt-then-MAC" (EtM), and is the preferred way for storing MAC tags.

HMAC allows you to choose a number of different cryptographic hashing algorithms. In our case, SHA512 is used, because we can. HMAC requires both a salt and a message. In our case, because the user is not providing a passphrase to the application, the salt is our 64-byte HMAC pseudorandom key that we generated with PBKDF2 earlier. The intermediate ciphertext and this 64-byte key are added to the HMAC-512 function, to generate a 64-byte SHA512 tag.

hmac-sha512

This 64-byte SHA512 tag is prepended to our intermediate ciphertext to produce our final ciphertext document, which is finally written to disk in binary form. Initially, I had base-64 encoded the ciphertext, as is standard practice, but this just increases the used disk space by an additional 30%, and adds additional code, with no real obvious benefit, other than not being able to view the ciphertext in a text editor, or on your terminal. So, the raw binary is stored instead.

hmac-prepend

The Encryption Process Visualized With A Custom Passphrase
In the process of encrypting the note with a user-supplied passphrase, rather than use our cryptographic nonce to generate the AES key and the HMAC key, the passphrase is used in place. In other words, we still use PBKDF2 to generate our AES key and our HMAC key, combined with the appropriate salt. Everything else about the encryption process is the same.

user-passphrase

Knowing whether or not a user supplied a passphrase to encrypt the note, an empty file with ".key" as an extension is created. The user-supplied password is not stored on disk. The empty file exists only to tell the application not to use PBKDF2 to generate the AES key, but to ask the client for it. However, if a duress key for Big Brother is supplied, then that key is indeed stored on disk. That key will not decrypt the note, but instead return random sentences from The Zen of Python, as well as destroy the original note.

To show the effectiveness of the 12-byte initial value discussed earlier, I'll encrypt the text "Hello Central, give me Doctor Jazz!" twice, both with the same passphrase "Oliver". If the AES counter started with the same value in both instances, then the ciphertext will be identical, as it should be. By starting with a random initial value for our counter, the ciphertexts should be different. Let's take a look.

Encrypting it twice, I ended up with two files "qZ-cloUfLnFYVYZImgYWmQ" and "hxlUK9GxbNSMgJZDUgEsMw". Let's look at their content:

$ ls data
hashcash.db  qZ-cloUfLnFYVYZImgYWmQ  qZ-cloUfLnFYVYZImgYWmQ.key hxlUK9GxbNSMgJZDUgEsMw  hxlUK9GxbNSMgJZDUgEsMw.key
$ cat data/qZ-cloUfLnFYVYZImgYWmQ | base64 -
CgHOJ4NTTMSNWOVbC1SduXg7jqij92o3ZHhOcCumu8DKmDKI3cyZ5Ne9dB6w7TrnxMaR8EyDf5w9
eMpMM+VLkztyfAieUWFYVFvFqJ0cY7+fgX1tzyaLs71rHywjydE9bobAfof0Bqo6j87sTJEgJTu+
BFXko1w=
$ cat data/hxlUK9GxbNSMgJZDUgEsMw | base64 -
D/fLA+UgghHQMGadJxmATifaL+JTXybRNROUDvSBzgTV6EWX7Dau9Z9zI1KpEuMbDeFNp4oZk4SN
hlVGm2k8vWwKJWhys78U6FFd1jGmKWDC61VKH7+zXEGhNf/fo/igEEEG+Jge1awQ9A0cJbsLSmoh
zgj6XH8=

This should not be surprising, knowing how CTR mode works with block ciphers:

ctr-encryption

As long as that IV is different, then the ciphertexts should be different, and one should not be able to extract the private key that encrypted the plaintext, even if the key is reused across multiple plaintexts. Let's see what the IV was for each, using Python:

>>> msg = """CgHOJ4NTTMSNWOVbC1SduXg7jqij92o3ZHhOcCumu8DKmDKI3cyZ5Ne9dB6w7TrnxMaR8EyDf5w9
... eMpMM+VLkztyfAieUWFYVFvFqJ0cY7+fgX1tzyaLs71rHywjydE9bobAfof0Bqo6j87sTJEgJTu+
... BFXko1w="""
>>> msg = msg.decode('base64')
>>> data = msg[64:] # remember, the first 64-bytes are the SHA512 tag
>>> iv = data[:12]
>>> long(iv.encode('hex'),16)
18398018855321261569467991464L
>>> msg = """D/fLA+UgghHQMGadJxmATifaL+JTXybRNROUDvSBzgTV6EWX7Dau9Z9zI1KpEuMbDeFNp4oZk4SN
... hlVGm2k8vWwKJWhys78U6FFd1jGmKWDC61VKH7+zXEGhNf/fo/igEEEG+Jge1awQ9A0cJbsLSmoh
... zgj6XH8="""
>>> msg = msg.decode('base64')
>>> data = msg[64:]
>>> iv = data[:12]
>>> long(iv.encode('hex'),16)
21167414188217590509613634382L

Decryption
Not much really needs to be said about decrypting the encrypted notes, other than because we are using authenticated encryption with HMAC-SHA512, when client supplies the URL to decrypt a note, a SHA512 tag is generated dynamically based on the encrypted file and then compared to the SHA512 tag actually stored in the encrypted note. If the tags match, the plaintext is returned. If the tags do not match, something went wrong, and the client is redirected to a standard 404 error. Other than that, the code for decrypting the notes should be self-explanatory.

Conclusion
I hope you enjoy the self-destructing encrypted notes web application! If there are any issues or concerns, please be sure to let me know.

{ 1 } Comments