I hang out in ##crypto on Freenode, and every now and then, someone will ask about the security of multiple encryption, usually in the context that AES could be broken in the near future. When talking about multiple encryption, they are usually referring to cascade encryption, which has the form:
CT = Alg_B(Alg_A(M, key_A), key_B)
The discussion usually revolves around the choice of "Alg_A" and "Alg_B", such as using AES for "Alg_A" and Camellia for "Alg_B". It also covers whether "key_A" and "key_B" should be the same key or different keys.
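To make that concrete, here is a minimal sketch of a two-layer cascade in Python, using the pyca/cryptography library, assuming AES for "Alg_A" and Camellia for "Alg_B", each in CTR mode with its own key and nonce (and assuming your OpenSSL backend supports Camellia in CTR mode). It's an illustration only; there is no authentication here.

# Sketch of CT = Alg_B(Alg_A(M, key_A), key_B) with AES and Camellia in CTR mode.
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def cascade_encrypt(message, key_a, key_b):
    nonce_a, nonce_b = os.urandom(16), os.urandom(16)

    # Inner layer: Alg_A(M, key_A)
    enc_a = Cipher(algorithms.AES(key_a), modes.CTR(nonce_a)).encryptor()
    inner = enc_a.update(message) + enc_a.finalize()

    # Outer layer: Alg_B(inner, key_B)
    enc_b = Cipher(algorithms.Camellia(key_b), modes.CTR(nonce_b)).encryptor()
    outer = enc_b.update(inner) + enc_b.finalize()

    # The nonces are not secret, so ship them with the ciphertext.
    return nonce_a + nonce_b + outer

key_a, key_b = os.urandom(32), os.urandom(32)
ct = cascade_encrypt(b"attack at dawn", key_a, key_b)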
Cascade encryption is more efficient in storage space than some alternatives, such as this one suggested by Bruce Schneier, which roughly doubles the size of the ciphertext:
CT = Alg_A(OTP, key_A) || Alg_B(XOR(M, OTP), key_B), where OTP is a true one-time pad
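Here is a rough sketch of that construction under the same assumptions as above (AES for "Alg_A", Camellia for "Alg_B", CTR mode, pyca/cryptography). Note that the pad comes from os.urandom(), which is a CSPRNG, not a true one-time pad, and that the output is roughly twice the size of the plaintext.

# Sketch of CT = Alg_A(OTP, key_A) || Alg_B(XOR(M, OTP), key_B).
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def split_encrypt(message, key_a, key_b):
    pad = os.urandom(len(message))                        # stand-in for the OTP
    masked = bytes(m ^ p for m, p in zip(message, pad))   # XOR(M, OTP)

    nonce_a, nonce_b = os.urandom(16), os.urandom(16)
    enc_a = Cipher(algorithms.AES(key_a), modes.CTR(nonce_a)).encryptor()
    enc_b = Cipher(algorithms.Camellia(key_b), modes.CTR(nonce_b)).encryptor()

    half_a = enc_a.update(pad) + enc_a.finalize()          # Alg_A(OTP, key_A)
    half_b = enc_b.update(masked) + enc_b.finalize()       # Alg_B(XOR(M, OTP), key_B)

    # Both halves (plus nonces) are needed to recover M, and together they
    # are roughly double the size of the plaintext.
    return (nonce_a + half_a) + (nonce_b + half_b)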
I'm not going to go into the theoretical concerns with multiple encryption. However, I would like to cover some practical considerations:
- Multiple key security.
- Long-term storage.
- Complexity.
- Host security.
Multiple key security
It should come as no surprise that when dealing with multiple encryption, you are going to be dealing with multiple keys, if you choose to keep "key_A" and "key_B" separate. Probably the most difficult aspect of implementing encryption is keeping the secret key secret. For example, key exchanges between machines over the scary Internet have been notoriously difficult to get correct. Current best practice is authenticated ephemeral elliptic-curve Diffie-Hellman (ECDHE) for establishing secret symmetric keys between machines. With multiple encryption, you don't just need to establish one key, but several, every time you encrypt and decrypt data.
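For the wire case, here is a hedged sketch of what that key establishment looks like with ephemeral X25519 Diffie-Hellman and HKDF from the pyca/cryptography library. The authentication step (signing or otherwise verifying the public keys) is deliberately omitted, and the "handshake data" label is just a placeholder.

# Each side generates an ephemeral key pair, exchanges public keys, and
# derives the same symmetric key. Repeat per key if you need several.
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

alice_priv = X25519PrivateKey.generate()
bob_priv = X25519PrivateKey.generate()

alice_shared = alice_priv.exchange(bob_priv.public_key())
bob_shared = bob_priv.exchange(alice_priv.public_key())
assert alice_shared == bob_shared

# Never use the raw shared secret directly; run it through a KDF first.
sym_key = HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
               info=b"handshake data").derive(alice_shared)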
If the multiple-encrypted data is to be stored on disk, then the keys will need to be stored somewhere so they can be retrieved later. How are they stored? This isn't an easy question to answer. If you store them in a password manager, they are likely just getting single-encrypted, probably with AES. So the security of your ciphertext rests on the security of your stored keys, which are likely protected by the very algorithm you are worried about.
Now, you could use the same key for every encryption layer. But this poses a theoretical concern (which I promised I wouldn't cover- sorry). If the same key is used for every layer, and an attacker can recover it through cryptanalysis of one layer, then the attacker can decrypt all the remaining layers. You also don't want to use ciphers where the decryption process is exactly the same as the encryption process; otherwise, encrypting the ciphertext a second time with the same key would just decrypt the first layer! While extremely improbable, this last scenario could in principle even occur with different algorithms, such as AES and Camellia. So it seems, at least at a cursory glance, that using the same key for all encryption layers is probably not a wise idea. We're back to key management, which is the bane of cryptographers everywhere.
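Here is a small demonstration of that trap, assuming AES in CTR mode, where encryption is just an XOR against a keystream. If the second layer reuses the same key and the same nonce, it hands the plaintext right back.

# With the same key and nonce, the CTR keystream is identical, so the
# second "encryption" XORs it away and recovers the message.
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

key, nonce = os.urandom(32), os.urandom(16)
message = b"the second layer undoes the first"

def ctr_encrypt(data):
    enc = Cipher(algorithms.AES(key), modes.CTR(nonce)).encryptor()
    return enc.update(data) + enc.finalize()

layer_one = ctr_encrypt(message)      # keystream XOR M
layer_two = ctr_encrypt(layer_one)    # keystream XOR (keystream XOR M) = M
assert layer_two == message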
Long-term storage
In my opinion, a larger problem is that of storage. It's one thing to get multiple encryption correct and safe on the wire; it's another to rely on long-term data storage. Think about it for a second- what is the longest you have kept data on the same drive? In personal scenarios, I have some friends that have kept personal backups for up to five years. To me, this is impressive. It's likely more common that data switches drives every couple of years. RAID arrays die, hardware is replaced, higher drive capacity is demanded, or bit rot creeps in and destroys data (such as on magnetic or optical media). When push comes to shove, the encrypted data is just going to move from drive to drive. But ask yourself this next question- what is the oldest data you have in your possession right now?
Let's be realistic here for a second. I would be hard-pressed to find data I stored back in 2000, 15 years ago. I could find some photos in photo albums, on mugs, and on Christmas cards, but I'm not 100% confident I could get the digital originals. Despite my best efforts, accidents happen, mistakes are made, and data is just lost. I don't think I'm alone here. I've even worked for companies, with large budgets, that had a hard time recovering data that was 10+ years old. For one, it's expensive to hold on to data indefinitely, and a great deal of data also becomes less and less valuable as time progresses. Yes, I still use the same email account from 2004- Google has done a great job of keeping all of my emails these past 11 years, and I would expect other data service providers to do the same. But how many of you have kept an email address for 10+ years? Or even the data, for that matter? (This blog is actually 11 years old as well- kudos to me on keeping the data going this long.)
My point is, hardware fails and is changed. Your personal value on data also changes, and accidents happen. So you're concerned about AES being broken in 20 years, or even sooner. Do you think by that time you'll still place value on that encrypted data? Do you think you'll even still have access to it, or can find it? And, if so, will it really be that difficult to decrypt the AES data, and encrypt it with the current best practice encryption algorithm?
Complexity
This is probably the problem you should be concerned with the most. As a collective group, we as developers have a hard time getting single encryption correct, let alone multiple encryption. This quickly enters the theoretical realm, which I promised I wouldn't blog about. But you have practical concerns as well: order of operations and correct implementations.
First, order of operations. It's one thing to do "double encryption", where only two algorithms are chosen and used. If you can't recall whether you used AES first or second, it's a 50/50 shot at getting the order correct (provided you know which key belongs to which algorithm; otherwise it's a one-in-four chance). Imagine, however, using three encryption layers and lining the keys up correctly. Imagine the complexity of four layers, or more. Ugh. It seems like you certainly don't want to go higher than two layers.
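If you want the bookkeeping spelled out, here is a back-of-the-envelope sketch: with unlabeled keys, you are guessing both the order of the algorithms and which key goes with which layer, which grows as n! times n!.

# Counting the (algorithm order, key assignment) combinations to try.
from itertools import permutations
from math import factorial

ciphers = ["AES", "Camellia"]
keys = ["key_A", "key_B"]

combos = [(order, assignment)
          for order in permutations(ciphers)
          for assignment in permutations(keys)]
print(len(combos))                      # 4 for two layers
print(factorial(3) * factorial(3))      # 36 for three layers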
Second, look at implementations. AES is AES; it shouldn't matter which implementation does the calculations. But implementations like to put "magic bytes" at the beginning of their ciphertexts (OpenSSL, OpenPGP, etc.). This framing is only meaningful to that implementation, and even worse, sometimes only to a specific range of versions. Just imagine encrypting a file with OpenSSL version 1.0 now, and needing to decrypt it in 10 years. Will OpenSSL version X be able to read those magic bytes and correctly decrypt the file? Or will it error out, unable to decrypt the data, because the data structure of the magic bytes changed in that 10-year time frame?
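As an example of that framing, files produced by "openssl enc" with a password start with the eight magic bytes "Salted__" followed by an 8-byte salt; the file name below is hypothetical.

# Quick check for the OpenSSL "enc" salted header.
def openssl_enc_salt(path):
    with open(path, "rb") as f:
        header = f.read(16)
    if not header.startswith(b"Salted__"):
        return None                      # not OpenSSL's salted format
    return header[8:16]                  # the salt used for key derivation

# salt = openssl_enc_salt("secrets.bin")  # hypothetical file name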
So it seems best to encrypt with some programming language library, where you control exactly what data is stored. But, as everyone will tell you while frothing at the mouth, "don't roll your own crypto". Technically, you aren't if you "import aes" and use the "aes" module provided by that language correctly. It just remains to be seen whether you implemented it correctly enough to thwart an attacker. Crypto is hard and full of sharp edges; it's very difficult to get things right without getting cut. Regardless, while the "aes" module might still be available in 10 years, what about the "camellia" module, or whatever algorithm you chose for the second layer? Is it still in development, or was it abandoned because it was broken or unmaintained? Can you still find that module, so you can decrypt your data?
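For what it's worth, here is a hedged example of the "import aes and use it correctly" approach: authenticated encryption with AES-GCM from the pyca/cryptography library, where the only framing on disk is the nonce you chose to prepend yourself.

# Single-layer authenticated encryption with AES-GCM; the nonce is
# prepended to the ciphertext and must never be reused with this key.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)

def encrypt(message, associated_data=b""):
    nonce = os.urandom(12)
    return nonce + AESGCM(key).encrypt(nonce, message, associated_data)

def decrypt(blob, associated_data=b""):
    nonce, ciphertext = blob[:12], blob[12:]
    return AESGCM(key).decrypt(nonce, ciphertext, associated_data)

assert decrypt(encrypt(b"hello")) == b"hello"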
Host Security
In a more practical, everyday scenario, how secure is the host doing the multiple encryption? Do others have physical access to the machine? Is it free of viruses, malware, and other badware? Does the system run an encrypted filesystem? Where and how are backups stored? Who has access to those backups? Many more questions like these determine the security of the host storing or processing the data.
Viruses and malware would probably be my number one concern if the data were so valuable as to be multiple-encrypted. So I would probably encrypt the plaintext on one machine, encrypt the resulting ciphertext on a second machine, and store it on a third machine, preferably air-gapped. Thus, if a virus exists on one machine, hopefully it doesn't exist on another, hopefully it doesn't attach itself to my encrypted data, and hopefully the badware didn't report my plaintext to a botnet before encryption.
Physical host security is hard. People have crappy passwords protecting their workstations. Physical access can get the attacker root regardless. Systems are infected with badware all the time, just by visiting websites! So there is hardly a guarantee that your data is safe, even though it was encrypted multiple times with different keys and algorithms.
A Couple Thoughts
It hardly seems worth the effort to encrypt your data multiple times with different algorithms and different keys, given the overhead necessary to manage everything (hardware and software). Further, in reality, modern encryption algorithms aren't usually broken outright. For example, DES as an algorithm isn't broken- it just has a small key space (56-bit keys), which makes brute force practical. So encrypting your data multiple times is solving a problem that, for the most part, just doesn't exist.
That's not to say that AES will remain secure in 10, 20, or 40 years. I'm not that naive. But, as a user, you have the ability to switch algorithms when AES does break. So, decrypt your AES ciphertext, and encrypt it with SevenFish (sorry Bruce- bad joke). Keep it encrypted with SevenFish until that breaks, then decrypt it and encrypt it with whatever the modern cipher is at the time (assuming you still have the data, it's still valuable to you, and implementations can still work with the old ciphertext).
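That migration is not complicated. Here is a sketch of it, assuming the old data was stored as nonce-prefixed AES-GCM (as in the earlier example) and picking ChaCha20-Poly1305 purely as a stand-in for "whatever the modern cipher is at the time".

# Decrypt the old AES-GCM blob and re-encrypt it with a different AEAD.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM, ChaCha20Poly1305

def reencrypt(blob, old_key, new_key):
    nonce, ciphertext = blob[:12], blob[12:]
    plaintext = AESGCM(old_key).decrypt(nonce, ciphertext, None)

    new_nonce = os.urandom(12)   # new_key must be 32 bytes for ChaCha20-Poly1305
    return new_nonce + ChaCha20Poly1305(new_key).encrypt(new_nonce, plaintext, None)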
Conclusion
In my opinion, don't worry about multiple encryption. Generate a GnuPG key pair, encrypt your data once, and be done with it.