So, due to the bad shape of the economy I was let go from my position as a Linux instructor and guru. As unfortunate as it is, I have to press forward looking for the best job that fits my skill set.
However, the point of this post isn't to complain about being laid off, or the bad shape of the economy. Rather, while employed, I was given a 120 GB LaCie Rugged hard drive that was called the "gurudisk" (being a "Linux Guru" from "Guru Labs". Get it?). The gurudisk had everything on it necessary for easing the installation of Linux on computers. Kickstart and AutoYaST files were used for automating the install of the instructor machine, while scripts and RPMs were used to automate the configuration and additional software installation of the instructor machine, and DHCP, DNS, TFTP and PXE, along with Kickstart and AutoYaST files, were used for automating student machines. Using the gurudisk, I could do a full classroom install, complete with instructor machine and 20 student machines, in under an hour. The gurudisk held RHEL 5, RHEL 5.1, Fedora 6, SLES 10, SUSE Linux 10.1 and Oracle 4.5 disk ISOs and software, as well as RPMs, scripts and config files. It was truly a welcome companion.
However, all of that can easily fit in 40 GB of space, so what to do with the remaining 80 GB? Well, most of us began using that space for personal data. Music, videos, scripts, documents and so forth. I'm not one to carry music or movies with me, so that didn't interest me much. Rather, I wanted the ability to take the gurudisk further by adding Ubuntu and Debian. So, I had an "isos" directory on my gurudisk, where I kept more updated ISOs, including RHEL 5.2, Fedora 9 and 10, Ubuntu 8.04 LTS, Debian 4.0, openSUSE 11, OpenSolaris, FreeBSD, OpenBSD, and others. At one point, I had an entire Ubuntu repository mirroring 8.04 and 8.10 on the gurudisk. Lastly, if that's not enough, I had VMware, KVM, Xen and VirtualBox virtual machines with clean, vanilla installations of a few of the major distributions. I took advantage of my space, and many students welcomed it as well.
When news came yesterday that I had lost my job, and that I would need to turn in my gurudisk, I wanted to first get my Ubuntu mirror, virtual machines and ISOs off the disk. Then, I wanted to experience, firsthand, "shredding" the data on the disk. Thus, we have now reached the topic of this post: GNU shred.
I had heard from students over and over again that zeroing out the drive using /dev/zero is not sufficient for secure data deletion. I wholeheartedly disagree, and I'm sure I'll bring out the emotion of many of you in the comments. Here's why I think /dev/zero is more than sufficient for secure data destruction:
- On older drive encoding schemes, mainly RLL and MFM, data was not written in exactly the same spot every time. As such, there was leftover magnetization from the previous write, and expensive data recovery hardware could use math and averages to discover what the data once was. As such, a method known as the "Gutmann Method" became the standard for destroying data. Patterns of ones and zeros would be written to the disk in such a way as to maximize the bit flips and minimize the average leftover magnetization. After seven passes, the residual magnetization would be so minute that it would be virtually impossible to recover the data. Do 35 passes, and the data is gone for sure.
- Drives today do not use RLL or MFM encoding, and the bits are much closer together than they were in days gone by. The data has to be written in exactly the same spot, or the write would likely destroy other data on the disk. As such, there is no residual magnetization left over from rewriting the data. A single pass over existing data removes any trace of that data.
- Supposedly, top secret, mega government supercomputers, administered by corporations with endless amounts of cash flow, can recover data on ATA drives, even after seven passes. However, there has been no academic study, no scientific evidence, no cold, hard proof. All we have is hearsay and rumors of people we know who claimed to recover the data using these killer machines or algorithms on ATA drives.
So, with that in mind, after backing up my data, a single zeroing of the entire drive would be more than sufficient for a couple reasons. First, my bosses and company don't have the resources, the time and money, or the care to recover any data off of my gurudisk. Second, the data I was deleting wasn't necessarily personal, as no passwords or private keys or information was stored on the disk. So, even if the data could be recovered, of what use would it be to anyone? Little, if any. Chances are good that the drive will sit on a shelf, unattended and unused, and when it does make it back into commission, it will just be formatted with ext3, files put on, and used as any other drive. So, /dev/zero it is.
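For reference, a single zeroing pass looks roughly like the sketch below. To keep it harmless, it runs against a scratch file rather than a disk; the file name scratch.img and the 8 MiB size are stand-ins of my own, not anything from the gurudisk.

```shell
#!/bin/sh
# Sketch only: a single pass of zeros, demonstrated on a scratch file.
# "scratch.img" and the 8 MiB size are illustrative stand-ins.
target=scratch.img

# Stand-in for the old contents: 8 MiB of pseudo-random bytes.
dd if=/dev/urandom of="$target" bs=1024k count=8 2>/dev/null

# The zeroing pass itself. conv=notrunc overwrites the file in place.
dd if=/dev/zero of="$target" bs=1024k count=8 conv=notrunc 2>/dev/null

# Check the result: strip every zero byte and count what is left.
size=$(wc -c < "$target" | tr -d ' ')
nonzero=$(tr -d '\000' < "$target" | wc -c | tr -d ' ')
echo "size: $size bytes, non-zero bytes left: $nonzero"

rm -f "$target"
```

Against a real drive, the of= argument would name the whole device (for example /dev/sdc) and count= would be dropped so that dd runs until the device is full. Triple-check the device name before pressing Enter, because dd will not ask twice.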
I'm a mathematician at heart. I love math and logic puzzles as well as cryptography and many an algorithm. If I had the time and money, I would finish college, and get a Doctorate in Mathematics. However, that's a dream that just isn't realistic at this point in my life, but I still enjoy pulling out my HP49G+, and crunching the numbers. So, the algorithm used in Gutmann's Method is interesting. More interesting are the pseudo-random number generators used in cryptographic applications. So, I decided to give GNU shred a try, seeing as it's part of coreutils, and see what the result is. I ran the following command:
shred -v -z /dev/sdc
This means that GNU shred will make 26 total passes, with the 26th pass being straight zeros to hide the fact that the disk has been shredded. Once finished, I'll add one final pass as an easter egg for the next person who gets the gurudisk. So, 27 total passes to the disk. What I'm mostly interested in is the time it will take to finish. From my understanding, it will write pseudo-random numbers to the disk on the first, middle and second-to-last passes, the very last pass being zeros because I passed '-z'. Writing random data to 120 GB of disk is going to take some time. In fact, I timed the first random pass, and it took 5 hours and 20 minutes, which means the three random passes alone will take 16 hours. But then there is the one and zero pattern writing that takes place in between. I would expect this to go substantially faster than writing random data, and it does: about three times as fast. Three pattern passes can be completed in 5 hours and 20 minutes, give or take, based on the pattern. There are 23 remaining passes at this rate, which is approximately 41 hours. Add the 16 on top of that, and it's going to take about 57 total hours to complete all 26 passes. That's almost 2 and a half days! In fact, as I'm writing this, it's 18:00 the next day, and I'm only on pass 11, writing the pattern "333333" in hexadecimal to the disk. The next pass will be my second random data pass. When I get out of bed tomorrow, I expect to be on pass 18, give or take.
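Here is a quick back-of-the-envelope check of that estimate in plain shell integer arithmetic. The inputs are the figures above (5 hours 20 minutes per random pass, pattern passes about three times as fast); treating the final zero pass as running at pattern speed is an assumption on my part. It comes out a shade under 57 hours, so take the total in the text as a round number.

```shell
#!/bin/sh
# Rough run-time estimate for "shred -v -z" on this disk, in minutes.
random_pass=320      # 5 h 20 min per pseudo-random pass, as timed
random_count=3       # first, middle and second-to-last pass
total_passes=26      # 25 shred passes plus the final -z zero pass

# Everything that is not a random pass: 22 pattern passes plus the zero pass.
# Assumption: the zero pass runs at the same speed as a pattern pass.
other_count=$((total_passes - random_count))

random_total=$((random_pass * random_count))       # 960 minutes
other_total=$((other_count * random_pass / 3))     # three times as fast
total=$((random_total + other_total))

printf 'random passes:  %d h %02d min\n' $((random_total / 60)) $((random_total % 60))
printf 'pattern passes: %d h %02d min\n' $((other_total / 60)) $((other_total % 60))
printf 'total:          %d h %02d min\n' $((total / 60)) $((total % 60))
```

With those inputs the last line prints a total of 56 h 53 min.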
I figure, even though I'm long past any possible data retrieval, it's fun to watch. Even more entertaining is the heat emanating off of the disk: it's fairly warm, which I guess makes sense, as the disk has been going non-stop for almost 24 hours. Would I recommend GNU shred for wiping your data? No. Again, /dev/zero will be more than sufficient, and fast too, at roughly 30 MB per second on a SATA or USB 2.0 disk. This disk, by the way, is connected via FireWire 400 (I'm not a fan of the USB speed burst). I'd love to see this run to completion, but I'll probably cancel it sometime tomorrow morning, install my easter egg, then be on my way to return the disk.
Long live hacking!