Scrubbing Hard Disk Data

I recently had the opportunity to wipe 13 SCSI drives. The drives are small, 18 and 36 gigabytes each, and they contain sensitive data. They will be sent off to a third party for physical destruction, but first we need to make sure that the data on them is completely and securely overwritten. This means using a utility that can overwrite the disk bit for bit at the device level. Fortunately, there are many utilities that make this possible.

The most popular of these is DBAN, or Darik's Boot and Nuke. It comes as a CD or USB image that you boot from instead of the hard disk, then pick a wiping method from a menu. The choices are:

  • Quick Erase- One pass, writing nothing but zeroes.
  • RCMP TSSIT OPS-II- Eight passes using random writes and complements on each pass.
  • DoD Short- Three-pass version of the stronger seven-pass method below. Each pass writes random data.
  • DoD 5220.22-M- Seven passes using random data at each pass.
  • Gutmann Wipe- 35 passes across the hard drive as described by security expert Peter Gutmann and Colin Plumb.
  • PRNG- Arbitrary number of passes specified by the user using a pseudo random number generator for writing random data on each pass.

For most secure scrubbing purposes, a quick erase is more than good enough. To date, no published paper has demonstrated recovering overwritten data after a single pass. Does that mean it's impossible? No, of course not. For what it's worth, all the drives that leave my possession only get a single pass. However, if you or your organization is more paranoid about getting the data off the platters, there are other options available that will do more passes on the drive.
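As a rough illustration of what a single zeroing pass amounts to, here is a minimal Python sketch. It operates on a regular file as a stand-in for a disk; the function name `zero_fill` and the chunk size are my own choices for illustration and are not part of DBAN.

```python
import os

def zero_fill(path, chunk_size=1024 * 1024):
    """Overwrite every byte of an existing regular file with zeroes, in place.

    Assumption: `path` is a regular file. A block device would need its
    size looked up differently, since os.path.getsize() does not report it.
    """
    size = os.path.getsize(path)
    zeros = bytes(chunk_size)
    with open(path, "r+b") as f:
        remaining = size
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(zeros[:n])
            remaining -= n
        # Ask the OS to push the data out of its caches to the device.
        f.flush()
        os.fsync(f.fileno())
    return size
```

Calling `zero_fill("scratch.img")` on a throwaway file returns the number of bytes overwritten; afterwards the file has the same length but contains only zero bytes.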

The next option in the DBAN menu is the RCMP TSSIT OPS-II wipe. This wipe uses a pseudo-random number generator as the source for the first pass, then writes the complement of that data as the second pass. The idea behind this method is to switch each bit on the disk platter between one and zero as often as possible. By using a random source for the initial pass, then writing the complement, we've successfully written two passes on disk. At this point, it should be "good enough" for even the most seasoned data recovery company. However, this wipe does that dance three more times, for a total of eight passes.
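The random-then-complement idea described above can be sketched in a few lines of Python. This only illustrates the bit flipping, not DBAN's actual implementation; `os.urandom` stands in for whatever generator DBAN uses, and the function names are hypothetical.

```python
import os

def complement(data):
    """Return data with every bit flipped (each byte XORed with 0xFF)."""
    return bytes(b ^ 0xFF for b in data)

def random_then_complement(size):
    """Return the two buffers for one random/complement pair of passes."""
    first = os.urandom(size)      # stand-in for the wipe's random source
    second = complement(first)    # same positions, opposite bit values
    return first, second

first, second = random_then_complement(16)
# Every bit position holds opposite values in the two passes.
assert all(a ^ b == 0xFF for a, b in zip(first, second))
```

Repeating that pair four times gives the eight passes mentioned above.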

The United States Department of Defense has established a standard for sanitizing disks that contain TOP SECRET data, and DBAN offers two methods based on it. The first is the "DoD Short" wipe, a three-pass wipe with nothing fancy about it: each pass overwrites the disk with data from a pseudo-random number generator. The "DoD 5220.22-M" wipe is the more secure DoD sanitization method. It makes seven passes across the disk instead of three, again using a pseudo-random number generator as the source of the data for each pass.

The next method is for the ultra-paranoid company or individual. This wipe is known as the "Gutmann Wipe", and it's built to target different hard disk encoding mechanisms. There are two main encoding schemes for storing the data on your disk: MFM and RLL. All modern drives today use the RLL encoding scheme. RLL maps data bits to magnetic flux transitions more efficiently than MFM, making it possible to fit more data on the disk platters. Because MFM and RLL store data differently on the drive, an overwrite pattern optimized for MFM-encoded drives won't work as well on RLL drives, and vice versa.

The method for deriving the data written to the disk is rather simple: generate the unique one-bit numbers (zero and one), then the two-bit numbers, then the three-bit numbers, and finally the four-bit numbers. Once this list of numbers has been generated, begin writing. The list, in hexadecimal, is:

  1. 1-bit: 0x000, 0xFFF
  2. 2-bit: 0x555, 0xAAA
  3. 3-bit: 0x249, 0x492, 0x924, 0x6DB, 0xB6D, 0xDB6
  4. 4-bit: 0x111, 0x222, 0x333, 0x444, 0x666, 0x777, 0x888, 0x999, 0xBBB, 0xCCC, 0xDDD, 0xEEE
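A list like this can be derived mechanically: take every n-bit value for n from 1 to 4, repeat it until it fills 12 bits, and skip anything a shorter unit already produced. The Python below is a minimal sketch of that idea; the function name and its parameters are my own, and the values within each group come out in ascending order.

```python
def repeating_patterns(width=12, max_bits=4):
    """Group the unique `width`-bit patterns by the unit length that first yields them."""
    seen = set()
    groups = {}
    for n in range(1, max_bits + 1):
        if width % n:
            continue  # the unit must divide the width evenly
        groups[n] = []
        for value in range(2 ** n):
            unit = format(value, "0{}b".format(n))    # e.g. 9 -> "1001"
            pattern = int(unit * (width // n), 2)     # "1001" * 3 -> 0x999
            if pattern not in seen:                   # skip e.g. "00" == "0" repeated
                seen.add(pattern)
                groups[n].append(pattern)
    return groups

for n, patterns in repeating_patterns().items():
    print("{}-bit:".format(n), ", ".join("0x{:03X}".format(p) for p in patterns))
```

In total this yields 22 distinct patterns: 2 one-bit, 2 two-bit, 6 three-bit and 12 four-bit.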

If you want to convert this list to binary, then think about it in terms of the "number of bits". For example, with one bit, you only have two options: a zero or a one. With two bits, you have four possible combinations: all zeroes, all ones, zero then one, or one then zero. Because we've already defined "all zeroes" and "all ones" in the one-bit numbers, we don't need to repeat them in the 2-bit, 3-bit or 4-bit representations. Now, why is every number written out as three hex digits, or twelve bits? Well, the least common multiple of three and four is twelve, and one and two divide twelve as well. The idea is that I'm writing patterns, not necessarily static data. So, the pattern needs to repeat a whole number of times through the 12-bit number. For example, take the 4-bit number

0x999

What is this in a 12-bit binary representation? Isn't it:

100110011001

or if you were to separate it out:

1001 1001 1001

Do you see the pattern of two ones followed by two zeroes, followed by two ones followed by two zeroes, etc? That's the idea. Writing patterns to the disk.
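To check this kind of conversion for any entry in the list, a small Python sketch helps. The two helper names here are hypothetical; one prints a value as grouped binary, the other finds the shortest unit that repeats to fill the 12 bits.

```python
def grouped_bits(value, width=12, group=4):
    """Format value as a width-bit binary string, split into groups."""
    bits = format(value, "0{}b".format(width))
    return " ".join(bits[i:i + group] for i in range(0, width, group))

def smallest_period(value, width=12):
    """Length of the shortest bit unit that repeats to fill width bits."""
    bits = format(value, "0{}b".format(width))
    for n in range(1, width + 1):
        # n == width always matches, so the loop always returns.
        if width % n == 0 and bits[:n] * (width // n) == bits:
            return n

print(grouped_bits(0x999))      # 1001 1001 1001
print(smallest_period(0x999))   # 4
print(smallest_period(0x249))   # 3
```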

So, how do we put all these numbers together, so we can sanitize the data securely for both RLL and MFM drives? Wikipedia has a good article on it, and explains that the first four and last four passes write random data from a secure random number generator. Passes five through 31 then write the 1-bit through 4-bit patterns we came up with, some of them used two or three times, depending on the drive encoding scheme each pass is targeting.
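Here is a minimal Python sketch of that 35-pass layout. The order of passes 5 through 31 is written out as I understand it from Gutmann's published table, so treat the exact ordering as an assumption and check it against the original before relying on it. The names `PATTERN_PASSES`, `gutmann_schedule` and `fill` are my own, and `None` is just a marker meaning "fill with fresh random data".

```python
import os

# Passes 5-31 as repeating 3-byte sequences (24 bits = two 12-bit patterns).
PATTERN_PASSES = (
    [b"\x55\x55\x55", b"\xAA\xAA\xAA",
     b"\x92\x49\x24", b"\x49\x24\x92", b"\x24\x92\x49"]
    + [bytes([n * 0x11]) * 3 for n in range(16)]   # 0x00, 0x11, ... 0xFF
    + [b"\x92\x49\x24", b"\x49\x24\x92", b"\x24\x92\x49",
       b"\x6D\xB6\xDB", b"\xB6\xDB\x6D", b"\xDB\x6D\xB6"]
)

def gutmann_schedule():
    """Return 35 entries: 4 random, 27 patterned, 4 random."""
    return [None] * 4 + PATTERN_PASSES + [None] * 4

def fill(pattern, size):
    """Build `size` bytes for one pass from a pattern (None means random)."""
    if pattern is None:
        return os.urandom(size)
    reps = size // len(pattern) + 1
    return (pattern * reps)[:size]

schedule = gutmann_schedule()
print(len(schedule), "passes,", len(set(PATTERN_PASSES)), "distinct patterns")
```

The 27 patterned passes contain 22 distinct patterns, with the 0x55, 0xAA and 0x92 0x49 0x24 family appearing twice.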

Lastly, if this isn't enough, the PRNG option lets you specify the number of passes yourself. It uses the same pseudo-random number generator as the other methods, and each pass writes random data to the disk.
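A user-chosen number of random passes could look roughly like the Python below. This is a sketch under stated assumptions: `target` is any seekable binary file object of known size, `os.urandom` stands in for DBAN's generator, and the function name is hypothetical.

```python
import io
import os

def random_passes(target, size, passes, chunk_size=65536):
    """Overwrite the first `size` bytes of `target` with random data, `passes` times."""
    for _ in range(passes):
        target.seek(0)
        remaining = size
        while remaining > 0:
            n = min(chunk_size, remaining)
            target.write(os.urandom(n))   # fresh random data for every chunk
            remaining -= n
        target.flush()
    return passes

# Demonstration on an in-memory buffer rather than a real disk.
original = b"secret" * 1000
buf = io.BytesIO(original)
random_passes(buf, len(original), passes=3)
print(len(buf.getvalue()), buf.getvalue() != original)
```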

This is a great utility for sanitizing disks. However, I've found DBAN to be spotty on certain hardware configurations. For one, it's x86-based only, which means you won't be able to boot it on SPARC or PA-RISC hardware. Also, even on some x86-based hardware, I've found DBAN to lock up hard, never getting to the menu for me to begin wiping. So, what can I do? Am I up a creek without a paddle? Most definitely not!

KNOPPIX is a solid LiveCD that loads the Linux kernel and the Debian user-space utilities, giving you a live desktop, complete with all the tools you would need for rescuing and wiping machines. KNOPPIX has been soundly tested against a vast array of hardware, and it sees very active development with a vibrant community behind it. How can KNOPPIX securely delete the data off your drives? Well, GNU Shred from the GNU Coreutils package is a flexible tool that lets you choose the number of passes to make against a drive. Because you've booted into a live Linux environment, you also have /dev/zero, /dev/random and /dev/urandom as sources of endless data for sending to your drives. In my specific situation of wiping the 13 SCSI drives, I booted into a KNOPPIX CD, executed 'shred' and told it to do three passes, then one last pass of zeroes, hiding any evidence of data sanitization. Many other GNU/Linux distributions provide live environments (CD or USB) that you could take advantage of. Ubuntu, openSUSE, Debian and Fedora are just a few worth mentioning.
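For reference, a 'shred' invocation along those lines can be assembled as below. The Python sketch only builds and prints the command; it does not run it. The device path `/dev/sdX` is a placeholder, not one of the drives I wiped, and the helper name is hypothetical. The options used (`--verbose`, `--iterations`, `--zero`) are documented GNU shred options.

```python
def shred_command(device, random_passes=3, final_zero_pass=True):
    """Build the argument list for a GNU shred run against `device`."""
    cmd = ["shred", "--verbose", "--iterations={}".format(random_passes)]
    if final_zero_pass:
        cmd.append("--zero")   # add a final pass of zeroes to hide the shredding
    cmd.append(device)
    return cmd

# Placeholder device name; double-check the real one before running anything.
print(" ".join(shred_command("/dev/sdX")))
# shred --verbose --iterations=3 --zero /dev/sdX
```

To actually run it, the list could be passed to `subprocess.run` as root from the live environment, after confirming the device name with something like `lsblk`.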

Of course, if you're running an encrypted filesystem worth its salt, then there really is no practical reason for scrubbing the data off your drives, since the encrypted representation of your data doesn't mean squat without the key to that data.
