Erasing programs such as Eraser recommend overwriting data as many as 36 times.

As I understand it, all data is stored on a hard drive as 1s and 0s.

If an overwrite of random 1s and 0s is carried out once over the whole file, then why isn't that enough to remove all traces of the original file?

+1  A: 

There are "disk repair" type applications and services that can still read data off a hard drive even after it's been formatted, so simply overwriting with random 1s and 0s one time isn't sufficient if you really need to securely erase something.

I would say that for the average user, this is more than sufficient, but if you are in a high-security environment (government, military, etc.) then you need a much higher level of "delete" that can pretty effectively guarantee that no data will be recoverable from the drive.

Scott Dorman
Formatting is different from wiping the drive. Formatting just overwrites the file allocation table (sometimes only the first copy and not the redundant copies). This is why most people (and undelete/unformat tools) can recover files.
Matthew Whited
That's something I didn't know, thanks
kjack
+13  A: 

A hard drive bit which used to be a 0 and is then changed to a 1 has a slightly weaker magnetic field than one which used to be a 1 and was then written as a 1 again. With sensitive equipment the previous contents of each bit can be discerned with a reasonable degree of accuracy by measuring these slight variances in strength. The result won't be exactly correct and there will be errors, but a good portion of the previous contents can be retrieved.

By the time you've scribbled over the bits 35 times, it is effectively impossible to discern what used to be there.

Edit: A modern analysis shows that a single overwritten bit can be recovered with only 56% accuracy. Recovering an entire byte succeeds only about 0.97% of the time, since all eight bits have to be right (0.56^8 ≈ 0.0097). So I was just repeating an urban legend. Overwriting multiple times might have been necessary when working with floppy disks or some other media, but hard disks do not need it.

DGentry
Thanks, that's explained something that didn't seem to make sense to me before
kjack
I'm actually not convinced this is correct information. There are no recorded instances of any recovery from an overwrite that yielded more than 1% of valid data.
Bruce the Hoon
Governments with lots of money don't tend to publish their results.
Adam Davis
Let me leave a better comment. _Researchers_ have shown that it's possible to recover significant information from a magnetic disc that has been overwritten more than once. That doesn't mean it's easy, or that it happens regularly, but that _it's possible_, so it's worthwhile to take extra care.
Adam Davis
+3  A: 

In conventional terms, when a one is written to disk the media records a one, and when a zero is written the media records a zero. However the actual effect is closer to obtaining a 0.95 when a zero is overwritten with a one, and a 1.05 when a one is overwritten with a one. Normal disk circuitry is set up so that both these values are read as ones, but using specialised circuitry it is possible to work out what previous "layers" contained.

The recovery of at least one or two layers of overwritten data isn't too hard to perform by reading the signal from the analog head electronics with a high-quality digital sampling oscilloscope, downloading the sampled waveform to a PC, and analysing it in software to recover the previously recorded signal. What the software does is generate an "ideal" read signal and subtract it from what was actually read, leaving as the difference the remnant of the previous signal.

Since the analog circuitry in a commercial hard drive is nowhere near the quality of the circuitry in the oscilloscope used to sample the signal, the ability exists to recover a lot of extra information which isn't exploited by the hard drive electronics (although with newer channel coding techniques such as PRML (explained further on) which require extensive amounts of signal processing, the use of simple tools such as an oscilloscope to directly recover the data is no longer possible).

http://www.cs.auckland.ac.nz/~pgut001/pubs/secure_del.html

dotmad
+2  A: 

What we're looking at here is called "data remanence." In fact, most of the technologies that overwrite repeatedly are (harmlessly) doing more than what's actually necessary. There have been attempts to recover data from disks that have had data overwritten, and, with the exception of a few lab cases, there are really no examples of such a technique being successful.

When we talk about recovery methods, primarily you will see magnetic force microscopy cited as the silver bullet to get around a casual overwrite, but even this has no recorded successes and can be quashed in any case by writing a good pattern of binary data across the region of your magnetic media (as opposed to simple 0000000000s).

Lastly, the 36 (actually 35) overwrites that you are referring to are recognized as dated and unnecessary today, as the technique (known as the Gutmann method) was designed to accommodate the various - and usually unknown to the user - encoding methods used in technologies like RLL and MFM, which you're not likely to run into anyhow. Even the US government guidelines state that one overwrite is sufficient to delete data, though for administrative purposes they do not consider this acceptable for "sanitization". The suggested reason for this disparity is that "bad" sectors can be marked bad by the disk hardware and not properly overwritten when the time comes to do the overwrite, therefore leaving the possibility open that visual inspection of the disk will be able to recover these regions.

In the end, writing a 1010101010101010 or a fairly random pattern is enough to erase data to the point that known techniques cannot recover it.
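
To make the single-pass case concrete, here is a minimal C# sketch of a one-pass random overwrite, in the same spirit as (but much simpler than) the 35-pass Gutmann implementation further down this page. The method name, chunk size, and lack of error handling are my own choices for illustration, and it does not address the bad-sector or SSD caveats raised in other answers.

// Minimal single-pass overwrite sketch.
// Requires: using System; using System.IO; and using System.Security.Cryptography;
private static void DeleteSinglePass(string fileName)
{
    FileInfo fi = new FileInfo(fileName);
    if (!fi.Exists)
        return;

    byte[] buffer = new byte[64 * 1024]; // arbitrary 64 KB chunk size
    RandomNumberGenerator rng = new RNGCryptoServiceProvider();

    using (FileStream s = new FileStream(
        fi.FullName,
        FileMode.Open,
        FileAccess.Write,
        FileShare.None,
        buffer.Length,
        FileOptions.WriteThrough))
    {
        long remaining = fi.Length;
        while (remaining > 0)
        {
            rng.GetBytes(buffer); // fresh cryptographically random data per chunk
            int count = (int)Math.Min(buffer.Length, remaining);
            s.Write(buffer, 0, count);
            remaining -= count;
        }
        s.Flush();
    }

    fi.Delete(); // remove the now-overwritten file from the file system
}

Whether one pass is really enough is exactly what the rest of this page debates; the point of the sketch is only that the single-pass version is trivial compared to the full Gutmann sequence.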

Bruce the Hoon
+2  A: 

"Data Remanence" There's a pretty good set of references regarding possible attacks and their actual feasibility on Wikipedia. There are DoD and NIST standards and recommendations cited there too. Bottom line, it's possible but becoming ever-harder to recover overwritten data from magnetic media. Nonetheless, some (US-government) standards still require at least multiple overwrites. Meanwhile, device internals continue to become more complex, and, even after overwriting, a drive or solid-state device may have copies in unexpected (think about bad block handling or flash wear leveling (see Peter Gutmann). So the truly worried still destroy drives.

Liudvikas Bukys
+1  A: 

The United States' requirements for erasing sensitive information (i.e. Top Secret info) are to destroy the drive. Basically, the drives were put into a machine with a huge magnet and would also be physically destroyed for disposal. This is because there is a possibility of reading information from a drive even after it has been overwritten many times.

Redbeard 0x0A
+2  A: 

I've always wondered why the possibility that the file was previously stored in a different physical location on the disk isn't considered.

For example, if a defrag has just occurred there could easily be a copy of the file that's easily recoverable somewhere else on the disk.

Sam Hasler
The biggest reason this is not considered is that these tools (when properly used) overwrite entire disks. Still, with SSDs this is a real issue, as they remap erase blocks without telling the host, for wear leveling.
MSalters
+3  A: 

Imagine a sector of data on the physical disk. Within this sector is a magnetic pattern (a strip) which encodes the bits of data stored in the sector. This pattern is written by a write head which is more or less stationary while the disk rotates beneath it. Now, in order for your hard drive to function properly as a data storage device, each time a new magnetic pattern strip is written to a sector it has to reset the magnetic pattern in that sector enough to be readable later. However, it doesn't have to completely erase all evidence of the previous magnetic pattern; it just has to be good enough (and with the amount of error correction used today, good enough doesn't have to be all that good). Consider that the write head will not always take the same track as the previous pass over a given sector (it could be skewed a little to the left or the right, or it could pass over the sector at a slight angle one way or the other due to vibration, etc.).

What you get is a series of layers of magnetic patterns, with the strongest pattern corresponding to the last data write. With the right instrumentation it may be possible to read this layering of patterns with enough detail to be able to determine some of the data in older layers.

It helps that the data is digital, because once you have extracted the data for a given layer you can determine exactly the magnetic pattern that would have been used to write it to disk and subtract that from the readings (and then do so on the next layer, and the next).
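
As a toy illustration of that subtraction idea (the 5% residue, the signal levels, and the noise figure below are invented numbers, loosely echoing the 0.95/1.05 example quoted from Gutmann's paper above), the following sketch writes two "layers" of bits onto a simulated track, subtracts the ideal signal of the newest layer, and thresholds what is left to guess the older layer:

// Toy model of "subtract the ideal read signal to expose the previous layer".
// All signal levels here are made up for illustration; real drives are far noisier.
using System;

class LayerPeelDemo
{
    static void Main()
    {
        Random rng = new Random(1);
        int n = 16;
        int[] oldBits = new int[n];
        int[] newBits = new int[n];
        double[] track = new double[n];

        for (int i = 0; i < n; i++)
        {
            oldBits[i] = rng.Next(2);
            newBits[i] = rng.Next(2);

            // Each write mostly replaces the magnetisation, but a small
            // fraction of the previous layer is assumed to remain (~5%).
            double ideal = newBits[i] == 1 ? 1.0 : -1.0;
            double residue = 0.05 * (oldBits[i] == 1 ? 1.0 : -1.0);
            double noise = (rng.NextDouble() - 0.5) * 0.02;
            track[i] = ideal + residue + noise;
        }

        // Normal electronics just threshold at zero and see the new bits.
        // The "attacker" subtracts the ideal signal for the decoded new bits
        // and thresholds the leftover to guess the older layer.
        int correct = 0;
        for (int i = 0; i < n; i++)
        {
            double ideal = newBits[i] == 1 ? 1.0 : -1.0;
            double leftover = track[i] - ideal;
            int guess = leftover > 0 ? 1 : 0;
            if (guess == oldBits[i]) correct++;
        }

        Console.WriteLine("Recovered {0}/{1} bits of the previous layer.", correct, n);
    }
}

With the generous residue assumed here the older layer comes back perfectly; the modern measurements cited in other answers suggest that on real drives the residue is so small that per-bit recovery is barely better than guessing.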

Wedge
+1  A: 

Here's a Gutmann erasing implementation I put together. It uses a cryptographic random number generator to produce strong blocks of random data for the random passes.

// Requires: using System.IO; and using System.Security.Cryptography;
private static void DeleteGutmann(string fileName)
{
    FileInfo fi = new FileInfo(fileName);

    if (fi.Exists)
    {
        // 35 passes, with one pattern buffer per pass, each the size of the file
        // (note that this allocates 35 x file size bytes of memory)
        const int gutmann_PASSES = 35;
        byte[][] gutmanns = new byte[gutmann_PASSES][];

        for (int i = 0; i < gutmanns.Length; i++)
        {
            gutmanns[i] = new byte[fi.Length];
        }

        // Cryptographically strong RNG for the random passes
        RandomNumberGenerator rnd = new RNGCryptoServiceProvider();

        // Passes 1-4 and 32-35 of the Gutmann sequence are random data
        for (long i = 0; i < 4; i++)
        {
            rnd.GetBytes(gutmanns[i]);
            rnd.GetBytes(gutmanns[31 + i]);
        }

        // Fill the fixed-pattern passes (5-31); pass 10 (index 9) is deliberately
        // left all zeroes, and the three-byte patterns rotate across consecutive bytes
        for (long i = 0; i < fi.Length;)
        {
            gutmanns[4][i] = 0x55;
            gutmanns[5][i] = 0xAA;
            gutmanns[6][i] = 0x92;
            gutmanns[7][i] = 0x49;
            gutmanns[8][i] = 0x24;
            gutmanns[10][i] = 0x11;
            gutmanns[11][i] = 0x22;
            gutmanns[12][i] = 0x33;
            gutmanns[13][i] = 0x44;
            gutmanns[15][i] = 0x66;
            gutmanns[16][i] = 0x77;
            gutmanns[17][i] = 0x88;
            gutmanns[18][i] = 0x99;
            gutmanns[20][i] = 0xBB;
            gutmanns[21][i] = 0xCC;
            gutmanns[22][i] = 0xDD;
            gutmanns[23][i] = 0xEE;
            gutmanns[24][i] = 0xFF;
            gutmanns[28][i] = 0x6D;
            gutmanns[29][i] = 0xB6;
            gutmanns[30][i++] = 0xDB;
            if (i < fi.Length)
            {
                gutmanns[4][i] = 0x55;
                gutmanns[5][i] = 0xAA;
                gutmanns[6][i] = 0x49;
                gutmanns[7][i] = 0x24;
                gutmanns[8][i] = 0x92;
                gutmanns[10][i] = 0x11;
                gutmanns[11][i] = 0x22;
                gutmanns[12][i] = 0x33;
                gutmanns[13][i] = 0x44;
                gutmanns[15][i] = 0x66;
                gutmanns[16][i] = 0x77;
                gutmanns[17][i] = 0x88;
                gutmanns[18][i] = 0x99;
                gutmanns[20][i] = 0xBB;
                gutmanns[21][i] = 0xCC;
                gutmanns[22][i] = 0xDD;
                gutmanns[23][i] = 0xEE;
                gutmanns[24][i] = 0xFF;
                gutmanns[28][i] = 0xB6;
                gutmanns[29][i] = 0xDB;
                gutmanns[30][i++] = 0x6D;
                if (i < fi.Length)
                {
                    gutmanns[4][i] = 0x55;
                    gutmanns[5][i] = 0xAA;
                    gutmanns[6][i] = 0x24;
                    gutmanns[7][i] = 0x92;
                    gutmanns[8][i] = 0x49;
                    gutmanns[10][i] = 0x11;
                    gutmanns[11][i] = 0x22;
                    gutmanns[12][i] = 0x33;
                    gutmanns[13][i] = 0x44;
                    gutmanns[15][i] = 0x66;
                    gutmanns[16][i] = 0x77;
                    gutmanns[17][i] = 0x88;
                    gutmanns[18][i] = 0x99;
                    gutmanns[20][i] = 0xBB;
                    gutmanns[21][i] = 0xCC;
                    gutmanns[22][i] = 0xDD;
                    gutmanns[23][i] = 0xEE;
                    gutmanns[24][i] = 0xFF;
                    gutmanns[28][i] = 0xDB;
                    gutmanns[29][i] = 0x6D;
                    gutmanns[30][i++] = 0xB6;
                }
            }
        }

        // Passes 15, 20, 26, 27 and 28 repeat earlier patterns, so reuse those buffers
        gutmanns[14] = gutmanns[4];
        gutmanns[19] = gutmanns[5];
        gutmanns[25] = gutmanns[6];
        gutmanns[26] = gutmanns[7];
        gutmanns[27] = gutmanns[8];

        // Write each pass over the file in place. WriteThrough bypasses intermediate
        // caching, and DeleteOnClose removes the file when the stream is disposed.
        using (Stream s = new FileStream(
            fi.FullName,
            FileMode.Open,
            FileAccess.Write,
            FileShare.None,
            (int)fi.Length,
            FileOptions.DeleteOnClose | FileOptions.RandomAccess | FileOptions.WriteThrough))
        {
            for (long i = 0; i < gutmanns.Length; i++)
            {
                s.Seek(0, SeekOrigin.Begin);
                s.Write(gutmanns[i], 0, gutmanns[i].Length);
                s.Flush();
            }
        }
    }
}
Jesse C. Slicer
+1  A: 

See this: Gutmann's paper

Will M
+4  A: 

Daniel Feenberg (an economist at the private National Bureau of Economic Research) claims that the chances of overwritten data being recovered from a modern hard drive amount to "urban legend":

Can Intelligence Agencies Read Overwritten Data?

So, theoretically, overwriting the file once with zeroes would be sufficient.

Ron Nelson
very interesting link
kjack
+2  A: 

The reason you want this is not hard disks, but SSDs. They remap clusters without telling the OS or filesystem drivers. This is done for wear-leveling purposes. So the chances are quite high that the 0 bit written goes to a different place than the previous 1. Removing the SSD controller and reading the raw flash chips is well within the reach of even corporate espionage. But with 36 full-disk overwrites, the wear leveling will likely have cycled through all spare blocks a few times.
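
A toy flash-translation-layer model makes the point (the class, block counts, and method names here are invented purely for illustration, not any real SSD firmware): overwriting a logical block gets remapped to a fresh physical block, so the old physical block keeps the original data until the controller eventually recycles it.

// Toy flash-translation-layer (FTL) sketch: logical writes are remapped to
// fresh physical blocks for wear leveling, so the physical block that held
// the old data is not touched by the "overwrite".
using System;
using System.Collections.Generic;

class ToyFtl
{
    private readonly string[] physical;        // simulated physical flash blocks
    private readonly Dictionary<int, int> map; // logical block -> physical block
    private readonly Queue<int> spare;         // spare blocks handed out round-robin

    public ToyFtl(int physicalBlocks)
    {
        physical = new string[physicalBlocks];
        map = new Dictionary<int, int>();
        spare = new Queue<int>();
        for (int i = 0; i < physicalBlocks; i++) spare.Enqueue(i);
    }

    public void Write(int logicalBlock, string data)
    {
        // Wear leveling: always write to a fresh physical block and remap.
        int target = spare.Dequeue();
        physical[target] = data;

        int old;
        if (map.TryGetValue(logicalBlock, out old))
            spare.Enqueue(old); // the old block is recycled later, not wiped now

        map[logicalBlock] = target;
    }

    public void Dump()
    {
        for (int i = 0; i < physical.Length; i++)
            Console.WriteLine("physical block {0}: {1}", i, physical[i] ?? "(empty)");
    }

    static void Main()
    {
        ToyFtl ftl = new ToyFtl(4);
        ftl.Write(0, "secret data");
        ftl.Write(0, "000000000000"); // the "secure" overwrite of logical block 0
        ftl.Dump(); // the secret is still sitting in the first physical block
    }
}

Reading the raw flash chips, as described above, sees the physical blocks directly, which is why a file-level overwrite on an SSD can leave the original contents intact.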

MSalters
Could you expand on this a little? I thought Eraser just overwrote the file, not the entire disk. 36 overwrites of the entire disk would take far longer than the time I've seen Eraser take.
kjack
Hmm, it seems the developer of "eraser" doesn't understand SSDs then. If you "erase" a single file on an SSD 36 times, you'll likely end up with 36*sizeof(file) worth of zeroes, AND the original file in other blocks.
MSalters