Does anyone have any info on changing the hash of a file without corrupting it?

I read about appending a null byte to the end of the file, thus changing the MD5 without corrupting it. Anyone have any info?

The language I wish to do this in is PHP.

Thanks.

+6  A: 

It depends on exactly what the applications expect when they read this file. If, for example, it's a text file, you could simply insert a space following one of the paragraphs. This doesn't change the readability of the file by humans but it will change the MD5.

Likewise for basic HTML files or source files such as C or PHP where the spacing doesn't matter (as long as you insert the space in a syntactically insignificant area, so not inside string constants, for example). Put in some extra spaces or add newline characters at the end and you'll find the behavior of your web pages doesn't change.
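Since the question asks for PHP, here is a minimal sketch of the trailing-newline idea. It assumes the file is one where extra whitespace at the end is harmless (HTML, source code, plain text). appendNewline() is a made-up helper name, and the demo works on a throwaway temp file rather than anything real.

```php
<?php
// Sketch only: appendNewline() is a hypothetical helper, not a PHP built-in.
function appendNewline(string $path): bool
{
    // FILE_APPEND writes at the end of the file instead of overwriting it.
    return file_put_contents($path, "\n", FILE_APPEND) !== false;
}

// Demo on a throwaway file so nothing real is modified.
$path = tempnam(sys_get_temp_dir(), 'md5');
file_put_contents($path, "<p>Hello</p>\n");

$before = md5_file($path);
appendNewline($path);
$after = md5_file($path);

// Compare the digest before and after the append.
var_dump($before !== $after);

unlink($path);
```

Each extra call appends another newline, so the digest changes again every time.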

However, this is unlikely to work for an executable file, since the modified binary will probably crash and burn when you run it (if indeed it even loads - some loaders may use checksums for the load sections).

You need to specify exactly what corruption means in the case you're talking about.

Update:

For example, in JPEG files, it's probably a simple matter of replacing the EOI marker at the end with a unique COM section followed by an EOI marker. The EOI marker is the end of image and you should be able to insert the comment section (with a unique comment) before it. This would make each JPEG have a different MD5 while still presenting the same image. See here.
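A rough PHP sketch of that COM-before-EOI idea follows. It assumes the file starts with the SOI marker (FF D8) and ends exactly with the EOI marker (FF D9), with no trailing bytes. addJpegComment() is a hypothetical name. Comment segments usually sit before the image data, and I haven't checked that every decoder accepts one this late in the file, so test the output in the viewers you care about.

```php
<?php
// Hypothetical helper: insert a COM segment (marker FF FE) just before the
// trailing EOI marker (FF D9) of a JPEG held in a string.
function addJpegComment(string $jpeg, string $comment): string
{
    if (substr($jpeg, 0, 2) !== "\xFF\xD8") {
        throw new InvalidArgumentException('Missing SOI marker at start');
    }
    if (substr($jpeg, -2) !== "\xFF\xD9") {
        throw new InvalidArgumentException('File does not end with EOI marker');
    }
    // The 2-byte big-endian length field counts itself, so the comment
    // payload can be at most 65535 - 2 bytes long.
    if (strlen($comment) > 65533) {
        throw new InvalidArgumentException('Comment too long for one segment');
    }
    $segment = "\xFF\xFE" . pack('n', strlen($comment) + 2) . $comment;

    // Everything except the old EOI, then the comment, then a fresh EOI.
    return substr($jpeg, 0, -2) . $segment . "\xFF\xD9";
}

// Demo on a stand-in byte string (not a real image): SOI + filler + EOI.
$fake = "\xFF\xD8" . "filler" . "\xFF\xD9";
$out  = addJpegComment($fake, uniqid('', true));
var_dump(md5($fake) !== md5($out));
```

For a real file you would pass in file_get_contents() of the JPEG and write the returned string back out with file_put_contents().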

With ZIP files, you can actually insert arbitrary data in between each file since the catalog at the end lists files with their offsets. See here for details. Unfortunately, I'm not familiar with the internals of RAR files.

paxdiablo
Sorry, most of the files are .ZIP and .RAR, with some .jpegs.
Joseph
+3  A: 

Sounds like you might be better off just changing those duplicate files to symbolic links with ln -s otherfolder/file file (assuming the server is on a *nix platform).
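If you'd rather do this from PHP than from the shell, something along these lines might work. It is only a sketch: replaceWithSymlink() is a made-up name, it assumes a *nix filesystem where the PHP user may create links, and it only swaps the duplicate when both files currently have the same MD5.

```php
<?php
// Hypothetical helper: replace $duplicate with a symlink to $original,
// but only if both are regular files with identical contents.
function replaceWithSymlink(string $original, string $duplicate): bool
{
    if (is_link($duplicate) || !is_file($original) || !is_file($duplicate)) {
        return false;
    }
    $target = realpath($original);
    // Refuse to act when both paths resolve to the same file.
    if ($target === false || $target === realpath($duplicate)) {
        return false;
    }
    if (md5_file($target) !== md5_file($duplicate)) {
        return false; // contents differ, so this is not a true duplicate
    }
    if (!unlink($duplicate)) {
        return false;
    }
    // symlink(target, link) mirrors "ln -s target link".
    return symlink($target, $duplicate);
}
```

Called as replaceWithSymlink('otherfolder/file', 'file'), this has the same effect as the ln -s command above, with the link pointing at the absolute path of the original.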

Mike Nelson
Now that's a good answer!
Alix Axel
+1  A: 

If you are primarily dealing with .ZIP and .RAR files, find a ZIP/RAR library for PHP, and simply add a tiny random file to every zip/rar.
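For the ZIP half, a minimal sketch using PHP's ZipArchive class (this needs the zip extension to be enabled). addRandomEntry() and the ".unique-" entry name are made up for illustration. RAR is not covered here, since I'm not aware of a standard PHP extension that can write RAR archives.

```php
<?php
// Hypothetical helper: add a tiny entry with random name and contents to an
// existing ZIP archive, which changes the archive's MD5.
function addRandomEntry(string $zipPath): bool
{
    $zip = new ZipArchive();
    // open() returns true on success and an integer error code on failure.
    if ($zip->open($zipPath) !== true) {
        return false;
    }
    $name = '.unique-' . bin2hex(random_bytes(8)) . '.txt';
    $added = $zip->addFromString($name, bin2hex(random_bytes(16)));

    // close() writes the changes to disk; always call it, then report
    // success only if both steps worked.
    $closed = $zip->close();
    return $added && $closed;
}
```

The existing entries are left untouched, but anyone extracting the archive will see the extra small file.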

For JPEGs, follow paxdiablo's answer.

Aequitarum Custos