I'm trying to optimize my ASP.NET thumbnailing script so that it doesn't resize all the images every time. One part of the problem is choosing the hash function for the thumbnail naming/checking procedure. Is CRC32 up to the task? I'm asking because the input data is small (only the relative path, size and date).
A:
You have many hashing choices.
As Lasse V. Karlsen commented, if you're hashing the filenames, string.GetHashCode() would be good enough in most cases. If you're actually hashing the content of the files, your choices range from CRC32 and MD5 to SHA-1, SHA-256 and beyond.
If you're hashing the files, I would guess MD5/SHA-1 is going to be good enough. If I were you, I would build a test case (perhaps with a virtual machine) that runs on the lowest hardware your app should support and try MD5/SHA-1. See if that speed is good enough for you and check for hash collisions (so have as many pictures as possible in your test case).
Found a good article here with many hashing functions
EKS
2010-02-09 14:50:25
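As a minimal sketch of the MD5 suggestion in the answer above, applied to the small input the question describes (relative path, size and date): the class name `ThumbnailNaming`, the method `ThumbnailName`, the `|` separator and the `.jpg` extension are all hypothetical choices for illustration, not anything from the original posts.

```csharp
using System;
using System.Globalization;
using System.Security.Cryptography;
using System.Text;

static class ThumbnailNaming
{
    // Hypothetical helper: derives a cache file name from the source image's metadata.
    public static string ThumbnailName(string relativePath, long sizeBytes, DateTime lastWriteUtc)
    {
        // Join the fields with '|', which is not a valid character in Windows file names,
        // so two different (path, size, date) triples cannot produce the same key string.
        string key = relativePath + "|"
            + sizeBytes.ToString(CultureInfo.InvariantCulture) + "|"
            + lastWriteUtc.Ticks.ToString(CultureInfo.InvariantCulture);

        using (MD5 md5 = MD5.Create())
        {
            byte[] digest = md5.ComputeHash(Encoding.UTF8.GetBytes(key));
            // 16 bytes become 32 lowercase hex characters, which are safe in a file name.
            return BitConverter.ToString(digest).Replace("-", "").ToLowerInvariant() + ".jpg";
        }
    }
}
```

If a file with the returned name already exists in the thumbnail folder, the resize can be skipped; any change to the path, size or date yields a different name, so a stale thumbnail is simply never looked up again.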
GetHashCode seems interesting (I didn't know it existed), but its hash space is only one integer wide, which would leave me with ~4 billion combinations. There's got to be a collision or two in such a small space? I don't know how many permutations there are in the string type. Otherwise it fits the bill just right.
vani
2010-02-10 11:43:39
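The collision worry in the comment above can be estimated with the birthday approximation. This is a rough sketch that assumes the hash values are spread uniformly over N = 2^32 possibilities, which is an idealization (GetHashCode is not guaranteed to be uniform); n is the number of distinct strings hashed.

```latex
p(n) \approx 1 - \exp\!\left(-\frac{n(n-1)}{2N}\right), \qquad N = 2^{32}

\begin{aligned}
n = 1\,000   &: \quad p \approx 0.012\% \\
n = 10\,000  &: \quad p \approx 1.2\% \\
n \approx 77\,000 &: \quad p \approx 50\%
\end{aligned}
```

So under this assumption a 32-bit hash is likely fine for a few thousand thumbnails, but a collision becomes more likely than not somewhere around 77,000 entries.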
What I normally do with GetHashCode() is first compare the ints and then compare the strings. That way I'm 100% sure it's a true match, and I also get the high speed of comparing ints.
EKS
2010-02-10 12:53:26
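The "compare the int first, then the string" approach from the last comment might look like the sketch below. The `CacheKey` type is hypothetical and only illustrates the idea. One caveat to keep in mind: the value of string.GetHashCode() is not guaranteed to be stable across processes or .NET versions, so this check suits in-memory lookups rather than names persisted to disk.

```csharp
using System;

// Hypothetical wrapper that caches a string's hash code for fast comparisons.
sealed class CacheKey : IEquatable<CacheKey>
{
    private readonly string _text;
    private readonly int _hash;

    public CacheKey(string text)
    {
        _text = text ?? throw new ArgumentNullException(nameof(text));
        _hash = text.GetHashCode(); // computed once; only meaningful within this process
    }

    public bool Equals(CacheKey other)
    {
        if (other is null) return false;
        // Cheap check first: different hash codes mean the strings cannot be equal.
        if (_hash != other._hash) return false;
        // Same hash code: confirm with an ordinal comparison to rule out a collision.
        return string.Equals(_text, other._text, StringComparison.Ordinal);
    }

    public override bool Equals(object obj) => Equals(obj as CacheKey);

    public override int GetHashCode() => _hash;
}
```

Because `Equals` and `GetHashCode` are both overridden, such a key can also be used directly in a `Dictionary` or `HashSet`, which apply the same hash-then-equality check internally.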