I'm computing SHA-1 checksums for 15,000 images (40 KB to 1.0 MB each, approximately 1.8 GB total). I'd like to speed this up because it will be a key operation in my program, and right now it takes between 500 and 600 seconds.
I first tried the following, which took 500 seconds:
public string GetChecksum(string filePath)
{
    // Dispose the stream as well as the hash object so file handles are not leaked.
    using (FileStream fs = new FileStream(filePath, FileMode.Open))
    using (SHA1Managed sha1 = new SHA1Managed())
    {
        return BitConverter.ToString(sha1.ComputeHash(fs));
    }
}
Then I thought the chunks SHA1Managed was reading might be too small, so I wrapped the file in a BufferedStream and set the buffer size larger than any of the files I'm reading:
public string GetChecksum(string filePath)
{
    using (var bs = new BufferedStream(File.OpenRead(filePath), 1200000))
    {
        using (SHA1Managed sha1 = new SHA1Managed())
        {
            return BitConverter.ToString(sha1.ComputeHash(bs));
        }
    }
}
This actually took 600 seconds.
Is there anything I can do to speed up these I/O operations, or am I stuck with what I've got?
Following x0n's suggestion, I tried just reading each file into a byte array and discarding the result. That alone took ~480 seconds, so it appears I'm I/O bound.
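A read-only test of that kind can be sketched roughly as follows. This is a minimal illustration, not the exact code that was run: the class name `ReadOnlyBenchmark`, the method `Measure`, and the assumption that all images sit in a single directory are placeholders.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class ReadOnlyBenchmark
{
    // Hypothetical helper: reads every file in the directory into a byte
    // array, discards the contents, and returns the elapsed time.
    // No hashing is done, so the result reflects file I/O cost only.
    public static TimeSpan Measure(string directory, out long totalBytes)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        totalBytes = 0;

        foreach (string path in Directory.EnumerateFiles(directory))
        {
            byte[] data = File.ReadAllBytes(path);
            totalBytes += data.Length; // keep only the size, drop the bytes
        }

        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

If this runs nearly as long as the hashing version, the time is going into reading the files rather than into SHA-1. With the numbers above, 1.8 GB in about 480 seconds works out to under 4 MB/s of read throughput.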