I want to generate a hash code for a file. Using C#, I would do something like this and then store the value in a database.

byte[] b = File.ReadAllBytes(@"C:\image.jpg");
string hash = ComputeHash(b);

Now, if I use, say, a Java program that implements the same hashing algorithm (MD5), can I expect the hash values to be equal to the value generated in C#? What if I execute the Java program in different environments: Windows, Linux or Mac?

+1  A: 

Hash values generated from the same input and with the same algorithm are defined to be equal. 1+1=2, regardless of the programming language I program this in.

Otherwise the internet would not work at all, you know.
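
As a quick check, here is a minimal Java sketch that hashes the three bytes "abc", which is one of the test inputs listed in the MD5 specification (RFC 1321). The class and method names (Md5Vector, md5Hex) are made up for illustration. Any correct MD5 implementation, in any language and on any OS, should print the same hex string for these bytes.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Md5Vector {
    // Returns the MD5 digest of the given bytes as lowercase hex.
    // (Hypothetical helper name, not part of any library.)
    static String md5Hex(byte[] input) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest(input)) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // RFC 1321 test input "abc", encoded as ASCII bytes.
        byte[] input = "abc".getBytes(StandardCharsets.US_ASCII);
        System.out.println(md5Hex(input));
        // prints 900150983cd24fb0d6963f7d28e17f72
    }
}
```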

TomTom
+1  A: 

My suggestion would be to use a common, widely implemented hashing algorithm such as MD5 on both sides, so that both programs produce the same hash values.

Ritesh M Nayak
+7  A: 

Hash values are not globally unique. But that is not what you are really asking.

What you really want to know is whether a hashing algorithm (such as MD5) will produce the same hash value for identical files on different operating system platforms. The answer to that is "yes" ... provided that files are byte-for-byte identical.

In the case of a binary format, that should be the case. In the case of text files, transcoding between different character encodings or changing line termination sequences will make the files different at the byte level and result in different MD5 hash values.
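
To illustrate the text-file caveat, here is a small Java sketch (the class name LineEndingHash and the sample strings are invented for the example). It hashes the same two lines of text three ways: with Unix line endings, with Windows line endings, and re-encoded as UTF-16. Only byte-for-byte identical inputs give the same MD5 value.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class LineEndingHash {
    // Returns the MD5 digest of the given bytes as lowercase hex.
    static String md5Hex(byte[] input) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest(input)) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        byte[] unix = "hello\nworld\n".getBytes(StandardCharsets.UTF_8);
        byte[] windows = "hello\r\nworld\r\n".getBytes(StandardCharsets.UTF_8);
        byte[] utf16 = "hello\nworld\n".getBytes(StandardCharsets.UTF_16LE);

        // Same bytes, same hash.
        System.out.println(md5Hex(unix).equals(md5Hex(unix.clone())));  // true
        // Different line endings mean different bytes, so a different hash.
        System.out.println(md5Hex(unix).equals(md5Hex(windows)));       // false
        // Same text in another encoding is also a different byte sequence.
        System.out.println(md5Hex(unix).equals(md5Hex(utf16)));         // false
    }
}
```

This is why a text file that passes through a tool which rewrites line endings or encodings may no longer match the hash stored earlier.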

Stephen C
+1  A: 

If the hashing algorithm and the input are the same, the generated hash value will be the same irrespective of language or environment. A hashing algorithm is a deterministic function of its input bytes, which is why the result is the same in all languages.
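
For the Java side of the question, a rough counterpart to the C# snippet might look like the sketch below. It assumes the C# ComputeHash helper produces an MD5 digest rendered as lowercase hex; if it formats the digest differently (uppercase, Base64, with dashes), the strings would need to be normalized before comparing. The class and method names (FileMd5, md5OfFile) are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class FileMd5 {
    // Reads the whole file into memory and returns its MD5 digest as lowercase hex.
    static String md5OfFile(Path file) throws IOException, NoSuchAlgorithmException {
        byte[] bytes = Files.readAllBytes(file);
        MessageDigest md = MessageDigest.getInstance("MD5");
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest(bytes)) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.out.println("usage: java FileMd5 <path-to-file>");
            return;
        }
        // The path comes from the command line, e.g. the image file from the question.
        System.out.println(md5OfFile(Paths.get(args[0])));
    }
}
```

Reading the file as raw bytes (rather than as text) matters here, since it avoids any platform-specific decoding of the contents.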

Vaishak Suresh
A: 

I wish I could comment on this but I don't have enough reputation to do that.

While I don't know what purpose you want to use a hash algorithm for, I'd like to point out that collisions have been found for MD5, so it should be considered weak against deliberate tampering: an attacker can craft two different files with the same MD5 hash. The same remark applies to the SHA-1 algorithm.

More information here: http://www.mathstat.dal.ca/~selinger/md5collision/

So if you want to use a hash algorithm for security purposes, you might take a look at SHA-256 or SHA-512, which are stronger for now.

Otherwise you can probably keep going with MD5.
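
If you do go for a stronger algorithm, in Java the change is usually just the algorithm name passed to MessageDigest. Here is a minimal sketch (the class and method names Sha256Demo and hashHex are made up), checked against the well-known SHA-256 value for the input "abc":

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Sha256Demo {
    // Hashes the input with the named algorithm ("MD5", "SHA-256", "SHA-512", ...)
    // and returns the digest as lowercase hex.
    static String hashHex(String algorithm, byte[] input) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance(algorithm);
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest(input)) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        byte[] input = "abc".getBytes(StandardCharsets.US_ASCII);
        System.out.println(hashHex("SHA-256", input));
        // prints ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
    }
}
```

Note that a SHA-256 digest is 64 hex characters instead of MD5's 32, so a database column sized for MD5 would need to be widened.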

My two cents.

ereOn