views:

145

answers:

4

This is more of a cryptography theory question, but is it possible that the result of a hash algorithm will ever be the same value as the source? For example, say I have a string:

baf34551fecb48acc3da868eb85e1b6dac9de356

If I get the SHA1 hash on it, the result is:

4d2f72adbafddfe49a726990a1bcb8d34d3da162

In theory, is there ever a case where these two values would match? I'm not asking about SHA1 specifically here - it's just my example. I'm just wondering if hashing algorithms are built in such a way as to prevent this.

+7  A: 

Well, it would depend on the hashing algorithm - but I'd be surprised to see anything explicitly prevent this. After all, it really shouldn't matter.

I suspect it's very unlikely to happen, of course (for cryptographic hashes)... but even if it does, that shouldn't cause a problem.

For non-crypto hashes (used in hash tables etc) it would be perfectly reasonable to return the source value in some cases. For example, in Java, Integer.hashCode() just returns the embedded value.

Jon Skeet
it'd be funny if someone could find an example!
Evernoob
In .NET Int32.GetHashCode also returns the int value.
Meta-Knight
Worst case, the chances of a hash returning the input would be the same as two inputs returning the same hash - i.e. your chances of winning the lottery are much much higher.
Mark Ransom
Evernoob: get to it! Let me know how you're getting on after the first few trillion millenia!
bobince
+3  A: 

Sure, the Python hashing algorithm for integers returns the value of the integer. So hash(1) == 1.

Alex Gaynor
+4  A: 

Given a good hashing algorithm, one that returns a seemingly random output, I believe there should be on average one input that gives itself as the output. Let's say the hash can give N possible outputs. That means there are N possible inputs for which this is possible. For each of those, the odds of the output matching the input is 1/N, so there the expected number of fixed points is N*1/N, or 1.

Adam Crume
Very nicely derived.
Michael Mullany
+2  A: 

A hash function might be defined to avoid ‘fixed points’ where hash(x)==x, but your hash-quine differs a little in that you're taking the string representation in hex of the hash rather than the raw binary. It would, I think, be infeasible to design a hash that could frustrate that, and it's mathematically less interesting since it depends on the arbitrary mapping of 0-F to ASCII character codes.

See http://stackoverflow.com/questions/235785/is-there-an-md5-fixed-point-where-md5x-x for a discussion about fixed points in MD5. The probability calculation would be equally true for hex hash-quines and any other hash function with 128 bits of output.

bobince
I hadn't considered that I was hashing a string representation - in my question, I meant to refer to the hex value. I guess my question could really be phrased as "If I hash a password, is there ever a chance that the result is actually the password itself?"
rwmnau
Unless your password is exactly the length of the hash function's output, no. :)
Nick Johnson