views:

184

answers:

8

To save a file i defined the following method

public int encrypt(String fileName, String password) {
   return (fileName.concat(password)).hashCode();
}

This returns a hashvalue that is stored in a file. Whenever the user wants to access the file, he enters the password, and if the same hash is generated, he can access the file.

I suppose this isn't really safe, but how safe it is? How high is the chance that String#hashCode generates the same hash with two different inputs?

EDIT:

According to your answers I changed the code:

public String encrypt(String password) {
        String hash = "";
        try {
            MessageDigest md5 = MessageDigest.getInstance("SHA-512");
            byte [] digest = md5.digest(password.getBytes("UTF-8"));
            hash = Arrays.toString(digest);
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        } catch (NoSuchAlgorithmException e) {
            e.printStackTrace();
        }
        return hash;
    }

So it should be better now??

+2  A: 

Honestly I don't know how collision resistant Java's hashCode() is. If I were to guess, I'd say not very. I tested it before and found a couple of collisions after a just few hundred thousand inputs.

Since you are dealing with passwords here, you should really use a cryptographic hash like SHA1.

NullUserException
I was messing around the other night with permutations and found hashcode() collisions in 9! entries.
zmf
Yeah, 9! = 362880
NullUserException
+5  A: 

It is usually a bad idea to rely on non-cryptographic functions for serving security relevant purposes. As you can never be sure which implementation is used (and will be used in the future) to calculate a string's hash code you should prefer a cryptographic secure hash code algorithm. I would advise using SHA-1 or SHA-256. http://www.bouncycastle.org/ has implementations for many hash algorithms.

tob
+2  A: 

I'd be concerned that this code is non-portable. There is no guarantee that one JVM will produce the same hash value as another JVM. That seems very risky.

Kirk Woll
The hash algorithm is [well-specified.](http://download.oracle.com/javase/6/docs/api/java/lang/String.html#hashCode%28%29) It *is* portable across JVMs.
erickson
Ah, yes, forgot there was a special version defined for strings. Good catch.
Kirk Woll
+4  A: 

String.hashCode is not suitable for hashing passwords. You need a cryptographic hash instead.

String.hashCode is designed to be very fast to compute. Its primary use is for a key in a hash table. For this use, an occasional collision is not an issue. Cryptographic hashes are slower to compute, but by definition no one knows how to generate collisions for a good cryptographic.

More importantly, given the value of password.hashCode(), it's possible to find password (with high confidence, though not with certainty since many passwords have the same hash). That's not something you ever want to happen. Cryptographic hashes, on the other hand, are designed so that it's impossible to find the password knowing the hash (mathematically speaking, no one knows how to find the password from the hash in their lifetime).

Cryptographic hashes are available in the Java standard library, through java.security.MessageDigest.

ADDED: There's one more complication: it's a bad idea to directly hash of a password. The reason is that an attacker could try all likely passwords (e.g. dictionary words, people's names, etc.) The standard solution to this problem is to concatenate the password with a random string called a salt before computing the hash: you do someting like sha.digest((salt+password).getBytes()). The salt makes it impratical for the attacker to precompute all hashes of likely passwords.

Usually the salt randomly generated when the user chooses his/her password, and it is stored next to the password hash in a user database, but from what you show of your scheme there is no such thing. Given your design, it would be reasonable to use the filename as the salt: fileName.concat(encrypt(fileName + password)).

Gilles
You couldn't find the password for certain (after all, there will be many strings all hashing to the same value) - but you could find a string which gives the same hash.
Jon Skeet
@Jon: If you assume that the password is composed of printable characters and has a length in a narrow range, there's a good chance you can actually find it. And that's the real prize, because users reuse their passwords.
Gilles
@Gilles: Yes, there's a reasonable chance. It's a lot better with your edit :)
Jon Skeet
+16  A: 

This is a bad idea - you should use a normal cryptographic hash such as SHA-1, just as NullUserException says.

However, it would be portable - the docs for String.hashCode() state the algorithm explicitly. Any JRE implementing the docs properly should give the same hash code. However, because of the way hashCode()'s algorithm works, it's pretty easy to find a string which will generate any particular hash code - even one starting with a specific prefix - so an attacker who knew the hash could attack your application very easily. Cryptographic hashes are designed to make it hard to engineer a key to match a particular hash.

Jon Skeet
+1  A: 

Here is String.hashCode()'s implementation:

s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]

Publicly available here...

This is actually VM-independant, and it has not been Java Version-dependant in the past. The implementation has stayed the same.

Collision safety is OK IMHO, however it is a bad idea to use it for cryptographic purposes for obvious reasons.

BenoitParis
+2  A: 

It's not as hard as you might think to hash data, and it's better to use a real hashing algorithm. If you have a byte array that contains the password, you can just do something like this. If you're getting the byte array from a string, make sure to specify the encoding (i.e. UTF-8) when calling getBytes();

Here's a simple example using MD5.

    try {
        MessageDigest md5 = MessageDigest.getInstance( "MD5" );

        byte [] digest = md5.digest( data );

        return digest;
    } catch( java.security.NoSuchAlgorithmException ex ) {
        // Insert error handling here.
    }
Shawn D.
thanks. is there any possibility that i don't have to return a byte array?
Roflcoptr
Well, you can return whatever you want. The return value from the digest is a byte array. You can hex or base64 encode it if you'd prefer to return a string.
Shawn D.
`return new String(digest, Charsets.UTF_8);` (requires guava for the Charsets constants)
Willi
A: 

the #1 problem is the hash is only 32 bits. that's way too short. a kid with BASIC can break it in a second.

md5 is 128 bits long, and it's considered weak now.

irreputable
MD5 is not considered weak because of its length, but because they found a way go generate collisions.
NullUserException
plausible collision because of its length.
irreputable