tags:

views:

611

answers:

17

I've read a number of questions which suggest hashing passwords and storing them in the database.

When someone logs in, you hash the password provided with what you have stored.

What I do not understand is how this can work?? Two strings can hash to the same value - Not likely but DEFINITELY possible

Can someone help me please?

EDIT: Can anyone give a statistic of the collision likeliness?

+21  A: 

You should not store the password unencrypted because your database admins should not have access to customer passwords.

Hashing the passwords prevents database admins from being able to see the password.

The very small chance of a collision is not a problem as this does not significantly increase the chance of someone trying to brute force a customer's password getting a match before you lock-out the accounts.

PS it is also good practice to salt the passwords before hashing to make it more difficult still to brute force if a hacker somehow got access to the password table. Makes it harder to use rainbow tables etc.

Sam Meldrum
To minimize the risk of a brute-force attack on multiple passwords you should also salt them.
Joey
Passwords don't have much flavor; salt them always.
Will
Touch of pepper adds some bite too, I find.
Wim Hollebrandse
Use a unique sprinkling of salt too :)
Aim Kai
Just want to point out that a *hash* is a one way function, whereas *encryption* implies that someone with the right key could decrypt the original password. Hashing is best for password checks, you need encryption to send secure html for example.
Rob Fonseca-Ensor
+1 for mentioning the rainbow table attack
Zyphrax
With brute force attacks in mind, its also a good idea to timeout multiple failed attempts if they seem to be occurring in a manner that is indicative of an attack, IE multiple extremely fast log in attempts.
Bit Destroyer
+1  A: 

The hash function must be developed such that it is very unlikely to give the same hash for 2 different inputs, i.e. it is collision resistant. More information is available at wikipedia.

danio
+2  A: 

You can say "DEFINITELY possible" in the sense that it's provably possible, but with a good hashing algorithm this is exceedingly unlikely. There are plenty of articles available about choosing a hashing algorithm, and collision rate is a huge factor in that selection process.

The reason that people advocate hashing versus encrypting is because hashing is a one-way operation. The very possibility of a collision is what makes hashing a good choice from a user security perspective; because two values can produce the same hash, hashing can't be reversed. This makes it impossible (or nearly impossible) for someone to hack your database and gain access to login credentials for other sites (since users will frequently use the same password in your system as they do in [many] others).

Adam Robinson
+4  A: 

Even when two strings can hash to the same value (and they definitely do, because the space of possible values is much bigger than the space of hashes), it is still not so easy to find such string pairs (provided that you use a strong hash function).

Therefore if an attacker would want to login as somebody else without knowing his password, he would have to find a password which has the same hash which should be just as hard as finding the password (non-invertability of hash function is a basic property).

If you want to use hashing in .NET, try something like

    public static string ComputeHash(string plaintext)
    {
        HashAlgorithm mhash = new SHA1CryptoServiceProvider();
        byte[] bytValue = Encoding.UTF8.GetBytes(plaintext);
        byte[] bytHash = mhash.ComputeHash(bytValue);
        mhash.Clear();
        return Convert.ToBase64String(bytHash);
    }
Thomas Wanner
+9  A: 

Two strings can hash to the same value - Not likely but DEFINITELY possible

Yes, but if the hash is big enough and of good quality, it's unlikely enough not to worry about. In colloquial terms: every single user of the app getting hit by lightning simultaneously is not likey, but definitely possible. Do you worry about that?

Michael Borgwardt
oh great, now another thing to worry about. THANKS.
zaphod
you have to multiply risk by consequences. If a sysadmin (or hacker) has the same password as a user, they will see that their hashes and therefore passwords are the same, and will be able to log in as that user. If you're building a bank website, this becomes an issue.
Rob Fonseca-Ensor
@Rob: that's why you append random salt. Not only hashes of two identical passwords will be different, they will be harder to crack.
SF.
@SF Err yeah that's exactly what I was trying to advocate :$
Rob Fonseca-Ensor
A: 

With a proper hash algorithm the chances of a collision are extremely small. As long as you can consistently compute the hash, it doesn't matter if two strings happen to have the same hash (as long as it doesn't happen very often). You are just comparing the hash of the password that the user entered to the hash that you have stored in the database.

TLiebe
+1  A: 

A good implementation of this.

  private static string CreateSalt(int size)
  {
   //Generate a cryptographic random number.
   RNGCryptoServiceProvider rng = new RNGCryptoServiceProvider();
   byte[] buff = new byte[size];
   rng.GetBytes(buff);

   // Return a Base64 string representation of the random number.
   return Convert.ToBase64String(buff);
  }

private static string CreatePasswordHash(string pwd, string salt)
  {
   string saltAndPwd = String.Concat(pwd, salt);
   string hashedPwd =
    FormsAuthentication.HashPasswordForStoringInConfigFile(
    saltAndPwd, "sha1");

   return hashedPwd;
  }

Source: http://davidhayden.com/blog/dave/archive/2004/02/16/157.aspx

Mendy
A: 

Take for example the md5 hash string -- it has 128 bits, that means 2^128 different hashes can be generated. Even taking into account the Birthday Paradox, it is highly unlikely that a string which isn't the actual password can be found to generate the hash value (of course, md5 has been found insecure, but that's another issue, md5 is just an example).

So say you have a password, the program hashes it, and puts the hash in the database. When you try to log in, what you type is hashed and checked against that hash -- if it is a guess, there is a 1 in 2^128 chance that you are correct. That is infinitesimally small, and to actually hack the password this way (brute force) is 'infeasible'. Increase the number of bits, and it actually becomes impossible unless the hashing algorithm can be at least partially reverse engineered.

Vanwaril
+3  A: 

What you refer to is called "Collision vulnerability". But first a little background.

If you store an unencrypted password, it is a security hole, as you might have guessed and Sam suggests. However, you can use a proven algorithm, that reduces this possibility. You should definitely not try inventing your own (since you don't seem to be an Algorithm developer).

I usually use MD5, which is available in databases such as mysql. This way, you can embed the check in your database queries as well.

Now, MD5 unfortunately is not Collision resistant. However, the chances for collision are pretty slim. There might be other suggestions on message boards that you can look at. I know SHA-2 to be one such possibility. I do not know how easy it is to use in applications.

Tanmay
A collision vulnerability is a vulnerability in which two messages, one innocent, one hostile, are discovered that have the same hash. Can you explain how the collision vulnerability is relevant to the user's question about passwords? Because I am not following your train of thought here.
Eric Lippert
For me, the question translated to "can two different strings create the same hash". I recently found out that in MD5 its possible. Now, some other user pointed out that you don't need passwords to be unique. But in some domains (like my current one) you don't type user id so password should be unique. In that case, avoid hash functions with collision vulnerability. Practically though, the chances of collision are nill, and MD5 may still work 99.99 percent of times. Does this make more sense than the original answer?
Tanmay
A: 

Even though the collision chance may be slim ('slimness' depending on your hashing algorithm), it doesn't really matter, does it?

Even if you have the same password hash for user x and user y, you are never going to compare two users' passwords to determine they are the same based on their hash! So whilst 'technically' a collision hash, it doesn't really collide/interfere/matter in this scenario.

Wim Hollebrandse
+1  A: 

Yes, it's possible, but unlikely for a decent hash.

To give you an example: SHA1 is 160-bits, meaning to find two strings with the same hash, you'd need to hash about 2^80 different strings. Even with all our best super-computers, no one has ever found two strings that hash to the same SHA1 value (though I predict that it will happen soon).

BlueRaja - Danny Pflughoeft
+3  A: 

The fact that "MyAwesomePassword1" and "@*@#ngt0n8$!!~~~09||{2`=&-[kla2%!B*q" may hash to the same value is not a security vulnerability.

Jeffrey L Whitledge
A: 

If you're worried about collision, use SHA-2 with 512 bits. Then again, SHA-1 algorithm has yet to get a demonstrated collision, so it's still secure enough to not worry about collision.

David Brunelle
A: 

If you use a SHA-512 with a decent seed it's pretty unlikely (but not impossible) that this will happen.

Bruce Scheiner has some interesting blogs on the robustness of various security techniques - sha1 broken

Another consideration is to make sure that two users with the same password they don't end up with the same hashed result in your database.

To get round this you could construct a seed that includes something specific to the user like the userid key field from the user table.

Mattl
A: 

Why does it matter that multiple strings will hash to the same password value? Every time the user enters his or her password, it will hash to the same value, and that will authenticate the user. Entering almost any other password will almost certainly not match the hashed value. If you're using a good hash function that gives a 64-bit value, the odds that another password will match the user's is about quintillions to one.

What you are guarding against is insiders copying user passwords, and revealing actual passwords if security is breached and the password table compromised. The standard technique is to add some additional bits to the password (the salt, which should be different for each account), and use a strong hashing function.

If you do that, it's very difficult to find a password that will yield a given hash, except by trying password after password until you hit the right one. Hashing provides protection against insiders and intruders in this way: having access to the password table is insufficient to log in as somebody else.

It's possible that two users will have the same password, or (although this is extremely unlikely) passwords that hash to the same value. With the salt, they won't look alike in the password table, and so no intruder or insider will know that their password is the same as somebody else's by looking at the table.

Another benefit is that many people reuse passwords, particularly for accounts that don't need to be secure. (All of my accounts with financial information have different and good passwords, but I use the same not-so-strong password on assorted forums.) With salt and hash, it's not possible to tell if my password on site A is the same as on site B, even with both password tables.

David Thornley
A: 

Hashing and encrypting are slightly different concepts, even though md5() and sha1() and similar are essentially encryption functions performing hashing. Since hashing in theory should provide unique output for a given input (usually of larger size), it essentially means that every hash should map to a single input - meaning you have on your hands a perfect compression function which can reduce data of size N usually larger than M to a data of such size M, and the process is reversible. However, in practice hash collision takes place. This means that one hash may potentially map to more than one original input. This means a single password hash may fetch you a user with a different password altogether, a real security issue.

Encryption however, guarantees to be reversible (with the proper decryption key of course) where encrypted data has a one-to-one relationship with original data. If your password length is N, your encrypted password is also size N. Passwords are just as safe and no collision.

As they say, if something is too good to be true, it probably is - if hashing could be reversible to always produce original data, then we would not need data compression algorithms - we could take data of arbitrary size, hash it to produce some 128-bit long data and expect it to yield back the original data, if needed. This is not so however. The longer your original data, the more unreliable the hash, no matter its length.

Invent an encryption/decryption key pair and encrypt your users' passwords before storing them in public locations.

amn
A: 

Collisions do occur, but they should be rare enough that it is unlikely ever to occur within a certain amount of time.

Most attacks these days consider weaknesses in the algorithms used thus making it faster to find collisions to exploit. MD5 has been shown to be weaker than previously thought. This article from The Register shows how the weakness was used to create SSL certificates.

Even more resistant algorithms have been shown to have flaws that make it easier to find these collisions.

This PDF shows a paper discussing SHA-1 collisons and how it has been made easier. (Maths heavy)

For practical purposes for someone trying to recover passwords from hashes tools such as rainbow tables are employed

http://project-rainbowcrack.com/

This takes into account that people will choose passwords that are easy, so it's possible with a dictionary attack (a list of commonly used passwords plus many other terms) to compute hashes, and compare them. This is why using a salt is always recommended

petantik