Update: Please note I am not asking what a salt is, what a rainbow table is, what a dictionary attack is, or what the purpose of a salt is. I am asking: if you know a user's salt and hash, isn't it quite easy to calculate their password?
I understand the process, and implement it myself in some of my projects.
s = random salt
storedPassword = sha1(password + s)
In the database you store:
username | hashed_password | salt
Every implementation of salting I have seen adds the salt either at the end of the password, or beginning:
hashed_Password = sha1(s + password )
hashed_Password = sha1(password + s)
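As a minimal sketch of the two combinations above, assuming Python's hashlib and a 16-byte salt from os.urandom (the function names and the "alice" row are made up for illustration):

```python
import hashlib
import os


def new_salt(n_bytes: int = 16) -> str:
    """Return a random salt as a hex string (the length is an assumption)."""
    return os.urandom(n_bytes).hex()


def hash_salt_first(password: str, salt: str) -> str:
    """sha1(s + password)"""
    return hashlib.sha1((salt + password).encode("utf-8")).hexdigest()


def hash_salt_last(password: str, salt: str) -> str:
    """sha1(password + s)"""
    return hashlib.sha1((password + salt).encode("utf-8")).hexdigest()


# One database row: username | hashed_password | salt
salt = new_salt()
row = ("alice", hash_salt_last("letmein", salt), salt)
```

Either way, anyone holding the row can recompute the same hash for any candidate password, which is what the rest of this question is about.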
Therefore, a dictionary attack from a hacker who is worth his salt (ha ha) would simply run each keyword against the stored salts in the common combinations listed above.
Surely the implementation described above simply adds another step for the hacker, without actually solving the underlying issue? What alternatives are there to step around this issue, or am I misunderstanding the problem?
The only thing I can think to do is have a secret blending algorithm that laces the salt and password together in a random pattern, or adds other user fields to the hashing process meaning the hacker would have to have access to the database AND code to lace them for a dictionary attack to prove fruitful. (Update, as pointed out in comments it's best to assume the hacker has access to all your information so this probably isn't best).
Let me give an example of how I propose a hacker would hack a user database with a list of passwords and hashes:
Data from our hacked database:
RawPassword (not stored) | Hashed | Salt
--------------------------------------------------------
letmein WEFLS... WEFOJFOFO...
Common password dictionary:
Common Password
--------------
letmein
12345
...
For each user record, loop the common passwords and hash them:
for each user in hacked_DB
    salt = users_salt
    hashed_pw = users_hashed_password
    for each common_password
        testhash = sha1(common_password + salt)
        if testhash = hashed_pw then
            //Match! User's password = common_password
            //Let's visit the webpage and login now.
        end if
    next
next
I hope this illustrates my point a lot better.
Given 10,000 common passwords, and 10,000 user records, we would need to calculate 100,000,000 hashes to discover as many user passwords as possible. It might take a few hours, but it's not really an issue.
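For anyone who wants to try the loop, here is a rough runnable sketch in Python. It assumes the sha1(password + salt) layout from above; the usernames, salts and helper names are invented for the demo:

```python
import hashlib


def sha1_hex(text: str) -> str:
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def dictionary_attack(users, common_passwords):
    """users: iterable of (username, hashed_password, salt) rows.

    Returns ({username: recovered_password}, number_of_hashes_computed).
    """
    cracked = {}
    hashes = 0
    for username, hashed_pw, salt in users:
        for candidate in common_passwords:
            hashes += 1
            if sha1_hex(candidate + salt) == hashed_pw:
                cracked[username] = candidate
                break  # stop at the first match for this user
    return cracked, hashes


# Made-up demo data: one weak password, one that is not in the dictionary.
users = [
    ("alice", sha1_hex("letmein" + "s1"), "s1"),
    ("bob", sha1_hex("correct horse" + "s2"), "s2"),
]
common = ["letmein", "12345"]
cracked, hashes = dictionary_attack(users, common)
```

In the worst case the inner loop runs len(users) * len(common_passwords) times, which is where the 10,000 x 10,000 = 100,000,000 figure comes from.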
Update on Cracking Theory
We will assume we are a corrupt webhost, that has access to a database of SHA1 hashes and salts, along with your algorithm to blend them. The database has 10,000 user records.
This site claims to be able to calculate 2,300,000,000 SHA1 hashes per second using the GPU. (In a real-world situation it will probably be slower, but for now we will use the quoted figure.)
(((95^4)/2300000000)/2)*10000 = 177 seconds
Take the full range of 95 printable ASCII characters and a maximum length of 4 characters (95^4 counts only the 4-character passwords; shorter ones add about 1% more). Divide by the rate of calculation (variable), then by 2 (on average a password is found after trying 50% of the permutations), and multiply by 10,000 users: it would take about 177 seconds to work out all users' passwords where the length is <= 4.
Let's adjust it a bit for realism.
(((36^7)/1000000000)/2)*10000 = 391,821 seconds = 4.5 days
Assuming case insensitivity, a password length <= 7 and only alphanumeric chars, it would take about 4.5 days to solve for 10,000 user records. I've also cut the hash rate to less than half of the quoted figure to reflect overhead and non-ideal circumstances.
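The arithmetic can be checked with a few lines of Python. This is only a sketch of the same formula (keyspace / rate / 2 * users); the function name is made up and the hash rates are the assumed figures from above, not measurements:

```python
def seconds_to_search(alphabet: int, length: int,
                      hashes_per_second: float, users: int = 1) -> float:
    """Average-case time: half of the alphabet**length keyspace, per user."""
    keyspace = alphabet ** length
    return keyspace / hashes_per_second / 2 * users


short = seconds_to_search(95, 4, 2_300_000_000, users=10_000)
realistic = seconds_to_search(36, 7, 1_000_000_000, users=10_000)

print(round(short), "seconds")              # 177 seconds
print(round(realistic / 86_400, 1), "days")  # 4.5 days
```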
It is important to recognise that this is a linear brute force attack: all calculations are independent of one another, therefore it's a perfect task for multiple systems to solve. (I.e. it is easy to set up 2 computers running the attack from different ends, which would halve the execution time.)
Given the case of recursively hashing a password 1,000 times to make this task more computationally expensive:
(((36^7) / 1 000 000 000) / 2) * 1000 seconds = 10.8839117 hours
This represents a maximum length of 7 alpha-numeric characters, at a less than half speed execution from quoted figure for one user.
Recursively hashing 1,000 times effectively blocks a blanket attack, but a targeted attack on a single user's data is still feasible.
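For reference, a minimal sketch of what I mean by recursively hashing, assuming the digest is simply fed back into SHA1 on each round (stretched_sha1 is a made-up name). Python's standard library also ships hashlib.pbkdf2_hmac, which does a comparable iterated job in a standardised way, so it is shown alongside:

```python
import hashlib


def stretched_sha1(password: str, salt: str, rounds: int = 1000) -> str:
    """Hash password + salt, then re-hash the digest, `rounds` times in total."""
    digest = (password + salt).encode("utf-8")
    for _ in range(rounds):
        digest = hashlib.sha1(digest).digest()
    return digest.hex()


def pbkdf2_sha1(password: str, salt: str, rounds: int = 1000) -> str:
    """Standard-library alternative: PBKDF2 with HMAC-SHA1."""
    return hashlib.pbkdf2_hmac(
        "sha1", password.encode("utf-8"), salt.encode("utf-8"), rounds
    ).hex()
```

Both cost the attacker roughly `rounds` times more work per guess, which is the factor of 1,000 in the calculation above.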