views:

942

answers:

6

Hello,
In my program, we store a user's IP address in a record. When we display a list of records to a user, we don't want to give away the other user's IP, so we SHA1 hash it. Then, when the user clicks on a record, it goes to a URL like this:

http://www.example.com/allrecordsbyipaddress.php?ipaddress=SHA1HASHOFTHEIPADDRESS

Now, I need to list all the records by the IP address specified in the SHA1 hash. I tried this:

SELECT * FROM records 
WHERE SHA1(IPADDRESS)="da39a3ee5e6b4b0d3255bfef95601890afd80709"

but this does not work. How would I do this?
Thanks, Isaac Waller

A: 

Did you compare the output of your hash algorithm with the output of MySQL's SHA1()? For example for IP address 1.2.3.4?

Andomar
+2  A: 

Every time I've had an unexpected hashing mismatch, it was because I accidentally hashed a string that included some whitespace, such as "\n".

Bill Karwin
+1 Right... I remember that
Andomar
+3  A: 

I'd store the SHA1 of the IP in the database along with the raw IP, so that the query would become

SELECT * FROM records WHERE ip_sha1 = "..."

Then I'd make sure that the SHA1 calculation happens exactly one place in code, so that there's no opportunity for it be be done slightly differently in multiple places. That also gives you the opportunity to mix a salt into the calculation, so that someone can't simply compute the SHA1 on an IP address they're interested in and pass that in by hand.

Storing the SHA1 hash the database also gives you the opportunity to add a secondary index on ip_sha1 to speed up that SELECT. If you have a very large data set, doing the SHA1 in the WHERE clauses forces the database to do a complete table scan, along with redoing a calculation for every record on every scan.

Dave W. Smith
+1: DRY (Don't Repeat Yourself) is a *key* principle, and ensuring that each key piece of data or code exists only once, i.e. in "exactly one place" as you put it, a crucial part of DRY.
Alex Martelli
I need the raw IP address myself, but don't want to share it with the public.
Isaac Waller
@Isaac Waller: You can store both the IP and its hash in the table. That would make searches faster, and troubleshooting easier.
Andomar
@Andomar That's what I meant, but I see it's not what I wrote. Thanks for the catch.
Dave W. Smith
+3  A: 

Don't know if it matters, but your SHA1 hash da39a3ee5e6b4b0d3255bfef95601890afd80709 is a well-known hash of an empty string.

Is it just an example or you forgot to provide an actual IP address to the hash calculation function?

Update:

Does your webpage code generate SHA1 hashes in lowercase?

This check will fail in MySQL:

SELECT  SHA1('') = 'DA39A3EE5E6B4B0D3255BFEF95601890AFD80709'

In this case, use this:

SELECT  SHA1('') = LOWER('DA39A3EE5E6B4B0D3255BFEF95601890AFD80709')

, which will succeed.

Also, you can precalculate the SHA1 hash when you insert the records into the table:

INSERT
INTO    ip_records (ip, ip_sha)
VALUES  (@ip, SHA1(CONCAT('my_secret_salt', @ip))

SELECT  *
FROM    ip_records
WHERE   ip_sha = @my_salted_sha1_from_webpage

This will return you the original IP and allow indexing of ip_sha, so that this query will work fast.

Quassnoi
That was just a example.
Isaac Waller
+1  A: 

Just a quick thought: that's a very simple obfuscation. There are only 232 possible IP addresses, so if somebody with technical knowledge wanted to figure it out, they could do that by calculating all 4 billion hashes, which wouldn't take very long. Depending on the sensitivity of those ip addresses, you may want to consider a private lookup table.

Autocracy
A: 

I ended up encrypting the IP addresses, and decrypting them on the other page. Then I can just use the raw IP address in the SQL query. Also, it protects against brute force attacks, like Autocracy said.

Isaac Waller