ansaurus

Question

SQL SHA1 inside WHERE

Answer 1

A:

Did you compare the output of your hash algorithm with the output of MySQL's SHA1()? For example, for the IP address 1.2.3.4?

Andomar 2009-05-23 18:38:29

Answer 2

+2 A:

Every time I've had an unexpected hashing mismatch, it was because I accidentally hashed a string that included some whitespace, such as "\n".
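A minimal sketch of that failure mode, assuming the hash is computed in application code (Python's hashlib here; the helper name sha1_hex is made up for illustration):

```python
import hashlib

def sha1_hex(text: str) -> str:
    """Return the lowercase hex SHA-1 of text, encoded as UTF-8."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

clean = "1.2.3.4"
with_newline = "1.2.3.4\n"  # e.g. a value read from a file or a form field

# The trailing newline is part of the input, so the digests differ.
print(sha1_hex(clean) == sha1_hex(with_newline))          # False
# Stripping surrounding whitespace before hashing makes them agree.
print(sha1_hex(clean) == sha1_hex(with_newline.strip()))  # True
```

Stripping the input in one shared helper, rather than at each call site, keeps the two sides from drifting apart again.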

Bill Karwin 2009-05-23 18:39:24

+1 Right... I remember that

Andomar 2009-05-23 18:40:38

Answer 3

+3 A:

I'd store the SHA1 of the IP in the database along with the raw IP, so that the query would become

SELECT * FROM records WHERE ip_sha1 = "..."

Then I'd make sure that the SHA1 calculation happens in exactly one place in the code, so that there's no opportunity for it to be done slightly differently in multiple places. That also gives you the opportunity to mix a salt into the calculation, so that someone can't simply compute the SHA1 of an IP address they're interested in and pass that in by hand.

Storing the SHA1 hash in the database also lets you add a secondary index on ip_sha1 to speed up that SELECT. If you have a very large data set, doing the SHA1 in the WHERE clause forces the database to do a complete table scan and to redo the calculation for every record on every scan.
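Roughly what that could look like, as a sketch only. SQLite (in memory) stands in for MySQL so the example runs on its own. The salt value, the ip column, the index name, and the helpers ip_sha1, insert_ip and find_by_hash are all assumptions for illustration.

```python
import hashlib
import sqlite3

SALT = "my_secret_salt"  # hypothetical value; keep the real one out of source control

def ip_sha1(ip: str) -> str:
    """The single place where the salted hash of an IP is computed."""
    return hashlib.sha1((SALT + ip).encode("utf-8")).hexdigest()

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.execute("CREATE TABLE records (ip TEXT NOT NULL, ip_sha1 TEXT NOT NULL)")
# Secondary index so the lookup does not have to scan the whole table.
conn.execute("CREATE INDEX idx_records_ip_sha1 ON records (ip_sha1)")

def insert_ip(ip: str) -> None:
    """Store the raw IP together with its precomputed hash."""
    conn.execute(
        "INSERT INTO records (ip, ip_sha1) VALUES (?, ?)", (ip, ip_sha1(ip))
    )

def find_by_hash(digest: str):
    """Look up rows by the stored hash; no hashing happens in the WHERE clause."""
    return conn.execute(
        "SELECT ip FROM records WHERE ip_sha1 = ?", (digest,)
    ).fetchall()

insert_ip("1.2.3.4")
insert_ip("5.6.7.8")
print(find_by_hash(ip_sha1("1.2.3.4")))  # [('1.2.3.4',)]
```

Both the insert path and the lookup path go through ip_sha1(), so a change to the salt or the encoding only has to be made once.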

Dave W. Smith 2009-05-23 18:52:38

+1: DRY (Don't Repeat Yourself) is a *key* principle, and ensuring that each key piece of data or code exists only once, i.e. in "exactly one place" as you put it, is a crucial part of DRY.

Alex Martelli 2009-05-23 18:55:32

I need the raw IP address myself, but don't want to share it with the public.

Isaac Waller 2009-05-23 19:54:06

@Isaac Waller: You can store both the IP and its hash in the table. That would make searches faster, and troubleshooting easier.

Andomar 2009-05-23 20:00:59

@Andomar That's what I meant, but I see it's not what I wrote. Thanks for the catch.

Dave W. Smith 2009-05-23 20:07:36

Answer 4

+3 A:

Don't know if it matters, but your SHA1 hash da39a3ee5e6b4b0d3255bfef95601890afd80709 is a well-known hash of an empty string.

Is it just an example, or did you forget to pass an actual IP address to the hash calculation function?
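One way to catch this early, sketched in Python on the assumption that the hash is computed in application code (hash_ip is a hypothetical helper, not something from the question):

```python
import hashlib

EMPTY_SHA1 = "da39a3ee5e6b4b0d3255bfef95601890afd80709"  # SHA-1 of zero bytes

def hash_ip(ip: str) -> str:
    """Hash an IP string, refusing empty input instead of hashing it silently."""
    ip = ip.strip()
    if not ip:
        raise ValueError("no IP address was passed to the hash function")
    return hashlib.sha1(ip.encode("utf-8")).hexdigest()

# Hashing nothing at all yields the well-known digest quoted above.
print(hashlib.sha1(b"").hexdigest() == EMPTY_SHA1)  # True
```

Seeing that digest in a log or in a table is a strong hint that the variable holding the IP was empty when it was hashed.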

Update:

Does your webpage code generate SHA1 hashes in lowercase?

This check will fail in MySQL if the comparison is case-sensitive (for example, when SHA1() returns a binary string):

SELECT  SHA1('') = 'DA39A3EE5E6B4B0D3255BFEF95601890AFD80709'

In this case, use this:

SELECT  SHA1('') = LOWER('DA39A3EE5E6B4B0D3255BFEF95601890AFD80709')

, which will succeed.
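The same normalization can be done on the application side before the value ever reaches the query. A small sketch, assuming Python on the web page side (same_digest is a made-up helper name):

```python
import hashlib

def same_digest(a: str, b: str) -> bool:
    """Compare two hex digests without regard to letter case."""
    return a.strip().lower() == b.strip().lower()

lower = hashlib.sha1(b"").hexdigest()  # hashlib emits lowercase hex
upper = lower.upper()                  # what some other library might produce

print(lower == upper)             # False
print(same_digest(lower, upper))  # True
```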

Also, you can precalculate the SHA1 hash when you insert the records into the table:

INSERT
INTO    ip_records (ip, ip_sha)
VALUES  (@ip, SHA1(CONCAT('my_secret_salt', @ip)));

SELECT  *
FROM    ip_records
WHERE   ip_sha = @my_salted_sha1_from_webpage;

This returns the original IP and allows indexing of ip_sha, so the query runs fast.

Quassnoi 2009-05-23 19:12:26

That was just an example.

Isaac Waller 2009-05-23 19:50:52

Answer 5

+1 A:

Just a quick thought: that's a very simple obfuscation. There are only 2³² possible IPv4 addresses, so if somebody with technical knowledge wanted to figure it out, they could do that by calculating all 4 billion hashes, which wouldn't take very long. Depending on the sensitivity of those IP addresses, you may want to consider a private lookup table.
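To make that concrete, here is a scaled-down sketch in Python. It searches a single /16 (65,536 addresses) rather than the whole address space, and it assumes the published value is an unsalted SHA1 of the dotted-quad string. The function names and the sample address are made up for illustration.

```python
import hashlib
import ipaddress

def sha1_hex(text: str) -> str:
    """Lowercase hex SHA-1 of an ASCII string."""
    return hashlib.sha1(text.encode("ascii")).hexdigest()

def recover_ip(target_digest: str, network: str):
    """Try every address in network; return the one whose hash matches, else None."""
    for addr in ipaddress.ip_network(network):
        if sha1_hex(str(addr)) == target_digest:
            return str(addr)
    return None

leaked = sha1_hex("10.0.37.200")          # pretend this digest was published
print(recover_ip(leaked, "10.0.0.0/16"))  # 10.0.37.200
```

The full search is the same loop over a larger range, which is why an unsalted hash of an IP hides very little. A secret salt forces the attacker to learn the salt first.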

Autocracy 2009-05-23 19:42:07

Answer 6

A:

I ended up encrypting the IP addresses, and decrypting them on the other page. Then I can just use the raw IP address in the SQL query. Also, it protects against brute force attacks, like Autocracy said.

Isaac Waller 2009-05-23 19:56:46
