I'm working on a multi-tenant application that will be implementing service APIs. I don't want to expose the default auto-increment key, both for security reasons and because of data migration/replication concerns, so I'm looking at alternative keys. GUIDs/UUIDs are an obvious choice, but they make the URL a bit long, and while reading an article about them I saw that Google uses "truncated SHA1" for their URL IDs.

How does this work? It's my understanding that you hash part/all of the object contents to come up with the key. My objects can change over time so hashing the whole object wouldn't work since the key will need to remain the same over time. Could I implement UUIDs and hash those? What limitations/issues are there in using SHA1 for keys (e.g. max records, collision, etc.)?

I've been searching Google but haven't come up with the right search query.

/* edit: more information about environment */
Currently we are a Java shop using Spring/Hibernate with a MySQL back end. We are in the process of switching core development to Grails, which is where this idea will be implemented.

+1  A: 

That's actually a pretty solid idea, though it makes lookups by key a little harder: a hash can't be reversed, so you'd have to compute the hash for every row and keep it in the table (ideally in an indexed column) next to the real key. If you're already auto-incrementing, that's no problem. You wouldn't even need a GUID; you could just hash the auto-increment key itself. Be aware that a bare hash of a sequential number is one-way but still guessable, because anyone can hash 1, 2, 3, ... and compare the results. To prevent that, "salt" the key with a secret value before you hash it, which makes the resulting IDs unpredictable to anyone who doesn't know the salt.
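To make the salted-hash idea concrete, here is a minimal sketch in Java. The class name `HashedKey`, the salt value, and the choice to keep 12 bytes (96 bits) of the digest are all my own assumptions for illustration, not anything Google or any particular library prescribes. The URL-safe Base64 step is just one way to get a compact string.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Base64;

final class HashedKey {
    // Hypothetical secret salt; in practice load it from configuration, not source code.
    private static final byte[] SALT =
            "change-me-secret-salt".getBytes(StandardCharsets.UTF_8);

    // Assumed truncation length: 12 bytes = 96 bits, which encodes to 16 characters.
    private static final int KEEP_BYTES = 12;

    // Derives a stable public key from the internal auto-increment id.
    static String publicKeyFor(long id) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            sha1.update(SALT);
            sha1.update(Long.toString(id).getBytes(StandardCharsets.UTF_8));
            // Keep only the leading bytes of the 20-byte SHA-1 digest.
            byte[] truncated = Arrays.copyOf(sha1.digest(), KEEP_BYTES);
            return Base64.getUrlEncoder().withoutPadding().encodeToString(truncated);
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }
}
```

Because the hash only depends on the id and the salt, the public key stays the same when the rest of the object changes. You would compute it once at insert time, store it in its own column, and look rows up by that column.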

There is a concern about collisions, but a full SHA1 hash is 160 bits, which gives 2^160, or about 1.46 × 10^48, possible values. Because of the birthday paradox, a collision only becomes likely once you have on the order of 2^80 (roughly 10^24) keys, which is still far more than any real table will hold. Keep in mind that truncating the hash shrinks that margin: if you keep n bits, you want to stay well below about 2^(n/2) keys. If you've got enough keys that you're still worried about a collision, you can move up to something like SHA256 or even SHA512, which should be plenty long to avoid any reasonable concern.
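If you want to put numbers on the collision risk for a given key length, the usual birthday approximation is p ≈ 1 − exp(−n(n−1) / 2^(b+1)) for n keys and b bits. The helper below is a rough sketch of that formula; the class and method names are made up for this example, and it assumes the hash output is uniformly distributed.

```java
final class CollisionEstimate {
    // Approximate probability that at least two of `keys` random values
    // collide when each value is `bits` bits long (birthday approximation).
    static double probability(double keys, int bits) {
        double space = Math.pow(2.0, bits);
        // -expm1(-x) equals 1 - e^(-x) but stays accurate when x is tiny.
        return -Math.expm1(-keys * (keys - 1) / (2.0 * space));
    }

    public static void main(String[] args) {
        // A billion keys against a full 160-bit hash and a 64-bit truncation.
        System.out.println(probability(1e9, 160));
        System.out.println(probability(1e9, 64));
    }
}
```

With a billion keys the full 160-bit hash gives a vanishingly small probability, while a 64-bit truncation already lands at a few percent, so how far you truncate matters much more than which SHA variant you pick.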

If you need some hashing code, post the language you're using and I can find some, though there's plenty available online if you know what you're looking for.

rwmnau
+2  A: 

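I thought about a similar problem some time ago and ended up encrypting the ID with Blowfish and putting the result in the URL. It's not super safe, but it gives much shorter URLs than, for instance, SHA256, and because encryption is reversible, two different IDs can never produce the same value, so it's completely collision free.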
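For reference, something along these lines can be done with the standard `javax.crypto` classes. This is only a sketch of the approach, not my original code: the class name `BlowfishId` and the key are placeholders, and it assumes the internal ID fits in a 64-bit `long`, which is exactly one Blowfish block.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

final class BlowfishId {
    // Hypothetical 128-bit key; load a real secret from configuration instead.
    private static final SecretKeySpec KEY = new SecretKeySpec(
            "change-me-16byte".getBytes(StandardCharsets.UTF_8), "Blowfish");

    // Encrypts the 8-byte id as a single block and encodes it for use in a URL.
    static String encode(long id) {
        byte[] block = ByteBuffer.allocate(Long.BYTES).putLong(id).array();
        byte[] encrypted = crypt(Cipher.ENCRYPT_MODE, block);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(encrypted);
    }

    // Reverses encode(): decodes the token and decrypts it back to the id.
    static long decode(String token) {
        byte[] block = crypt(Cipher.DECRYPT_MODE, Base64.getUrlDecoder().decode(token));
        return ByteBuffer.wrap(block).getLong();
    }

    private static byte[] crypt(int mode, byte[] block) {
        try {
            // One block, no padding: the output is exactly 8 bytes.
            Cipher cipher = Cipher.getInstance("Blowfish/ECB/NoPadding");
            cipher.init(mode, KEY);
            return cipher.doFinal(block);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The token comes out at 11 URL-safe characters, and since `decode` recovers the original ID there is no need to store the public key in the table. The trade-off is that anyone who obtains the key can map tokens back to IDs, and the same ID always yields the same token.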

Jonas Elfström