views:

208

answers:

8

I'm building a web application where the front end is a highly-specialized search engine. Searching is handled at the main URL, and the user is passed off to a sub-directory when they click on a search result for a more detailed display. This hand-off is being done as a GET request with the primary key being passed in the query string. I seem to recall reading somewhere that exposing primary keys to the user was not a good idea, so I decided to implement reversible encryption.

I'm starting to wonder if I'm just being paranoid. The reversible encryption (base64) is probably easily broken by anybody who cares to try, makes the URLs very ugly, and also longer than they otherwise would be. Should I just drop the encryption and send my primary keys in the clear?

+1  A: 

Just send the primary keys. As long as your database operations are sealed off from the user interface, this is no problem.

Tor Valamo
+1  A: 

For your purposes (building a search engine) the security tradeoffs benefits of encrypting database primary keys is negligible. Base64 encoding isn't encryption - it's security through obscurity and won't even be a speedbump to an attacker.

pygorex1
+11  A: 

What you're doing is basically obfuscation. A reversible encrypted (and base64 doesn't really count as encryption) primary key is still a primary key.

What you were reading comes down to this: you generally don't want to have your primary keys have any kind of meaning outside the system. This is called a technical primary key rather than a natural primary key. That's why you might use an auto number field for Patient ID rather than SSN (which is called a natural primary key).

Technical primary keys are generally favoured over natural primary keys because things that seem constant do change and this can cause problems. Even countries can come into existence and cease to exist.

If you do have technical primary keys you don't want to make them de facto natural primary keys by giving them meaning they didn't otherwise have. I think it's fine to put a primary key in a URL but security is a separate topic. If someone can change that URL and get access to something they shouldn't have access to then it's a security problem and needs to be handled by authentication and authorization.

Some will argue they should never be seen by users. I don't think you need to go that far.

cletus
+1 I would upvote this more if I could. Hiding primary keys is about discouraging users from trying to extract meaning from them, not security. Technical keys are often called surrogate keys: http://en.wikipedia.org/wiki/Surrogate_key
Andy West
+1  A: 

If you're obscuring the primary keys for a security reason, don't do it. That's called security by obscurity and there is a better way. Having said that, there is at least one valid reason to obscure primary keys and that's to prevent someone from scraping all your content by simply examining a querystring in a URL and determining that they can simply increment an id value and pull down every record. A determined scraper may still be able to discover your means of obsuring and do this despite your best efforts, but at least you haven't made it easy.

Asaph
+1  A: 

If you're worried about someone altering the URL to try and look at other values, then perhaps you need to look at token generation.

For instance, instead of giving the user a 'SearchID' value, you give them a SearchToken, which is some long unique psuedo-random value (Read: GUID), which you then map to the SearchID internally.

Of course, you'll also need to apply session security and soforth still - because even a unique URL with a non-sequential ID isn't protected against sniffing by anything between your server and the user.

Will Hughes
+1  A: 

If you're trying to secure database query input just use parametrized queries. There's no reason at all to hide primary keys if they are manipulated by the public.

When you see base64 in the URL, you are pretty much guaranteed the developers of that site don't know what they are doing and the site is vulnerable.

Longpoke
+1  A: 

On the dangers of exposing your primary key, you'll want to read "autoincrement considered harmful", By Joshua Schachter.

URLs that include an identifier will let you down for three reasons.

The first is that given the URL for some object, you can figure out the URLs for objects that were created around it. This exposes the number of objects in your database to possible competitors or other people you might not want having this information (as famously demonstrated by the Allies guessing German tank production levels by looking at the serial numbers.)

Secondly, at some point some jerk will get the idea to write a shell script with a for-loop and try to fetch every single object from your system; this is definitely no fun.

Finally, in the case of users, it allows people to derive some sort of social hierarchy. Witness the frequent hijacking and/or hacking of high-prestige low-digit ICQ ids.

John Wiseman
A: 

URLs that include an identifier will let you down for three reasons.

Wrong, wrong, wrong.

First - every request has to be validated, regardless of it coming in the form of a HTTP GET with an id, or a POST, or a web service call.

Second - a properly made web-site needs protection against bots which relies on IP address tracking and request frequency analysis; hiding ids might stop some people from writing a shell script to get a sequence of objects, but there are other ways to exploit a web site by using a bruteforce attack of some sort.

Third - ICQ ids are valuable but only because they're related to users and are a user's primary means of identification; it's a one-of-a-kind approach to user authentication, not used by any other service, program or web-site.

So, to conclude.. Yes, you need to worry about scrapers and DDOS attacks and data protection and a whole bunch of other stuff, but hiding ids will not properly solve any of those problems.

Mr. TA