views:

397

answers:

6

I often come across web applications that expose internal database primary keys through form elements such as select boxes. Occasionally I also see JavaScript matching against an int or GUID magic value to switch the logic.

Is it a best practice to avoid leaking the internal identifiers of rows in your web application, so that outsiders cannot learn too much about your system and possibly use that knowledge to exploit it? If so, what is the best way to solve this problem?

Should you expose some other value to the web app that can be translated back to the primary key?

Thanks

Edit

In a perfect world your application would be 100% secure, so it wouldn't matter whether you obscured things. Obviously that's not the case, so should we err on the side of caution and not expose this information?

Some have pointed out that Stack Overflow is probably exposing a key in the URL, which is probably fine. However, are the considerations different for enterprise applications?

+2  A: 

Yes to all your questions. I agree with your suggestion that you should "expose some other value to the web app that can be translated back to the primary key".

You can open yourself up to potential problems otherwise.

Edit

Regarding the comment that "there is no reason to take the penalty hit for trivial keys. Look in your browser's URL right now, I bet you see a key!".

I understand what you're saying and, yes, I do see the key in the SO URL and agree it probably does refer to a database PK. I concede in instances like this it's probably OK if there's not a better alternative.

I'd still prefer to expose something other than a PK: something with semantic value. The title of the question is also in the URL, but since it would be hard to verify as unique (or to pass between users verbally), it can't be used reliably on its own.

When looking at the "tag" URLs however (i.e. http://stackoverflow.com/questions/tagged/java+j2ee), the PKs are not exposed and the tag names are used instead. Personally, I prefer that approach and would strive for that.

I also wanted to add that the data a PK points at can sometimes change over time. I've worked on a system where a table was filled with info from an offline process, i.e. monthly statistics where the DB table contents were dropped at the end of the month and repopulated with new data.

If the data is added to the db in a different order, the PKs for specific data points would change, and any saved bookmarks from a previous month to that data would now point at a different set of data.

This is one instance where exposing a PK would "break" an app unrelated to the security of the app. Not so with a generated key.
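
To illustrate the monthly reload problem, here is a minimal sketch. The region names and the `assign_pks` helper are made up for the example; it only assumes that PKs are handed out in insertion order, as an auto-increment column would do after the table is emptied and its counter reset.

```python
def assign_pks(rows):
    """Hand out sequential PKs in load order, like a freshly reset auto-increment column."""
    return {pk: row for pk, row in enumerate(rows, start=1)}


# Same data, loaded in a different order after the end-of-month reload.
january = assign_pks(["north", "south", "east"])
february = assign_pks(["south", "east", "north"])

bookmarked_pk = 1  # a user bookmarked /stats/1 in January
print(january[bookmarked_pk], "->", february[bookmarked_pk])
```

A key derived from the data itself (here, the region name) would keep pointing at the same thing across reloads.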

Vinnie
It's not an absolute, and there is no reason to take the penalty hit for trivial keys. Look in your browser's URL right now, I bet you see a key!
Chad Grant
+1  A: 

Yes, exposed keys are information that can be used in an attack, especially if they are predictable.

Use a different key/column if you think the information is sensitive.

For example, to avoid showing how many users you have, consider:

site.com/user/123 vs site.com/user/username

Chad Grant
+12  A: 

I disagree with the stance that exposing primary keys is a problem. It can be a problem if you make them visible to users, because then they are given meaning outside the system, which is usually what you're trying to avoid.

However to use IDs as the value for combo box list items? Go for it I say. What's the point in doing a translation to and from some intermediate value? You may not have a unique key to use. Such a translation introduces more potential for bugs.

Just don't neglect security.

If say you present the user with 6 items (ID 1 to 6), never assume you'll only get those values back from the user. Someone could try and breach security by sending back ID 7 so you still have to verify that what you get back is allowed.
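
A minimal sketch of that server-side check, assuming a hypothetical `ALLOWED_ITEMS` mapping from user to the item IDs that user may select (in a real application this would come from your own permission logic or a database query):

```python
# Hypothetical data: which item IDs each user is allowed to select.
ALLOWED_ITEMS = {
    "alice": {1, 2, 3, 4, 5, 6},
    "bob": {2, 4},
}


def parse_selected_item(user, raw_value):
    """Return the submitted ID only if this user is allowed to pick it."""
    try:
        item_id = int(raw_value)
    except ValueError:
        # Not even a number: treat it the same as a forbidden value.
        raise PermissionError("malformed item id") from None
    if item_id not in ALLOWED_ITEMS.get(user, set()):
        raise PermissionError(f"item {item_id} is not permitted for {user}")
    return item_id
```

The point is that the check runs on what comes back from the browser, not on what was rendered in the select box.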

But avoiding that entirely? No way. No need.

As a comment on another answer says, look at the URL here. That includes what no doubt is the primary key for the question in the SO database. It's entirely fine to expose keys for technical uses.

Also, if you do use some surrogate value instead, that's not necessarily more secure.

cletus
+1 I agree. Using some arbitrary values to avoid showing the ID of something is security-through-obfuscation at best (read: not great). Your security should be strong enough to reject users faking values.
nickf
A: 

As always, "it depends". If someone could gain value (say) by knowing how many transactions you're performing per (hour/day/month), and you're exposing a transaction ID as a monotonically increasing number, then that's a risk.
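
As a rough sketch of how little an outsider needs: two sequential IDs observed at known times are enough for an estimate. The function name and the numbers in the usage note are illustrative, and it assumes IDs increase by one per transaction with no gaps.

```python
def estimated_rate_per_hour(id_a, seen_at_a, id_b, seen_at_b):
    """Estimate records created per hour from two observed sequential IDs.

    seen_at_a and seen_at_b are timestamps in seconds.
    """
    elapsed_hours = (seen_at_b - seen_at_a) / 3600
    if elapsed_hours <= 0:
        raise ValueError("second observation must be later than the first")
    return (id_b - id_a) / elapsed_hours
```

For example, seeing ID 10000 and then ID 10480 exactly one day later suggests about 20 transactions per hour.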

As others have said though, for a drop-down list of values, usually no problem.

Damien_The_Unbeliever
A: 

Exposing values other than the primary key will not spare you the burden of checking your security. Indeed, if your security has holes, "evil" users might still access objects they are not meant to see by changing the value in the URL. They might have fewer clues about which values to use, but by picking values at random they might get lucky.

If you want to improve security this way, you will have to use long random strings in the URL (to make guessing difficult) instead of the ID, and use an indirection table in the background that maps each random value to the real ID.
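
A minimal sketch of such an indirection table, using in-memory dicts as a stand-in for a real database table. The function names are hypothetical; `secrets.token_urlsafe` is from the Python standard library.

```python
import secrets

# Stand-ins for an indirection table with columns (token, internal_id).
_token_to_id = {}
_id_to_token = {}


def public_token_for(internal_id):
    """Return the random public token for a row, creating it on first use."""
    token = _id_to_token.get(internal_id)
    if token is None:
        token = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe text
        _id_to_token[internal_id] = token
        _token_to_id[token] = internal_id
    return token


def internal_id_for(token):
    """Translate a token from the URL back to the internal ID, or None if unknown."""
    return _token_to_id.get(token)
```

Even with this in place, the lookup result still has to go through the normal permission check.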

I think it is not worth the hassle most of the time.

That said, it is useful for cases where you want "security by obscurity", for example when you expose pages for changing a user's password that require no login (for users who have lost their password). In that kind of case, you should also give the key a limited validity period.
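
A sketch of a reset key with an expiry, under a few assumptions of mine: a one-hour validity window, an in-memory dict instead of a database table, and single-use tokens (the last is an extra precaution, not something stated above). All names are hypothetical.

```python
import secrets
import time

RESET_TTL_SECONDS = 3600  # assumed validity window: one hour

# Stand-in for a table with columns (token, user_id, expires_at).
_reset_tokens = {}


def issue_reset_token(user_id, now=None):
    """Create a random reset token for user_id that expires after the TTL."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    _reset_tokens[token] = (user_id, now + RESET_TTL_SECONDS)
    return token


def redeem_reset_token(token, now=None):
    """Return the user id for a valid token, or None. A token works only once."""
    now = time.time() if now is None else now
    entry = _reset_tokens.pop(token, None)  # removed whether valid or expired
    if entry is None:
        return None
    user_id, expires_at = entry
    if now > expires_at:
        return None
    return user_id
```

The `now` parameter is only there so the expiry logic can be exercised without waiting an hour.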

madewulf
A: 

A primary key is not the same thing as a surrogate key or an internal key, although there is some overlap. The opposite of an internal key is a natural key. There are many times when a natural key is used as a primary key. There is, in general, no reason to hide a natural key, unless we are dealing with privacy issues.

Your real question is, I think, about whether internal keys should be exposed to the users. The answer is, "it depends". For most user communities, exposing the internal key will result in the key becoming used as a natural key. This can, and usually does, result in some confusion when the one-to-one mapping between the internal key and the subject of the table row breaks down.

This breakdown can only occur as a result of data mismanagement. Most of the time, however, you have to plan on some data mismanagement occurring in the real world. You wouldn't plan a water supply system that breaks down whenever there's water mismanagement. You don't plan an information system that breaks down each time there's data mismanagement.

Having said that, most database designers these days work for software vendors, and don't see the problems caused by data mismanagement by the users of their products.

Walter Mitty