views:

290

answers:

9

I have read that using database keys in a URL is a bad thing to do.

For instance,

My table has 3 fields: ID:int, Title:nvarchar(5), Description:Text

I want to create a page that displays a record. Something like ... http://server/viewitem.aspx?id=1234

1) First off, could someone elaborate on why this is a bad thing to do?

2) and secondly, what are some ways to work around using primary keys in a url?

A: 

It's a bit pedantic at times, but you want to use a unique business identifier for things rather than the surrogate key.

It can be as simple as ItemNumber instead of Id.

The Id is a db concern, not a business/user concern.

aaronjensen
+4  A: 

I think it's perfectly reasonable to use primary keys in the URL.

Some considerations, however:

1) Avoid SQL injection attacks. If you just blindly accept the value of the id URL parameter and pass it into the DB, you are at risk. Make sure you sanitise the input so that it matches whatever format of key you have (e.g. strip any non-numeric characters).

2) SEO. It helps if your URL contains some context about the item (e.g. "big fluffy rabbit" rather than 1234). This helps search engines see that your page is relevant. It can also be useful for your users (I can tell from my browser history which record is which without having to remember a number).

mopoke
2) is not really true. It's an urban legend. I have a few websites and I tried both with and without and the pages get the exactly same PR whether they are titled 1.php or eat_the_dinner.php. No effect whatsoever. The thing that matters is the title, first H1 tag on the page and links from others.
Milan Babuškov
This reply, while accepted, hasn't covered the security aspects. The obvious one being; you must ensure that whoever is viewing this has access to view it; but slightly less-done, is that you should be mindful that it will be obvious to people you are using sequential numbering. This may be bad, depending on what do you, and how much you display.
Noon Silk
+2  A: 

It's not inherently a bad thing to do, but it has some caveats.

Caveat one is that someone can type in different keys and maybe pull up data you didn't want / expect them to get at. You can reduce the chance that this is successful by increasing your key space (for example making ids random 64 bit numbers).

Caveat two is that if you're running a public service and you have competitors they may be able to extract business information from your keys if they are monotonic. Example: create a post today, create a post in a week, compare Ids and you have extracted the rate at which posts are being made.

Caveat three is that it's prone to SQL injection attacks. But you'd never make those mistakes, right?

Don Neufeld
Security through obscurity does not work. If you are worried that a user might gain access to sensitive data just by visiting a certain url, you need to add actual protection to your data - require authorization, and verify that the authorized user has privileges to view the given resource.
Aeon
A: 
  1. Using integer primary keys in a URL is a security risk. It is quite easy for someone to post using any number. For example, through normal web application use, the user creates a user record with an ID of 45 (viewitem/id/45). This means the user automatically knows there are 44 other users. And unless you have a correct authorization system in place they can see the other user's information by created their own url (viewitem/id/32).

2a. Use proper authorization. 2b. Use GUIDs for primary keys.

David Medinets
A: 

showing the key itself isn't inherently bad because it holds no real meaning, but showing the means to obtain access to an item is bad.

for instance say you had an online store that sold stuff from 2 merchants. Merchant A had items (1, 3, 5, 7) and Merchant B has items (2, 4, 5, 8).

If I am shopping on Merchant A's site and see: http://server/viewitem.aspx?id=1

I could then try to fiddle with it and type: http://server/viewitem.aspx?id=2

That might let me access an item that I shouldn't be accessing since I am shopping with Merchant A and not B. In general allowing users to fiddle with stuff like that can lead to security problems. Another brief example is employees that can look at their personal information (id=382) but they type in someone else id to go directly to someone else profile.

Now, having said that.. this is not bad as long as security checks are built into the system that check to make sure people are doing what they are supposed to (ex: not shopping with another merchant or not viewing another employee).

One mechanism is to store information in sessions, but some do not like that. I am not a web programmer so I will not go into that :)

The main thing is to make sure the system is secure. Never trust data that came back from the user.

Arthur Thomas
+1  A: 

Using IDs in the URL is not necessarily bad. This site uses it, despite being done by professionals.

How can they be dangerous? When users are allowed to update or delete entries belonging to them, developers implement some sort of authentication, but they often forget to check if the entry really belongs to you. A malicious user could form a URL like "/questions/12345/delete" when he notices that "12345" belongs to you, and it would be deleted.

Programmers should ensure that a database entry with an arbitrary ID really belongs to the current logged-in user before performing such operation.

Sometimes there are strong reasons to avoid exposing IDs in the URL. In such cases, developers often generate random hashes that they store for each entry and use those in the URL. A malicious person tampering in the URL bar would have a hard time guessing a hash that would belong to some other user.

mislav
Deletion should never be done via a GET anyway, so this point is irrelevant. Correct use of HTTP protocols would suggest that a post is used for operations that modify the database.Also, the code that handles the request should verify that the person issuing the rquest has teh authority to carry out the delete.
belugabob
+1  A: 

Security and privacy are the main reasons to avoid doing this. Any information that gives away your data structure is more information that a hacker can use to access your database. As mopoke says, you also expose yourself to SQL injection attacks which are fairly common and can be extremely harmful to your database and application. From a privacy standpoint, if you are displaying any information that is sensitive or personal, anybody can just substitute a number to retrieve information and if you have no mechanism for authentication, you could be putting your information at risk. Also, if it's that easy to query your database, you open yourself up to Denial of Service attacks with someone just looping through URL's against your server since they know each one will get a response.

Regardless of the nature of the data, I tend to recommend against sharing anything in the URL that could give away anything about your application's architecture, it seems to me you are just inviting trouble (I feel the same way about hidden fields which aren't really hidden).

To get around it, we usaully encrypt the parameters before passing them. In some cases, the encyrpted URL also includes some form of verification/authentication mechanism so the server can decide if it's ok to process.

Of course every application is different and the level of security you want to implement has to be balanced with functionality, budget, performance, etc. But I don't see anything wrong with being paranoid when it comes to data security.

John Virgolino
A: 

Everybody seems to be posting the "problems" with using this technique, but I haven't seen any solutions. What are the alternatives. There has to be something in the URL that uniquely defines what you want to display to the user. The only other solution I can think of would be to run your entire site off forms, and have the browser post the value to the server. This is a little trickier to code, as all links need to be form submits. Also, it's only minimally harder for users of the site to put in whatever value they wish. Also this wouldn't allow the user to bookmark anything, which is a major disadvantage.

@John Virgolino mentioned encrypting the entire query string, which could help with this process. However it seems like going a little too far for most applications.

Kibbee
A: 

Hi,

I've been reading about this, looking for a solution, but as @Kibbee says there is no real consensus.

I can think of a few possible solutions:

1) If your table uses integer keys (likely), add a check-sum digit to the identifier. That way, (simple) injection attacks will usually fail. On receiving the request, simply remove the check-sum digit and check that it still matches - if they don't then you know the URL has been tampered with. This method also hides your "rate of growth" (somewhat).

2) When storing the DB record initially, save a "secondary key" or value that you are happy to be a public id. This has to be unique and usually not sequential - examples are a UUID/Guid or a hash (MD5) of the integer ID e.g. http://server/item.aspx?id=AbD3sTGgxkjero (but be careful of characters that are not compatible with http). Nb. the secondary field will need to be indexed, and you will lose benefits of clustering that you get in 1).