views:

265

answers:

15

Hello

Imagine that I have a Customer table.

And in this table is an ID column, Primary Key int, Identity, all that jazz. Our Customer also has a Name.

Now, if I was to send out information in a report, or an e-mail blast, where there was a link back into my website to uniquely identify that Customer, would it be good practice to use the Primary Key ID field?

All my instincts are telling me this is awful, and that surrogate IDs should PURELY be used for the database only - and never exposed to the outside world in this way.

But I'm looking for that killer argument that I could use in the future, if (when) presented with this case.

Thoughts?

+4  A: 

A unique identifier is a unique identifier and if you already have on in the primary key for your database table, use it. Creating another surrogate key would be redundant unless you have GUIDs for example and that was too long for your output format.

--- EDIT ----

Just thinking about the mailing reference, having done this in the past I'd created a table that tracked specific mailings to build metrics around the responses. Each mailing record used a GUID as a primary key to keep it unique and that record linked back to the customer record, no details from the customer record were presented back to the viewer of the link.

If you are presenting customer details then there needs to be access control, even if you obfuscate the ID, anyone getting hold of that link will have access to that data and in most systems that would be unacceptable.

I think the decision starts with 'how secure does this data need to be?' and moves on from there.

Lazarus
A: 

Yes, it is purer / cleaner / nicer to use something like a username instead of the actual primary key for external links. However, if there are no "natural" identifier to use, then use the primary key. As long as you are using it for identifying a row, and not putting some semantic value in it, there's nothing wrong with that.

Rasmus Kaj
+1  A: 

It is ok to use a key in the form that you are using. However, in your particular case the problem exists that the url could be played with and people could see other peoples information. It depends on what is at that url as to whether or not this would be a problem for you or not. If it is then I suggest that you toss some salt in your url (a second number, a guid, the customers email hashed, etc.) that you could then match with the ID to verify that the URL is correct. Even using IDs that are not 1, 2, 3...instead using every 7th ID...is still guessable. The other side of that is that you could just has the ID if you were so inclined...or the whole URL for that matter. Your information generally has no business being consumed by the outside world unless you really want them too. In this case your instinct was right in that the data should not be consumed by humans. But that URL is part of your system so it is really just a matter of making the information useless to the people reading the url.

Andrew Siemer
I'm adding this here but it applies to many of the responses here. I'd think this is more an issue of access control for the site as a whole rather than the legibility of the 'key'. Whether it's an identity, rowguid or the textual name, they are all open to abuse and fall into the 'don't trust the incoming data' approach of loosely coupled apps, i.e. verify authenticity and permission to access the data. If you have your system wide open and rely on obfuscated references into it then you only have yourself to blame when someone strips your data bare. :)
Lazarus
+1 I couldn't agree more!
Andrew Siemer
+1  A: 

The way you describe it, it sounds as if a user could look into other users data just by modifying the primary key value in the link. If that's the case, generating a unique id (GUID for example) for external purposes would be the better approach.

longeasy
A: 

It is not a good idea to expose plain ID to the public. The way I work around it is by encrypting the query string parameter.

Ender
A: 

I'll start with this. Using your surrogate key as a link back to some database object is VERY common. In fact, many popular frameworks do exactly that out of the box (ASP MVC, Ruby on Rails, CodeIgniter).

However, you might look into using some natural key instead. You could also try Base64 encoding the ID field. Even small measures like this will help prevent someone from "exploring" your database.

Brad Gignac
I'm not sure where ASP.NET MVC uses surrogate keys to link back to a database object, can you give an example. Not being pedantic, I'm using MVC for a project at the moment and am still exploring the framework.
Lazarus
+1  A: 

Create a new field, called CustomerNumber which is unique in the system, but unrelated to the ID/primary key.

Edit: if you're sending this type of information out by e-mail, use some kind of encoding or encryption so that a random user couldn't guess another customer's information. It's really immaterial what the source of the key is, as long as it's unique. My first suggestion was just a general one for use in a system that deals with customers.

Jon Seigel
See, this is what I'd go with. But if someone questions this and says "Why? Just use the freakin' ID" - what do I come back with?
Duncan
Why would you NOT use the freakin' ID?
Joe Philllips
There was a question on here a few months back where someone had implemented a customer number using the ID, and then the business decided that the customer numbers should be character strings. You can see where I'm going with that... (I really felt sorry for that guy.)
Jon Seigel
... and that's why every field in the db should be a varchar. You know, just in case. I kid, I kid!
akmad
@d03boy - very good.@Jon Seigel - that's what my instinct is telling me!
Duncan
@Duncan You refute that with the need to protect customer information. However if it's really important it's needs to be protected by an authentication system and not just an obfuscated customerID. However if it's a sales email blast, you would kill the conversion rate if they had to click the link and login to see the page instead of just click the link
Chad
Int to varchar conversion is usually possible without loss of data ;)
Lazarus
@Lazarus: Yep. But that's the tip of the iceberg needed to change the entire system.
Jon Seigel
A: 

Certainly if its open to the public and they can change that ID which might have an effect on your data in the database then its a big no.

I dont see too much of a problem if its a simple number, which if changed manually will show another page with different information in it.

Remember, all the top shopping sites (e.g. Amazon) use ID's to identify products and you can manually change that ID to see a differnent page. (Of course i dont know if they have an internal number which you never see! ;))

kevchadders
+4  A: 

I'd suggest not adding a new field, but also don't expose the Id.

Create a table to store emails that you send out. Include a GUID as the primary key and UserId as a foreign key. In the URL you are giving them, link back to the GUID. Then if a user clicks on a URL you can updated the record based on the GUID and indicate when the user accessed the URL.

Bob
You could also create a new GUID for each mailing and be able to perform rough analytics on email sends and responses.
skirmish
+1  A: 

Consider that end-users might find it more easy to navigate your site if they were able to change urls because they contained readable strings, and not just ids. This is a more rest-y approach.

On the other hand, why not expose the id? If your application is vulnerable simply because someone reasoned out a primary key, then you have problems.

wsorenson
+2  A: 

Use A Surrogate Key because you don't want people to be able to guess(and with a incrementing PK it's not really guessing) someone else's ID and see their information. I'd recommend added a key to your customer table.

The following is example code to add a new column that generates a guid for each record.

BEGIN TRANSACTION
GO
ALTER TABLE dbo.Customers ADD
    PublicCustomerID uniqueidentifier NULL
GO
ALTER TABLE dbo.Customers ADD CONSTRAINT
    DF_Customers_PublicCustomerID DEFAULT newid() FOR PublicCustomerID
GO
ALTER TABLE dbo.Customers SET (LOCK_ESCALATION = TABLE)
GO
COMMIT

Surrogate keys also work nicely for an email blast for providing a reasonable amount of protection of non-critical customer information.However if it's really important it's needs to be protected by an authentication system and not just an obfuscated customerID. However if it's a sales email blast, you would kill the conversion rate if they had to click the link and login to see the page instead of just click the link

Chad
If your "security mechanism" relies on not being able to guess a simple value then you're not really implementing security at all
Joe Philllips
Can't you guess a surrogate key?
Joe Philllips
For trivial protection a guid will provide a reasonable amount of protection. For an email blast this should suffice -- since sensitive information shouldn't be obtainable by intercepting an email. A authentication mechanism should be used for that type of data.
Chad
A: 

You can implement a security mechanism without requiring the users to login. You should not simply rely on obscurity to secure this data.

Requiring the ID in the URL as well as some sort of password (a randomly assigned hash) could be all the security needed.

Joe Philllips
A: 

A thing to consider is to have a primary key and a natural key in the database. Primary key uniquely identifies a row in the table, while natural key uniquely identifies a customer. If you decide to keep customer history, you can add a new row with a new primary key, but same natural key for the customer -- then you mark a row "expired" or "current" in a separate column. This is frequently used in data warehousing for customer dimensions. In this case you would expose only natural keys, not primary keys. Joins still happen on primary keys.

Damir Sudarevic
A: 

The strong argument against using the primary key is it's easily hackable, well, hardly even hacking - just a matter of guessing input by changing the number a little. You're even exposing yourself unintentionally to a miss-type if the user tries to enter the link somewhere else manually - say if they print the email off to give to their boss who then types in the code for customer B instead of A.

That alone should be sufficient. At the very least I'd add a check-digit to the end of the code you send out to the users - the usual recursively add the digits algorithm is cheap and helps (a little).

Myself however I'd err towards using a generated GUID. Thanks to Microsoft and many others the concept of entering a 36 character string isn't as offensive to users now as it once was, most people will just be hitting a link anyway, and you have a simple, to hand. function that guarantees uniqueness so you're never going to trip over yourself in future. Simple and robust.

Cruachan
A: 

Exposing the PK is a security risk. Users can go on a fishing expedition, and just try out a few interesting keys, and see what they can see this way. Let's say the link you're sending out is http://www.example.org/ShowCustomer?id=4711 . By looking at this report, a user only finds out info about customer #4711. A competitor could easily try out all customer ids starting at 1: http://www.example.org/ShowCustomer?id=1 and thus have a list of all of your customers. And what about id=0? or id=-1? Or SQL injections like http://www.example.org/ShowCustomer?id=1;DROP%20TABLE%20CUSTOMERS ?

Carsten Kuckuk
if those injections are possible then you have a much bigger problem than PKs.
FreshCode