views:

96

answers:

3

Possible Duplicates:
Surrogate Vs. Natural/Business Keys
When not to use surrogate primary keys?

So, I would prefer having an INT as a primary key with AI. But, here is the reason why I am considering making a unique string as a primary key: I don't have to query related tables to fetch the primary key since I will already know it.

For example:
I have a many to many relation:

Customer - Order - Product

Let's say I want to add a new customer and a new order, and I already know what they bought. I have to do a query on product table to get the INT, but if I have the string (unique) that is a primary key I don't have to do the query (this seems cleaner to me, I am not talking about optimization/run-time-speeds, not at all).

A: 

If you are not worried about optimization, the main 2 criteria for primary key are:

  • Uniqueness

  • Constanteness (never changes)

E.g. if your problem domain is - and will always be - such that 2 product names are always distinct, AND that no product will ever change its name (think Norton Antivirus -> Symantec Antivirus for a simple example of name change), then you may use the product name as the unique key.

The two MUST be 100% true not only today, but for any foreseeable future lifetime of the database.

Therefore using numeric ID is highly recommended as you may not always be able to forecee such things - and changing the DB structure later on to have a product ID is of course orders of magnitude worse than a minor inconvenience of needing to map and ID from the name in your queries.

DVK
Check and check. I am wondering though if anyone has done this and wish they hadn't.
Zombies
I have, and I have regretted it once my functional spec changed so that the key needed to mutate. Epic FAIL :)
DVK
+1  A: 

If you can guarantee that your VARCHAR field is indeed unique and hopefully stable (not changing), then you can definitely use it as a primary key without any conceptual problems.

The only real reasons against using it as your primary key (or even more importantly: your clustering key in SQL Server) are indeed performance-based. A wider and varying size clustering key is suboptimal in many ways, and affects not just your table and its clustering index, but also all non-clustered indices on that table. But if that's none of your concern, again - you'll be fine with a VARCHAR as your primary key.

marc_s
Well, let me ask this... going back to the many to many example. Would it be considered bad since I am duplicating data/actual content? I suppose because it doesn't change then it isn't wrong... It just feels wrong.
Zombies
Yes, obviously, if you use natural keys (e.g. "real" data) for your keys, you'll have a certain duplication when you use those keys in many-to-many tables. The same applies to surrogate INT keys - it just might be less obvious and less "bothering".
marc_s
A: 

Er... both are correct in a way

  • Your logical model and design will use the unique string. This is the natural key.
  • The actual implementation may use a numeric auto number column (surrogate key) because of architecture/performance
gbn
Plausible, but unlikely. The physical model is derived from the Logical one - IME, the only columns not displayed in the Logical model are foreign keys. Switching from natural to artificial key is not a trivial model change.
OMG Ponies
@OMG Ponies: If you use Object Role Modeling, then the design will never use a surrogate key. When you come to implement it, you will add a surrogate key. I think either you misunderstand me or I've used the wrong terms.
gbn
IME (Oracle Designer, SQL Server SSMS doesn't do Logical models), you do define the the pk, be it natural or artificial, in the logical model. What are you using that does Logical models for SQL Server?
OMG Ponies
@OMG Ponies: I'd use something like NORMA: http://www.orm.net/ http://sourceforge.net/projects/orm/
gbn
Took a look, thx - it's similar to the Logical ERDs in Oracle, but the key difference is that attributes aren't defined in ORM. Oracle Designer logical models include both attributes and their data types.
OMG Ponies