I have a table with 16 columns. It will be the most frequently used table in a web application and will contain a few hundred thousand rows. The database runs on SQL Server 2008.

My question is about the choice of primary key. Which is quicker? I can use a composite primary key made of two bigints, or I can use a single varchar value, but then I will need to concatenate it afterwards.

+1  A: 

Why not just a single auto-generated INT primary key? INT is a signed 32-bit type, so an IDENTITY column starting at 1 can handle over 2 billion records (about 4 billion if you also use the negative range).

CREATE TABLE Records (
   recordId INT IDENTITY(1,1) NOT NULL PRIMARY KEY,  -- IDENTITY auto-generates the value
   ...
);
Kaleb Brasee
+2  A: 

A primary key that does not rely on any underlying values (called a surrogate key) is a good choice. That way, if the row's data changes, the ID doesn't have to, and any tables referring to it through foreign keys will not need to change. I would choose an autonumber (i.e. IDENTITY) column for the primary key.

In terms of performance, a shorter, integer-based primary key is best.

You can still create your clustered index on multiple columns.
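
As a rough sketch of that arrangement, assuming hypothetical table and column names (`Records`, `partA`, `partB`) that are not in the question: the surrogate IDENTITY column is the primary key but is declared NONCLUSTERED, and the clustered index goes on the two bigint columns.

```sql
-- Hypothetical names; the question does not show the real schema.
CREATE TABLE dbo.Records (
    recordId INT IDENTITY(1,1) NOT NULL,
    partA    BIGINT NOT NULL,
    partB    BIGINT NOT NULL,
    -- Surrogate primary key, kept out of the clustered index
    CONSTRAINT PK_Records PRIMARY KEY NONCLUSTERED (recordId)
);

-- The clustered index is separate from the primary key and can span several columns.
-- UNIQUE is an assumption here: it only fits if (partA, partB) identifies a row.
CREATE UNIQUE CLUSTERED INDEX CIX_Records_partA_partB
    ON dbo.Records (partA, partB);
```

Whether this beats simply clustering on `recordId` depends on which columns the most frequent queries filter on.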

Mitch Wheat
A: 

The decision depends on how the table is used. If you are mostly writing data and rarely retrieving it, a simple key is enough. If you are mostly querying data that is largely static and whose key values will not change, your index strategy should be optimized for the most frequent query. Personally, I like the idea of using GUIDs for the primary key and an int for the clustered index. That allows for easy data imports. But it really depends on your needs.
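
One possible shape for the GUID-plus-int layout described above, as a sketch only. The table `dbo.Items` and its column names are hypothetical, and using NEWID() as the default is an assumption.

```sql
-- Hypothetical table: GUID as the primary key, int as the clustered index.
CREATE TABLE dbo.Items (
    itemGuid UNIQUEIDENTIFIER NOT NULL
        CONSTRAINT DF_Items_itemGuid DEFAULT NEWID(),  -- assumed default
    itemSeq  INT IDENTITY(1,1) NOT NULL,
    CONSTRAINT PK_Items PRIMARY KEY NONCLUSTERED (itemGuid)
);

-- Rows are physically ordered by the narrow, ever-increasing int,
-- while other tables and imports refer to rows by the GUID.
CREATE UNIQUE CLUSTERED INDEX CIX_Items_itemSeq
    ON dbo.Items (itemSeq);
```

Because the GUID is generated independently of any one database, rows imported from another system can keep their keys without colliding.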

Josh
A: 

What do you mean by quicker? If you need to search faster, you can create an index on any column or set up full-text search. The primary key just makes sure you do not have duplicate records.

Henry Gao
Actually, a primary key is more a reflection of your domain model and its relationships....
Mitch Wheat
+5  A: 

There are many more factors you must consider:

  • prevalent data access pattern: how are you going to access the table?
  • how many non-clustered indexes?
  • frequency of updates
  • pattern of updates (sequential inserts, random)
  • pattern of deletes

All these factors, and especially the first two, should drive your choice of the clustered key. Note that the primary key and the clustered key are different concepts that are often confused. Read my answer at http://stackoverflow.com/questions/1301165/should-i-design-a-table-with-a-primary-key-of-varchar-or-int/1301536#1301536 for a lengthier discussion of the criteria that drive the choice of a clustered key.

Without any information on your access patterns I can give an answer that is very brief, concise, and actually correct: the narrower key is always quicker (for reasons of IO). However, this response bears absolutely no value. The only thing that will make your application faster is choosing a key that will actually be used by the query execution plans.

Remus Rusanu
Thanks. If I had found and read that discussion earlier, I wouldn't have asked the question :)
Siblja
+1  A: 

A surrogate key might be a fine idea if there are foreign key relationships on this table. Using a surrogate will save tables that refer to it from having to duplicate all those columns in their tables.
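
A small sketch of that saving, using hypothetical `dbo.Parent` and `dbo.Child` tables: the child carries one INT column instead of copies of both natural key columns.

```sql
-- Hypothetical schema for illustration only.
CREATE TABLE dbo.Parent (
    parentId INT IDENTITY(1,1) NOT NULL PRIMARY KEY,  -- surrogate key
    keyA     BIGINT NOT NULL,
    keyB     BIGINT NOT NULL,
    -- The natural key is still enforced, just not used for references
    CONSTRAINT UQ_Parent_keyA_keyB UNIQUE (keyA, keyB)
);

CREATE TABLE dbo.Child (
    childId  INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    -- One narrow column instead of duplicating keyA and keyB here
    parentId INT NOT NULL
        CONSTRAINT FK_Child_Parent REFERENCES dbo.Parent (parentId)
);
```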

Another important consideration is indexing the columns that you'll be using in WHERE clauses. Your performance will suffer if you don't. Make sure that you add appropriate indexes, over and above the primary key, to avoid table scans.
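
For instance, assuming a hypothetical `dbo.Visits` table that is usually filtered by `userId`, an index along these lines lets the query seek instead of scanning the whole table:

```sql
-- Hypothetical table and columns; adapt to the real schema.
CREATE TABLE dbo.Visits (
    visitId   INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    userId    BIGINT NOT NULL,
    visitedAt DATETIME NOT NULL
);

-- Index the column used in the WHERE clause; INCLUDE carries visitedAt
-- so the query below can be answered from the index alone.
CREATE NONCLUSTERED INDEX IX_Visits_userId
    ON dbo.Visits (userId)
    INCLUDE (visitedAt);

-- A typical lookup that can use IX_Visits_userId.
SELECT visitId, visitedAt
FROM dbo.Visits
WHERE userId = 42;
```

Checking the actual execution plan is the only way to confirm the optimizer picks the index for a given query.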

duffymo
A: 

Lots of variables you haven’t mentioned: whether the data in the two columns is “natural” and there is a benefit in identifying records by a logical ID, whether disclosure of the key via a UI poses a risk, and how important performance is (a few hundred thousand rows is pretty minimal).

If you’re not too fussy, go the auto number path for speed and simplicity. Also take a look at all the posts on the site about SQL primary key types. Heaps of info here.

Troy Hunt
A: 

Is it an ER model or a dimensional model? In an ER model, the two key columns should stay separate and should not be replaced by surrogates. The record as a whole could still have a single surrogate for easy reference in URLs etc. This could be a hash of all parts of the composite key or an IDENTITY.

In a dimensional model, they must also be separate, and each of them should be replaced by a surrogate.

srini.venigalla