views:

3559

answers:

5

Which sql data type should we use for number bases primary key:

  1. int
  2. bigint
  3. numeric
  4. float
+8  A: 

Generally, int.

bigint if you think you'll have more rows than there are atoms in the universe.

uniqueidentifier is useful if you need globally unique keys (keys that are guaranteed to be unique across all tables in your schema, maybe even universally unique (I don't remember))

The other two I wouldn't use they're not integral types (they have fractions, which just don't make a lot of sense as keys)

Ken Gentle
If you choose a uniqueidentifier, then consider making it a NONCLUSTERED primary key, otherwise inserts can have performance issues.
Brannon
Thanks for the upgrade, Brannon.
Ken Gentle
int is limited 'only' around 2 billions.I already reached this limit with some logging features ;)
Mose
+1  A: 

One huge reason to not use GUIDs for PKs is their terrible fill ratio for index pages - such misuse can dramatically increase your I/O performance costs. GUIDs should be left as AK's and instead drive queries with int-derived PK's wherever possible.

stephbu
That really is a bit inaccurate - GUID as a PK is okay- GUID as the CLUSTERING KEY is a desaster. The PK per se has no effect on the physical data organisation - that's the clustering key's job :)
marc_s
A: 

for 32-bit processors, an int is likely to be the most efficient size for processing.

le dorfier
The word boundaries are less relevant than the space occupied switching up from Int to BigInt - unless you expect a row count in the billions, save the row bytes and spend them elsewhere. I/O cost is several orders more expensive than memory access.
stephbu
Agreed, I'm just adding another reason to use int.
le dorfier
+3  A: 

You really need to keep two separate issues apart:

1) the primary key is a logical construct - one of the key candidates that uniquely and reliably identifies a row in your table. This can be anything, really - an INT, a GUID, a string - pick what makes most sense for your scenario.

2) the clustering key (the column or columns that define the "clustered index" on the table) - this is a physical storage-related thing, and here, a small, stable, ever-increasing data type is your best pick - INT or BIGINT as your default option.

By default, the primary key on a SQL Server table is also used as the clustering key - but that doesn't need to be that way! I've personally seems massive performance gains over time when breaking up the previous GUID-based Primary Clustered Key into two separate key - the primary (logical) key on the GUID, and the clustering (ordering) key on a separate INT IDENTITY(1,1) column.

The index fragmentation was down to minimal levels, and thus the index seek performance was was up - highly recommended !

Marc

marc_s
Yeah totally agreed Marc - just it's pretty rare for average folk to see beyond the diagramming tools, and it's oh so easy to to right-click, set PK...
stephbu
+1  A: 

unsigned int of whichever size that meets your particular needs

Joe Philllips
Except SQL server doesn't have unsigned datatypes.
Joe