views:

75

answers:

7

Hello all,

Defining a column to be a primary in table on SQL Server - will this make inserts slower?

I ask because I understand this is the case for indexes.

The table has millions of records.

Thanks all for any help

+1  A: 

Regardless, that is not why you would define a primary key on a table.

You define a primary key when it is REQUIRED by the domain model.

Mitch Wheat
Would the downvoter please leave a comment. Thanks.
Mitch Wheat
A: 

Primary keys are automatically indexed, clustered if possible and failing that non-clustered.

So in that sense inserts are slightly affected, but of course having no primary key would usually be much much worse, assuming the table needs a primary key.

First measure, identify a problem and then try to optimize. Optimizing away primary keys is a very bad idea in general.

Vinko Vrsalovic
A: 

Not enough to create a perceptual performance hit, and the benefits far outweigh the very minor perforamnce issues. There are very very few scenarios where you should not put a primary key on a table.

Charles Bretana
A: 

Yes, adding a primary key to a table will slow inserts (which is OK, because not adding a primary key to your table will speed up your application's eventual catastrophic failure).

If what you're doing is creating a new table and then inserting millions of records into it, there is nothing wrong with initially creating the table without a primary key, inserting all the records, and then creating the primary key. Or use an alternative tool to perform a bulk insert.

MusiGenesis
A: 

Yes, inserts are slowed, especially with several clients doing inserts simultaneously, and more so if your key is sequentially increasing (all inserts occur at the right-most nodes of the index tree, in most database implementations, or at the last page of the table for e.g. clustered SQL Server indices -- both of which scenarios cause resource contention).

That said, SELECTs using the primary key are speeded up quite a bit, and the integrity of your key is guaranteed. Do the right thing first (define primary keys everywhere). Second, measure to see if you cannot meet your performance targets, and whether this is caused by your data integrity constraints. Only then consider workarounds.

Pontus Gagge
+4  A: 

No, not necessarily! Sounds counter-intuitive, but read this quote from Kim Tripp's blog post:

Inserts are faster in a clustered table (but only in the "right" clustered table) than compared to a heap. The primary problem here is that lookups in the IAM/PFS to determine the insert location in a heap are slower than in a clustered table (where insert location is known, defined by the clustered key). Inserts are faster when inserted into a table where order is defined (CL) and where that order is ever-increasing.

So actually, having a good clustered index (e.g. on a INT IDENTITY column, if ever possible) does speed things up - even insert, updates and deletes!

marc_s
Only if the PK gets a clustered index. Great tip, +1.
Vinko Vrsalovic
@Vinko Vrsalovic: which is the system-default on SQL Server, and I bet you - more than 80% of all database tables out in the universe have a clustered PK
marc_s
@marc_s: I'm aware of that, I was just stressing the point out.
Vinko Vrsalovic
+1 I like counter-intuitive answers
Daniel Vassallo
A: 

The really quick answer is:

Yes.

Primary keys are always indexed (and SQL will attempt a clustered index). Indexes make inserts slower, clustered indexes even more so.

Depending on what your table is used for, you may have a couple of options.

If you do a lot of bulk inserts then reads, you can remove the primary key, insert into a heap (if you have SQL 2008 this can be minimally logged to run even faster) then reassign the key and wait for the index to run.

As an addendum to that, you can also insert using an ORDER BY clause which will keep the inserted rows in correct order for the clustered index. This will really only help if you are inserting millions of rows at once from a source that is already ordered, though.

JNK
actually: NO - see my post. Having a clustered primary key makes inserts, updates, deletes **faster** than if you have no clustered, primary key (just a "heap")
marc_s