views:

172

answers:

2

Hello,

i'm using SQL Server 2008 and, in one of my tables, i implemented a (clustered) primary key on its identifier. Here's the table description:

Identifier: - IdentifierId int NOT NULL PK - Alias nvarchar(200) - DataType int NOT NULL

I made two indexes: one for Alias and another one for DataType. However, I just noticed something strange. When running the following query:

SELECT * FROM IDENTIFIER WHERE DataType = 1

The query actually runs slower with the indexes and the primary key than without them; it takes approximately 10 seconds longer! The indexes are not fragmented - i checked - and i'm also using this

GO
CHECKPOINT;
GO
DBCC DROPCLEANBUFFERS;
GO
DBCC FREEPROCCACHE;
GO

before the query itself.

This table is rather large, having several million entries on it. Indexes and PKs play a vital role on most queries, but in this case i can't understand why the query is running slower with them. Any ideas?

Thanks in advance

EDIT: The execution plan shows that only the clustered index is being used and the DataType variable currently goes up to 150.

+4  A: 

Create a composite Non-clustered covering index DataType and include Alias, and remove the individual indexes on Alias and DataType columns:

CREATE NONCLUSTERED INDEX NC_DataType_I_Alias 
    ON tblIdentifier(DataType) INCLUDE (Alias)
Mitch Wheat
Will do, thanks
+4  A: 

When you select *, the system will still needs to get every column. So the optimizer will often determine that it is quicker to use the clustered index scan (remember a clustered index is not really an index - it's just data organized by the order specified) than to use a seek on a different index combined with a bookmark lookup based on primary key to retrieve the additional rows.

So the key to performance is to have a non-clustered index (this is a bad name, really, since non-clustered indexes often far outperform clustered indexes) and INCLUDE additional columns in the index so that it becomes covering. In the case of pre-SQL Server 2005, you simply have to add the columns to the end of the index.

So, basically, a primary key is good, but should not necessarily determine your choice of clustering, and you will generally need to rely on non-clustered indexes (with appropriately INCLUDEed columns) for performance on most selective operations, with the clustered index designed for the least selective and most commonly sorted case.

Cade Roux
+1 for good explanation
Greg D
agreed, +1 also
Hal