ansaurus

Question

SQL Server - Clustered index design for dictionary

Answer 1

+1 A:

If at all possible, keep your clustering key as small as possible, since it is also added to every non-clustered index on your table.

Therefore, I would use an INT if at all possible, or possibly a combination of two INTs, but certainly never a VARCHAR column, especially if that column is potentially wide (> 10 chars) and is bound to change.

So of the options you present, I personally would choose b). Why?

Adding a surrogate DirectoryID will satisfy all crucial criteria for a clustering key:

small
stable
unique
ever-increasing

and your other non-clustered indices will be minimally impacted.

See Kimberly Tripp's outstanding blog post on the main criteria for choosing a good clustering key on your SQL Server tables - very useful and enlightening!

To satisfy your query requirements, I would add two non-clustered indices: one on ObjectID (possibly including other frequently needed columns), and another on ItemKey to search by key name.
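A minimal T-SQL sketch of this design might look as follows. Only DirectoryID, ObjectID and ItemKey come from the discussion; the table name dbo.Directory, the ItemValue column, the column sizes and the index names are assumptions made for illustration.

```sql
-- Hypothetical table: name, ItemValue column and data sizes are assumed.
CREATE TABLE dbo.Directory
(
    DirectoryID INT IDENTITY(1,1) NOT NULL,  -- small, stable, unique, ever-increasing
    ObjectID    INT               NOT NULL,
    ItemKey     VARCHAR(100)      NOT NULL,
    ItemValue   VARCHAR(400)      NULL,      -- assumed payload column
    CONSTRAINT PK_Directory PRIMARY KEY CLUSTERED (DirectoryID)
);

-- Non-clustered index for lookups by object.
CREATE NONCLUSTERED INDEX IX_Directory_ObjectID
    ON dbo.Directory (ObjectID);

-- Non-clustered index for lookups by key name.
CREATE NONCLUSTERED INDEX IX_Directory_ItemKey
    ON dbo.Directory (ItemKey);
```

Because the clustering key is a single INT, each row in the two non-clustered indices carries only four extra bytes to point back to the base row.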

marc_s 2010-10-03 08:29:08

Thanks for pointing out the post, it's enlightening. It may seem more intuitive to cluster by what is queried most often, but from the article it seems that in real-world situations there are other overheads involved that make it better to follow these rules for the clustering key.

andrwo 2010-10-04 08:01:38

marc_s: I have a question for you: would it make sense to use (OBJECTID, DirectoryID) as the clustering key? This would at least cluster by one criterion while keeping the clustering key smallish (but would lose the property of always inserting at the end of the table). Is it worth it, and would you ever do this in your design?

andrwo 2010-10-04 08:25:56

@andrwo: if DirectoryID is an INT IDENTITY column, then I would cluster on only this single INT. There is no point in adding a second INT to the clustering index, really. Why would you want to do this?

marc_s 2010-10-04 11:31:10

@marc_s: Purely for performance reasons. Since the table is huge, I figured it would shortcut the bookmark lookup when accessing by OBJECTID (if the table is clustered by DirectoryID alone, SQL Server still needs to look up each item to change it). There are arguments both ways depending on what I want to optimize (clustered index fragmentation, space, update performance), but I am just polling general opinion to see whether DB designers typically do this kind of thing, or whether they usually try their best to stick to the increasing-integer-identity clustering rule. Thanks for your help so far!

andrwo 2010-10-05 01:41:18

@andrwo: if you want to reduce bookmark lookups, create a non-clustered index on the columns you want to search on, and INCLUDE any additional columns that make sense. I wouldn't "pollute" the clustered index, which is so crucial for good performance, with anything extra unless absolutely necessary.
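As a rough sketch of that suggestion, a covering index for lookups by ObjectID could be written as below. The table name dbo.Directory, the ItemValue column, the column sizes and the index name are assumed for illustration; only DirectoryID, ObjectID and ItemKey appear in the discussion.

```sql
-- Create the hypothetical table only if it does not exist yet (names assumed).
IF OBJECT_ID(N'dbo.Directory', N'U') IS NULL
    CREATE TABLE dbo.Directory
    (
        DirectoryID INT IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
        ObjectID    INT               NOT NULL,
        ItemKey     VARCHAR(100)      NOT NULL,
        ItemValue   VARCHAR(400)      NULL   -- assumed payload column
    );

-- Key column: ObjectID (the search column).
-- INCLUDE columns are stored only at the leaf level of the index,
-- so the query below can be answered without touching the clustered index.
CREATE NONCLUSTERED INDEX IX_Directory_ObjectID_Covering
    ON dbo.Directory (ObjectID)
    INCLUDE (ItemKey, ItemValue);

-- Example query the index covers (42 is an arbitrary value).
SELECT ItemKey, ItemValue
FROM dbo.Directory
WHERE ObjectID = 42;
```

The trade-off is extra storage and write cost for the included columns, so it makes sense to include only the columns that the frequent queries actually read.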

marc_s 2010-10-05 05:09:52

@marc_s: OK, got it. Thank you for your guidance and opinions on this.

andrwo 2010-10-07 13:12:06
