ansaurus

Question

Index view: How to choose the Clustered Index?

Answer 1

+4 A:

Since it's an index, you have to pick a column (or set of columns) which is guaranteed to be non-null and unique in all cases. That's the biggest and most stringent criteria - anything that might be NULL or duplicate is out of the question right from the get-go.

Depending on the type of queries you'll be running on this indexed view, you might also want to see if you have any columns (e.g. a DATE or something) that you'll be running range queries against. That might make an interesting candidate for a clustering key.

But the main thing is: your clustering key must be unique and non-null in any circumstance. And in my personal experience, to reduce index size (and thus increase the number of entries per page), I'd try to use as small a key as possible - a single INT is best, or a combination of two INTs - or possibly a GUID - but don't use VARCHAR(500) fields in your clustering key!

UPDATE: to all those poster who keep telling us clustered indexes don't need to be unique - check out what the "Queen of Indexing", Kimberly Tripp, has to say on the topic:

Let's start with the key things that I look for in a clustering key:
* Unique
* Narrow
* Static
Why Unique?
A clustering key should be unique because a clustering key (when one exists) is used as the lookup key from all non-clustered indexes. Take for example an index in the back of a book - if you need to find the data that an index entry points to - that entry (the index entry) must be unique otherwise, which index entry would be the one you're looking for? So, when you create the clustered index - it must be unique. But, SQL Server doesn't require that your clustering key is created on a unique column. You can create it on any column(s) you'd like. Internally, if the clustering key is not unique then SQL Server will “uniquify” it by adding a 4-byte integer to the data. So if the clustered index is created on something which is not unique then not only is there additional overhead at index creation, there's wasted disk space, additional costs on INSERTs and UPDATEs, and in SQL Server 2000, there's an added cost on a clustereD index rebuild (which because of the poor choice for the clustering key is now more likely).

Source: http://www.sqlskills.com/blogs/kimberly/post/Ever-increasing-clustering-key-the-Clustered-Index-Debateagain!.aspx

marc_s 2010-02-05 11:58:24

So, the clustered index should act like a primary key in the tables? By the way, thanks for the revision of "SQL Server" name. I'm still little confused by all the names that Microsoft keep coming up with :)

anthares 2010-02-05 12:04:31

SQL Server will use your primary key on a table as its clustering key by default. A view doesn't have a primary key per se - but the clustering key which you can set up should also uniquely identify each row - that's its job.

marc_s 2010-02-05 12:06:34

I think both claims in this answer are wrong: a clustered index does not have to be unique, and a clustered index does not occupy any space on disk, so reducing clustered index size does not have any effect

Andomar 2010-02-05 12:20:00

Answer 2

+1 A:

The thumb rule: Select the columns which are you are probably going to use MOST in your queries as WHERE, GROUP etc. Those columns could be a good candidate for non-clustered indexes. Select a column (or a group of column) which would probably make your row unique, and that could be a good candidate for clustered index.

As mentioned by marc, a clustered index imposes a unique constraint, so it definately needed that the column you selct should not have any null and duplicate.

Bhaskar 2010-02-05 12:13:07

Answer 3

A:

A clustered index does not have to be unique. The columns in it can even be nullable. For example, this will run without an error:

create table  #test (col1 int identity, col2 int)
create clustered index ix_test on #test (col2)
insert into #test (col2) values (1)
insert into #test (col2) values (1) -- Duplicate in clustered index
insert into #test (col2) values (null)

A clustered index is part of the table structure on disk. As such, a clustered index uses no additional disk space.

By default, SQL Server clusters on the primary key, which is usually a good choice. You can change that if you have intensive queries with a lot of table lookups. Changing which index is clustered can eliminate table lookups.

Andomar 2010-02-05 12:18:09

I checked myself that there is no problem creating clustered index with duplicate keys ... The question is, what is the best practice and how can I take the best of my clustered index, so to speak.I should make clarification here, that the tables I use in the view are with very intense use and any overhead I take for maintance of the indexes, should be paid-off.

anthares 2010-02-05 12:44:03

The general advice is: make your primary key the clustered index. To give specific advice, we'd need far more information, at least table layout, relations, and the queries that run

Andomar 2010-02-05 13:28:08

@Anthares: **YES** the clustered index must be unique! That's how SQL Server actually finds your data. If you supply a column or set of columns which is not guaranteed to be unique, SQL Server will add a 4-byte uniquifier to your clustering key. Read Kim Tripp's blog - esp. this entry here: http://www.sqlskills.com/BLOGS/KIMBERLY/post/The-Clustered-Index-Debate-Continues.aspx

marc_s 2010-02-05 14:11:23

@Andomar: and sure - the clustered index also builds an index tree and thus takes up space! Yes, the leaf level of a clustered index are the actual data pagges - but the intermediary pages of the clustered index are indeed index pages, which will take up space.

marc_s 2010-02-05 14:12:16

@Andomar: also, if you have a number of non-clustered indices (not likely on a clustered view, but certainly on a table), the clustering key (all of it) will be added to each and every entry of each and every non-clustered index. If the clustering key is unnecessarily big, you'll be wasting **a lot of space** , not just on disk, but also in SQL Server main memory!

marc_s 2010-02-05 14:14:02

@marc_s: I agree that a clustered index uses a small amount of extra disk space, but it's neglegible compared to a nonclustered index

Andomar 2010-02-05 15:07:58

ansaurus

tags:

views:

answers:

Index view: How to choose the Clustered Index?

related questions