ansaurus

Question

Defining indexes: Which Columns, and Performance Impact?

Answer 1

+3 A:

You can only have one clustered index per table. It doesn't have to be the primary key, but in most cases it will be.

Beyond that - it really depends on the queries & the tipping point for what indexes will be used. But defining indexes also means there will be an impact to DML - inserts, updates & deletes will take a slight performance hit.

Should I use composite non clustered index(es) on columns that are foreign keys?

Doesn't matter what the column is, it's the usage that matters for the optimizer to determine what index, clustered or otherwise, to use.

OMG Ponies 2009-12-23 18:11:08

Answer 2

+2 A:

Yes, you can overload your tables with too many indexes. In general, every additional index costs performance time in terms of index maintenance. Tables that are heavily updated should generally have fewer indexes.

Another broad rule (from Richard Campbell, on RunAs Radio and DotNetRocks), is that a few broad indexes will perform better than a larger number of narrow indexes. A broad index will cover a wider range of queries, and there's less for the query optimizer to investigate. Remember that the query optimizer has a limited time to run.

Investigate SQL Server Profiler. There are tools there (used to be stand-alone, but they've changed and I haven't used them recently). They can analyze workloads and make indexing recommendations. These will be better choices than indexes picked "intuitively."

Cylon Cat 2009-12-23 18:12:13

Answer 3

A:

If you have queries that are referencing columns that are not in your index, the SQL server engine will have to perform a table lookup to get the non-included columns from the actual table.

If you are running these queries often, you should create non-clustered indexes that "cover" the query by including all the referenced columns in the index. This should include any non-unique columns.

Adding indexes to a table always decreases write performance, since the index will have to be updated every time the table is updated.

womp 2009-12-23 18:12:20

Answer 4

A:

What fields are you doing lookups on? Searching? Etc. Determine what fields you are using when running your queries (WHERE clause) and they could possibly be good candidates.

For instance, think of a library. The book catalog has a clustered index for the ISBN number and a non clustered index for say publishing year, etc.

Also what helped me is something that Bart Duncan posted a long time ago. He deserves the credit for this.

The article was entitled "Are you using SQL's Missing Index DMV's?". Look it up and run this query:

SELECT 

  migs.avg_total_user_cost * (migs.avg_user_impact / 100.0) * (migs.user_seeks + migs.user_scans) AS improvement_measure, 

  'CREATE INDEX [missing_index_' + CONVERT (varchar, mig.index_group_handle) + '_' + CONVERT (varchar, mid.index_handle) 

  + '_' + LEFT (PARSENAME(mid.statement, 1), 32) + ']'

  + ' ON ' + mid.statement 

  + ' (' + ISNULL (mid.equality_columns,'') 

    + CASE WHEN mid.equality_columns IS NOT NULL AND mid.inequality_columns IS NOT NULL THEN ',' ELSE '' END 

    + ISNULL (mid.inequality_columns, '')

  + ')' 

  + ISNULL (' INCLUDE (' + mid.included_columns + ')', '') AS create_index_statement, 

  migs.*, mid.database_id, mid.[object_id]

FROM sys.dm_db_missing_index_groups mig

INNER JOIN sys.dm_db_missing_index_group_stats migs ON migs.group_handle = mig.index_group_handle

INNER JOIN sys.dm_db_missing_index_details mid ON mig.index_handle = mid.index_handle

WHERE migs.avg_total_user_cost * (migs.avg_user_impact / 100.0) * (migs.user_seeks + migs.user_scans) > 10

ORDER BY migs.avg_total_user_cost * migs.avg_user_impact * (migs.user_seeks + migs.user_scans) DESC

It is not the ultimate solution for you but it will help you determine some indexes. And the link to the article: http://blogs.msdn.com/bartd/archive/2007/07/19/are-you-using-sql-s-missing-index-dmvs.aspx. By default when you create a PK in SQL Server it by default is the clustered index, it doesn't have to be, but it generally is.

JonH 2009-12-23 18:14:00

Answer 5

+5 A:

The clustered index defines your table's physical structure (to a certain degree) - e.g. it defines in what order the data is ordered. Think of the phonebook, which is "clustered" by (LastName,FirstName) - at least in most countries it is.

You only get one clustered index per table - so choose it wisely! According to the gospel of the Queen of Indexing, Kimberly Tripp, the clustering key should be narrow, stable (never change), unique (yes!) and ideally ever-increasing.

It should be narrow, because the clustering key will be added to each and every entry of each and every non-clustered index - after all, the clustering key is the value used to ultimately find the actual data.

It should be stable since constantly updating lots of index values is a costly affair - especially since the clustering key would have to updated in all non-clustered indices as well.

It needs to be unique, since again - it's ultimately the value used to locate the actual data. If you choose a column that is not guaranteed to be unique, SQL Server will "uniquify" your clustering key by adding a 4-byte value to it - not a good thing.

And ideally, the clustering key should be ever-increasing since that causes the least page and index fragmentation and thus is best for performance.

The ideal candidate for a clustering key would be a INT (or BIGINT) IDENTITY - it ideally fulfills all those requirements.

As for non-clustered indices - use and choose them wisely! There's only one general rule I can give you: all columns that are part of a foreign key (referencing another table) should be in an index - SQL Server will not (contrary to popular belief and lots of myths) put such an index in place automatically - never has, never does.

Other than that - you need to watch your system, see what kind of queries you have - all columns that show up in a WHERE or SORT clause are potential candidate to be indexed - but too many indices isn't a good thing either....

marc_s 2009-12-23 18:20:31

Wish I could vote this one up more than once. Great answer!

Cylon Cat 2009-12-23 19:10:56

@CylonCat: thanks so much! Glad it is useful to someone

marc_s 2009-12-23 19:26:02

+1 for very gud explanation. Didn't accepted bcoz OMG Ponies provided a very nice link to know more about non Clustured indexes

Shantanu Gupta 2009-12-23 19:30:23

Yes, I saw that - another wonderful Kimberly Tripp blog post - her stuff is really absolutely outstanding.

marc_s 2009-12-23 19:37:02

Answer 6

A:

If you should or not make clustered indexes depends on you workload (usually dominated by the amount and kind of SELECT statements hitting your table)

A clustered index will force the disk storage order of the rows to be according to the clustered index values. (For this reason, there can be only 1 clustered index per table, as rows are stored on disk only once) This makes sense if most of your queries are always demanding a group of related rows.

Example: suppose you are storing CustomerOrders, and you frequently want to know the number of CustomerOrders (regardless of the customer) in a certain time period. In this case it may be useful to create a clusterd index with the OrderDate as first column. If on the other hand you are frequently looking for all CustomerOrders with the same CustomerId, it makes more sense to put the CustomerId as first column in your clustered index.

The disadvantage of clustered indexes is not in de clustered index itself, but on the secondary indexes: secondary indexes are themselves not clustered (by definition, as the rows can only be stored once, and are stored in order of the clustered index), and their index entries point to the index entries of the clustered index. So to retrieve a row via a secondary index requires 2 read operations: one of the secondary index, and then one of the clustered index it it pointing to.

Roland Bouman 2009-12-23 18:21:30

I m intrested only in Non Clustured Index. I want to xpore NCI as they can be to many

Shantanu Gupta 2009-12-23 18:29:01

ansaurus

tags:

views:

answers:

Defining indexes: Which Columns, and Performance Impact?

related questions