views:

157

answers:

6

In a SQL Server 2005 database, I have lots of tables like this Products table

ProductID (PK)
ProductCategoryID (IX)
Description
Price
ExpiryDate
BreakableYN

...where there is a primary key, a foreign key and then a bunch of other fields. Another characteristic of this type of table is that lots of queries only use the 2 ID fields (ProductID, ProductCategoryID), e.g. Employees JOIN EmployeeProductJoin JOIN Products JOIN ProductCategories JOIN ProductDepartments.

If ProductID and ProductCategoryID are already indexed, is it worth adding another index for ProductID, ProductCategoryID?

I know it seems that I'm asking if adding a covering index will help, but what I'm really asking is whether a covering index will help if the fields in that covering index are already indexed individually.

These are definition tables that are not huge, so I'm not worried about adding extra time to INSERTs etc.

+1  A: 

Yes it might. The point of a covering index is that a query can be served by the index alone, without having to access the table. So you include not only the fields on which you are searching but also the fields you want to return, and the query optimizer can avoid accessing the table at all.

You might not really mean "covering index" though...

David M
Is it not a covering index? I thought as long as you build an index that contains all the fields that a query needs, you have a covering index.
Craig HB
Depends on what columns you are selecting...
David M
A: 

Only the query plans (with and without the extra indices, and with tables containing realistic amounts and kinds of data) can tell you for sure if the extra indices will help; it's all about helping the query optimizer find a smarter plan, but you can only help so far, and it is conceivable that it may fail to find the plan you'd like (it's but a heuristic "let me try to optimize" engine, after all). That's why looking at query plans is so important (and you need to have realistic data, because that usually does influence the heuristics!).

Alex Martelli
A: 

I definitely may help, especially if your descriptions are large. It would be easy to benchmark and see for yourself. This new index may be much smaller that the clustered one. But you only want to have this narrow index if you have highly important queries which you need to speed up no matter what.

AlexKuznetsov
+1  A: 

Is the primary key clustered? If it is, then adding a new index will accomplish nothing, because the ProductCategoryID index will already contain the ProductID values, so it effectively "covers" both columns.

Christian Hayter
Downvoted: This new index may be much (maybe 100 times) smaller that the clustered one.
AlexKuznetsov
Alex, that is irrelevant. All nonclustered indexes contain the clustering key. A nonclustered index on PCID will contain PID+PCID data, and a nonclustered index on PID+PCID will also contain PID+PCID data, but the PID will be duplicated in the index key and also the row pointer. Therefore it's redundant.
Christian Hayter
Yes, the primary key (ProductID) is clustered. Just want to make sure I'm clear about your response. When I create an index on ProductCategoryID, that index will consist of ProductCategoryID AND ProductID (because ProductID is the clustered index). So, effectively, I already have my ProductID, ProductCategoryID index!
Craig HB
Sorry, I should have explained in more detail. If a clustered index (CI) exists on a table, then all nonclustered index (NCI) leaf nodes contain the CI value (plus a unique row ID *if* the CI is not itself unique). This means that, when SQL Server uses a NCI to find data, then it already knows the CI value without having to access the table or go to the CI directly. Hence there is no need to include the CI columns in any NCI.
Christian Hayter
Great answer -- thanks
Craig HB
A: 

Yes, it can help in one specific way. The idea of a covering index is that it has some redundant fields that you are using in queries. If the index can satisfy the data requirements of a query without the query having to hit the underlying table you can save on I/O by getting the data from the index.

Where you have two indexes as you show above the DBMS would have to hit the table as well as resolving two index seeks.

If your query results are widely scattered on the table but belong together on the index you could potentially save quite a lot of I/O on a large query. In this way, covering indexes can also be used as a sort of 'second clustered index' on a table.

ConcernedOfTunbridgeWells
I'm confused by your answer so will presume I have misread it. The principle of a covering index is to incorporate all columns required by a query, within a single index. Why then would there be redundant columns in the query i.e. not satisfied by the index, as this would not be a covering index? Perhaps you are suggesting that a covering index may also contain additional columns as well as those that are required by the query and therefore still provide coverage?
John Sansom
The columns are redundant for indexing purposes - that is, the additional columns are not necessary to resolve the selection but contain data that is returned by the query.
ConcernedOfTunbridgeWells
A: 

In short Yes, it will improve query performance.

Using a covering index, all of the columns required in your query are present in the Index data structure. This means that SQL server need only query a single index in order to provide the results for your query.

Whereas when you have a scenario of multiple columns, that are indexed separately, in order to serve this query SQL Server will more than likely have to perform a seek/scan of numerous indexes as opposed to just the one. This of course potentially creates more I/O activity.

Make sense?

John Sansom