I have a fairly simple query:

SELECT
     col1,
     col2…
FROM
     dbo.My_Table
WHERE
     col1 = @col1 AND
     col2 = @col2 AND
     col3 <= @col3

It was performing horribly, so I added an index on col1, col2, col3 (int, bit, and datetime). When I checked the query plan, it was ignoring my index. I tried reordering the columns in the index in every possible configuration, and it always ignored the index. When I run the query it does a clustered index scan (the table has between 700K and 800K rows) and takes 10-12 seconds. When I force it to use my index, it returns instantly. I was careful to clear the cache and buffers between tests.
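For reference, the test looks roughly like this (the index name here is made up, and the full select list is elided as above):

-- the index that gets ignored
CREATE INDEX IX_My_Table_col1_col2_col3 ON dbo.My_Table (col1, col2, col3)

-- clear the plan cache and buffer pool between runs (test server only!)
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS

DECLARE @col1 INT, @col2 BIT, @col3 DATETIME
SET @col1 = 1             -- placeholder test values
SET @col2 = 0
SET @col3 = '2009-07-09'

-- forcing the index makes the query return instantly
SELECT
     col1,
     col2 -- … rest of the select list elided
FROM
     dbo.My_Table WITH (INDEX (IX_My_Table_col1_col2_col3))
WHERE
     col1 = @col1 AND
     col2 = @col2 AND
     col3 <= @col3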

Other things I’ve tried:

UPDATE STATISTICS dbo.My_Table

CREATE STATISTICS tmp_stats ON dbo.My_Table (col1, col2, col3) WITH FULLSCAN

Am I missing anything here? I hate to put an index hint in a stored procedure, but SQL Server just can’t seem to get a clue on this one. Anyone know any other things that might prevent SQL Server from recognizing that using the index is a good idea?

EDIT: One of the columns being returned is a TEXT column, so using a covering index or an INCLUDE won't work :(

+1  A: 

The order of the index is important for this query:

CREATE INDEX MyIndex ON MyTable (col3 DESC, col2 ASC, col1 ASC)

It's not so much the ASC/DESC as that when SQL Server goes to match that WHERE clause, it can match on col3 first and walk the index along that value.

Joel Coehoorn
For this query, the order of the index should be just as @Tom H. created it.
Quassnoi
I did try several possible orders for the columns. All gave the same result.
Tom H.
+1  A: 

The SQL Server optimizer is not good at optimizing queries that use variables.

If you are sure that you will always benefit from using the index, just put a hint.

If you put literal values into the query instead of variables, it will pick the correct statistics and use the index.

You may also try a lighter hint:

OPTION (OPTIMIZE FOR (@col1 = 1, @col2 = 0, @col3 = '2009-07-09'))

, which will calculate the best execution plan for these values of the variables, using statistics, and won't stick to using the index no matter what.
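In context it looks like this (the literal values are just placeholders; pick values representative of your data):

SELECT
     col1,
     col2 -- …
FROM
     dbo.My_Table
WHERE
     col1 = @col1 AND
     col2 = @col2 AND
     col3 <= @col3
OPTION (OPTIMIZE FOR (@col1 = 1, @col2 = 0, @col3 = '2009-07-09'))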

Quassnoi
If I run the query outside of the SP with the column values hard-coded, it still uses a clustered index scan :(
Tom H.
@Tom: could you please post your exact table definition?
Quassnoi
+1  A: 

Have you tried tossing out the bit from the index?

CREATE INDEX ix1 ON My_Table (Col3, Col1) INCLUDE (Col2)
-- include other columns from the select list if needed

Also, you've left out the rest of the columns from the select list. If there aren't many, you might want to include those as well, either in the index key or in an INCLUDE clause, to create a covering index for the query.

Scott Ivey
+1  A: 

Try masking your parameters to prevent parameter sniffing:

CREATE PROCEDURE MyProc
 @Col1 INT
 -- etc...
AS
 DECLARE @MaskedCol1 INT
 SET @MaskedCol1 = @Col1
 -- etc...

 SELECT
   col1,
   col2…
 FROM
   dbo.My_Table
 WHERE
   col1 = @MaskedCol1 AND
   -- etc...

Sounds stupid, but I've seen SQL Server do some weird things because of parameter sniffing.

Kragen
Thanks for the suggestion. I can run the SELECT outside of the SP though and I still see the same situation.
Tom H.
+1  A: 

I bet SQL Server thinks the price of getting the rest of the columns (designated by ... in your example) from the clustered index outweighs the benefit of the index, so it just scans the clustered key. If so, see if you can make this a covering index (sketched below).

Or does it use another index instead?
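A covering version would look roughly like this; col4 and col5 are hypothetical stand-ins for the elided columns, and note that per the EDIT the TEXT column cannot go in an INCLUDE:

CREATE INDEX IX_My_Table_cover
ON dbo.My_Table (col1, col2, col3)
INCLUDE (col4, col5)  -- every remaining column in the select list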

zvolkov
It's using the clustered key if I don't force the index use. The column list includes all of the columns in the table. While I could put a big covering index on that, I'd be effectively duplicating the table. I'll need to look into the INSERT/UPDATE/DELETE frequencies to see if the cost is warranted.
Tom H.
A: 

Are the columns nullable? Sometimes SQL Server thinks it has to scan the table to find NULL values.

Try adding "AND col1 IS NOT NULL" to the query; it might make SQL Server use the index without a hint.
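Something like this, with the rest of the query unchanged:

SELECT
     col1,
     col2 -- …
FROM
     dbo.My_Table
WHERE
     col1 = @col1 AND
     col1 IS NOT NULL AND -- logically redundant, but may nudge the optimizer
     col2 = @col2 AND
     col3 <= @col3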

Also, check if the statistics are really up to date:

SELECT
    ObjectName     = OBJECT_NAME(ind.object_id),
    IndexName      = ind.name,
    StatisticsDate = STATS_DATE(ind.object_id, ind.index_id)
FROM sys.indexes ind
ORDER BY STATS_DATE(ind.object_id, ind.index_id) DESC
Andomar
Unlike Oracle, SQL Server indexes NULLs just as well and an index always covers all rows.
Quassnoi
My theory was that sometimes the statistics show there are many rows with NULLs. Then SQL Server reverts to a table scan to cover the case "col1 is null".
Andomar
@Andomar: this query can never cover rows for which `COL1` is `NULL`
Quassnoi
+6  A: 

You have 800k rows indexed by col1, col2, col3. Col2 is a bit, so its selectivity is 50%. Col3 is checked on a range (<=), so its selectivity will be roughly 50% too. Which leaves col1. The query is compiled for the generic, parameterized plan, so it has to account for the general case. If you have 10 distinct values of col1, then your index will return approximately 800k / 10 * 25%, that is, about ~20k keys to look up in the clustered index to retrieve the '...' part. If you have 10k distinct col1 values, then the index will return just 20 keys to look up.

As you can see, what matters in this case is not how you build your index, but the actual data. Based on the selectivity of col1, the optimizer will choose a plan based on a clustered index scan (as better than 20k key lookups, each lookup costing at least 3-5 page reads) or one based on the non-clustered index (if col1 is selective enough). In real life the distribution of col1 also plays a role, but going into that would complicate the explanation too much.
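You can check what the optimizer sees for col1 yourself, for example against the tmp_stats statistics created in the question:

-- rows per distinct col1 value: the higher this is, the worse the index filters
SELECT COUNT(DISTINCT col1) AS distinct_col1, COUNT(*) AS total_rows
FROM dbo.My_Table

-- or inspect the density vector of the column statistics directly
DBCC SHOW_STATISTICS ('dbo.My_Table', tmp_stats) WITH DENSITY_VECTOR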

You can claim with the benefit of hindsight that the plan is wrong, but the plan is the best cost estimate based on the data available at compile time. You can influence it with hints (an index hint as you suggest, or an OPTIMIZE FOR hint as Quassnoi suggests), but then your query may perform better for your test set and far worse for a different set of data, say for the case when @col1 = <the value that matches 500k records>. You can also make the index covering, thus eliminating the clustered index lookups required by the '...' in the projection list, in which case the non-clustered index is always a better cost match than the clustered scan.

Kimberly Tripp has a blog article covering this subject; she calls it the index 'tipping point'. It explains how an apparently perfect candidate index can be ignored: a non-clustered index that does not cover the projection list and has poor selectivity will be seen as more costly than a clustered scan.

Remus Rusanu
Thanks for the information and suggestions. It gives me some ideas to look into.
Tom H.
A: 

If your SELECT is returning columns that aren't in your index, SQL may find that it's more efficient to scan the clustered index instead of doing a key lookup to find the other values that you are requesting.

If you have a TEXT column, try switching the data type to VARCHAR(MAX) and then including the values in the nonclustered index.
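A sketch of that change; the TEXT column name here is hypothetical:

-- TEXT columns cannot appear in an INCLUDE, but VARCHAR(MAX) columns can
ALTER TABLE dbo.My_Table ALTER COLUMN notes_col VARCHAR(MAX)

CREATE INDEX IX_My_Table_incl_text
ON dbo.My_Table (col1, col2, col3)
INCLUDE (notes_col)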

mrdenny