ansaurus

Question

Why does SQL choose an incorrect index in my case?

Answer 1

A:

Updating the statistics on the table / indexes may make it choose the correct index
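
A rough sketch of what that refresh could look like in T-SQL, assuming the table is mvTrdHidUhd as in the question's query. FULLSCAN is only one sampling option, and the default sampled update may well be enough:

```sql
-- Rebuild all statistics on the table by reading every row.
UPDATE STATISTICS mvTrdHidUhd WITH FULLSCAN;

-- Check when each statistics object on the table was last updated.
SELECT name,
       STATS_DATE(object_id, stats_id) AS last_updated
FROM   sys.stats
WHERE  object_id = OBJECT_ID('mvTrdHidUhd');
```

If last_updated is old relative to how fast the table grows, stale statistics are a plausible cause of the bad index choice.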

ck 2009-05-18 07:42:26

I will test it, but does this mean I have to force SQL to use the correct index rather than rely on its selection?

Ahmed Said 2009-05-18 07:53:31

No. If your statistics are correct and your indexes and queries are designed properly, then you shouldn't need to force certain indexes in your query.

ck 2009-05-18 11:35:40

Answer 2

A:

Use symbolid BETWEEN 1010 AND 1050 if possible. The use of BETWEEN or = or >= or > or < or <= or the combination of these with AND generally leads to better performance and better index selection than the use of OR or IN.
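
For illustration, here is how the question's query might look with a range instead of the IN list. This is only a sketch: it returns the same rows as the IN version only if no other symbolid values exist inside the range, and the upper bound of 1060 is taken from the IN list in the query, which is an assumption about what is wanted.

```sql
-- Range predicate instead of IN (1010,1020,1030,1040,1050,1060).
-- Assumption: no other symbolid values fall between 1010 and 1060,
-- otherwise this returns extra rows.
SELECT symbolID, vTrdBuy
FROM   mvTrdHidUhd
WHERE  typeID = 1
       AND barDateTime = 44991
       AND symbolid BETWEEN 1010 AND 1060;
```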

pts 2009-05-18 07:44:51

Answer 3

A:

It is possible that the order of the index columns affects whether the optimiser will choose your index. You indicate the index is (symbolid int16, bartime int32, typeid int8), but symbolid is the least selective condition in your where clause. This would require 6 index lookups for the 6 values you have.

I would probably start with the between statement but only testing with your data, server, indexes etc will prove the best case.

If you are going to create another index try the 2 other orders for those columns.

And as noted elsewhere update your statistics
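
A sketch of what trying other column orders could look like. The index names are made up for the example, and barDateTime is assumed to be the column referred to above as bartime:

```sql
-- Hypothetical index names; the column order is the only thing being tested.
CREATE NONCLUSTERED INDEX IX_test_type_bar_symbol
    ON mvTrdHidUhd (typeID, barDateTime, symbolID);

CREATE NONCLUSTERED INDEX IX_test_bar_type_symbol
    ON mvTrdHidUhd (barDateTime, typeID, symbolID);

-- After comparing the execution plans, drop whichever index did not help.
DROP INDEX IX_test_bar_type_symbol ON mvTrdHidUhd;
```

Every extra index adds work on insert, so it is probably worth keeping only the order that the optimiser actually picks.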

Karl 2009-05-18 07:48:14

Choose the most restrictive column (fewest rows output) first in your where-clause.

Scoregraphic 2009-05-18 07:52:56

Answer 4

A:

You can also try out a covering index on (symbolid,bartime,typeid,mvTrdBuy)

AlexKuznetsov 2009-05-19 13:26:00

okay but this may decrease the performance

Ahmed Said 2009-05-20 09:24:48

I mean the insertion performance

Ahmed Said 2009-05-20 09:25:04

Answer 5

+1 A:

Your query references four columns:

symbolID
vTrdBuy
typeID
barDateTime

While the clustered index only covers three of them:

symbolID
vTrdBuy
typeID
barDateTime

The reason SQL Server ignores that index is that it's useless to it. The index is first sorted by symbolID, and you don't want a specific symbolID, but a bunch of random values. This means that it has to read all over the table.

The next column in the clustered index is vTrdBuy. This isn't used to help it to skip to the rows it actually wants.

Looking at the query, two columns are very specific in limiting what rows you want to return:

WHERE typeID = 1
AND barDateTime = 44991

Creating an index that starts with typeID and barDateTime can really be useful in helping SQL Server jump to the rows that you are interested in.

First SQL Server can jump right to the rows that are

typeID = 1.

Once there, it can jump right to bars where

barDateTime = March 8, 2023

It can do this by seeking right through the index, since the index is ordered by the columns in it. This is very fast, and it's eliminated the majority of rows from being looked at.

If you were to create the index:

(
   typeID,
   barDateTime,
   symbolID
)

it still might not be useful if the query returns a lot of rows. In order to finish the SELECT statement, SQL Server still needs the vTrdBuy value. It has to get this by jumping into the table for each one of the rows that matches the criteria (called a Bookmark Lookup). If there are too many rows (say > 500), SQL Server will just forget the index and scan the entire table, because that would be faster.

You want to prevent the bookmark lookup. So that SQL Server does not have to go back to the table for the missing value, include that value in the index:

CREATE INDEX IX_mvTrdHidUhd_FancyCovering ON mvTrdHidUhd 
(
   typeID, barDateTime, symbolID, vTrdBuy
)

Now you have an index that contains everything SQL Server wants, in the order that it wants, and you don't have to mess with the physical sort order (i.e. clustering) of the table.
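
As a variation, on SQL Server 2005 and later the extra value can be carried as an included column instead of a key column. A minimal sketch, with a made-up index name:

```sql
-- Key columns drive the seek; vTrdBuy is stored only at the leaf level
-- so the SELECT can be answered without a bookmark lookup.
CREATE NONCLUSTERED INDEX IX_mvTrdHidUhd_Covering_Incl
    ON mvTrdHidUhd (typeID, barDateTime, symbolID)
    INCLUDE (vTrdBuy);
```

The trade-off is the same as for any covering index: reads get cheaper, while inserts and updates to vTrdBuy have one more structure to maintain.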

Ian Boyd 2009-06-03 16:09:06

By definition, a clustered index always covers all queries. It is wrong to state that: "While the clustered index only covers three of them * symbolID * vTrdBuy * typeID * barDateTime"

AlexKuznetsov 2009-06-03 21:04:58

How do you figure that a clustered index always covers all columns?

Ian Boyd 2009-06-04 02:58:24

Answer 6

+1 A:

SELECT  symbolID, vTrdBuy
FROM    mvTrdHidUhd 
WHERE   typeID = 1 
        AND barDateTime = 44991 
        AND symbolid IN (1010,1020,1030,1040,1050,1060)

This condition is not covered by a single contiguous range of your clustered index.

These rows:

1010, 44991, 1
1010, 50000, 1
1020, 44991, 1

will come in order in the index, but your query will select the first and the third one, skipping the second.

SQL Server can use Clustered Index Seek if there is a limited number of predicates, like in your IN case. In this case it uses a number of ranges:

SELECT  symbolID, vTrdBuy
FROM    mvTrdHidUhd 
WHERE   (typeID = 1 
        AND barDateTime = 44991 
        AND symbolid = 1010)
        OR
        (typeID = 1 
        AND barDateTime = 44991 
        AND symbolid = 1020)
        OR …

But in the case of a BETWEEN range on symbolid it cannot construct such a limited number of predicates, which is why it reverts to a less efficient Clustered Index Scan (which scans on symbolid and just filters the wrong results out).

In this case your nonclustered index performs better.

You could rewrite your query like this:

SELECT  symbolID, vTrdBuy
FROM    (
        SELECT  DISTINCT symbolid
        FROM    mvTrdHidUhd 
        WHERE   symbolid BETWEEN 1010 AND 1050
        ) s
JOIN    mvTrdHidUhd m
ON      m.symbolid = s.symbolid
        AND m.typeID = 1 
        AND m.barDateTime = 44991

This will use a Clustered Index Seek on your table as well, both to build the list of DISTINCT symbolid values and to join on this list.
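
One way to check whether the rewrite actually helps on your data is to compare logical reads for the two forms. This is only a sketch of such a comparison, using the BETWEEN bounds from the rewrite above:

```sql
-- Report page reads per statement in the Messages output.
SET STATISTICS IO ON;

-- Plain range predicate.
SELECT symbolID, vTrdBuy
FROM   mvTrdHidUhd
WHERE  typeID = 1
       AND barDateTime = 44991
       AND symbolid BETWEEN 1010 AND 1050;

-- Rewritten form: list the distinct symbolid values first, then join.
SELECT m.symbolID, m.vTrdBuy
FROM   (
       SELECT DISTINCT symbolid
       FROM   mvTrdHidUhd
       WHERE  symbolid BETWEEN 1010 AND 1050
       ) s
JOIN   mvTrdHidUhd m
ON     m.symbolid = s.symbolid
       AND m.typeID = 1
       AND m.barDateTime = 44991;

SET STATISTICS IO OFF;
```

The form with fewer logical reads on mvTrdHidUhd is usually the cheaper one, though the actual execution plan is the final word.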

Quassnoi 2009-06-03 16:15:49

You said what I said, but with examples of data. +1

Ian Boyd 2009-06-03 17:19:20

Thank you for this good explanation, but you did not answer my first question?

Ahmed Said 2009-06-04 07:34:16

@Ahmed: it chooses the non-clustered index because it thinks it will be more selective. As for the proposal: could you please check that SQL Server proposes exactly the same column order?

Quassnoi 2009-06-04 08:25:36
