ansaurus

Question

SQL Index Seek

Answer 1

+1 A:

It depends on your query. If you are only querying for these two values, e.g.:

SELECT StartNum, EndNum
FROM Books
WHERE (@Num >= StartNum) AND (@Num <= EndNum)

In this case, SQL Server only has to seek through the two indexes to return the numbers. Your two indexes contain the value you're indexed, and because it's an index they're sorted.

But i'm sure you're actually including other columns in your query:

SELECT BookID, Title, IDBN, Author, StartNum, EndNum
FROM Books
WHERE (@Num >= StartNum) AND (@Num <= EndNum)

In this case, SQL Server, once it has found the id's of the rows that match the criteria, must then go back into the database and find those rows so that it can return you the:

Title
ISBN
Author

in addition to the values it already has from the two indexes:

StartNum
EndNum
BookID

Note: An index on StartNum implicily contains the value of the clustered key, since that's how it knows what row corresponds to an entry in the index.

The problem is that if there are "too many books" that it has to go look up in the table, then it might just be faster to read the entire table top to bottom.

Analogy: You look in the index of a book for all references to "design patterns".

design patterns: 4, 89, 221, 442

If there's only 4 entries, then you'd be okay with flipping backwards to the page listed in the index. This is called a bookmark lookup.

But what if the index said there was 827 references to a phrase?

computer: 1, 2, 6, [snip 825 entries], 1087, 1128

Then it might just be faster to read through the book to find them yourself. In this case we give up and scan the entire book. SQL Server calls this a clustered index scan, if the table has a clustered index, or a table scan if there is no clustered index (i.e. it's a "heap table")

If you were simply referencing StartNum and EndNum in your query (let say you were getting a count), then i'm sure you'll see consistently low reads and execution times. But if you include other columns, and SQL Server thinks that there is too many rows that will be returned from that query (more than 5% of the table, for example), then it is just going to forget the index and scan the entire table.

There is a crossover point, where SQL Server knows the distribution of StartNum and EndNum values in your table, because it samples the values and has statistics on their distribution. If some values of @Num will happen to return few rows, and SQL Server knows this, it will perform the relativly few bookmark lookups. But if your data distribution would cause more rows to return, then you'll get the clustered index scan.

Ian Boyd 2009-06-05 15:24:57

Thanks, Ian, but it really is as simple as I said. I started with more going on so I stripped it all back to just selecting the columns from the index. There are no bookmark lookups or anything else. Just an Index Seek on my index. Still getting widely varying Reads and Duration depending on how far down the 'list' the sought number falls.

David Wimbush 2009-06-05 15:33:37

Answer 2

A:

Are you indexed on StartNum ASC, EndNum DESC?

Because I expect it's not able to do two seeks if you are indexed on StartNum ASC, EndNum ASC. It has to do a seek and a scan.

Another possibility is to index separately on StartNum and EndNum (order wouldn't matter) and see if it will use both indexes in sequence and then do some kind of join.

You can't really tell until you look at the differing execution plans.

I'm having trouble duplicating the behavior without the real columns. I set up a test table with 200000 StartNum-EndNum ranges 100-199, 200-299, etc. I've been able to get it to do a clustered index scan, table scan, and a non-clustered index seek, but nothing close to what you are talking about.

Cade Roux 2009-06-06 01:34:18

Thanks for that, Cade. The index is StartNum asc, EndNum asc and it's literally just a single index seek in the query plan. I tried asc, desc but that didn't help. I tried separate indexes and then it used a seek on the StartNum index combined with a table scan to find those <= EndNum.

David Wimbush 2009-06-08 07:17:09

Answer 3

A:

Cracked it!

EndNum is always >= StartNum so I just need to find the biggest StartNum that is <= @Num. I changed the index to StartNum desc and the query to top 1 ... where StartNum <= @Num order by StartNum desc. Now for any value of @Num it's just an index seek with 5 reads and 0 duration, which is just what I was after.

Thanks for the help, guys.

David Wimbush 2009-06-09 09:28:10

ansaurus

tags:

views:

answers:

SQL Index Seek

related questions