ansaurus

Question

Answer 1

A:

The first thing I would do is specify the exact columns I want, and see if the problems persists. I doubt you would need all the columns from all three tables.

It sounds like it has trouble getting all the rows out of the result table. How big is a row? Look at how big all the data in the table is and divide it by the number of rows. Right click on the table -> properties..., Storage tab.

Try putting where clause into a subquery to force it to do that first?

SELECT *
FROM 
    (SELECT * FROM sample 
     WHERE sampled_date 
     BETWEEN '2010-03-17 09:00' AND '2010-03-17 12:00') s
     INNER JOIN test ON s.sample_number = test.sample_number
     INNER JOIN result ON test.test_number = result.test_number

OR this might work better if you expect a small number of samples

SELECT *
FROM sample
   INNER JOIN test ON sample.sample_number = test.sample_number
   INNER JOIN result ON test.test_number = result.test_number
WHERE sample.sample_ID in (
    SELECT sample_ID
    FROM sample
    WHERE sampled_date BETWEEN '2010-03-17 09:00' AND '2010-03-17 12:00'
)

Robert Wagner 2010-03-17 23:40:02

1. it's already covered with index. troubles in RESULT table, not in SAMPLE2. 800 vs 40 (1st and 2nd query accordingly). if i specify any field different to RESULT.TEST_NUMBER (SELECT RESULT.NAME FROM ...) for example - i get a slow query and plan with clustered index scan

zerkms 2010-03-17 23:45:56

this query get the same results (slow too). but this is slow even for 10-12 hrs interval.

zerkms 2010-03-18 00:02:27

Check the size of the row, see above. Also look to see if there is large amounts of data in the 9am-10am timeframe.

Robert Wagner 2010-03-18 00:06:13

the last query will not help just because as i said trouble is in RESULT, not in a SAMPLE

zerkms 2010-03-18 00:08:59

Answer 2

A:

If you do a SELECT *, you want all the data from the table. The data for the table is in the clustered index - the leaf nodes of the clustered index are the data pages.

So if you want all of those data pages anyway, and since you're joining 1 mio. rows to 11 mio. rows (1 out of 11 isn't very selective for SQL Server), using an index to find the rows, and then do bookmark lookups into the actual data pages for each of those rows found, might just not be very efficient, and thus SQL Server uses the clustered index scan instead.

So to make a long story short: only select those rows you really need! You thus give SQL Server a chance to use an index, do a seek there, and find the necessary data.

If you only select three, four columns, then the chances that SQL Server will find and use an index that contains those columns are just so much higher than if you ask for all the data from all the tables involved.

Another option would be to try and find a way to express a subquery, using e.g. a Common Table Expression, that would grab data from the two smaller tables, and reduce that number of rows even more, and join the hopefully quite small result against the main table. If you have a small result set of only 40 or 800 results (rather than two tables with 1 mio. rows each), then SQL Server might be more inclined to use a Clustered Index Seek and do bookmark lookups on 40 or 800 rows, rather than doing a full Clustered Index Scan.

marc_s 2010-03-18 06:16:05

"So if you want all of those data pages anyway, and since you're joining 1 mio. rows to 11 mio. rows" I'm joining about 20 rows (date range limits this) to the 11M rows with index. anyway this doesn't explains why changing range a little changes plan. why sql can use index for 20 rows and cannot to 800?

zerkms 2010-03-18 06:39:27

"So to make a long story short: only select those rows you really need! You thus give SQL Server a chance to use an index, do a seek there, and find the necessary data." So if I enumerate all 20 fields that i need from RESULT table what changes should be done with schema?

zerkms 2010-03-18 06:40:40

@zerkms: yes, the difference between selecting 20 or 800 rows can make a **huge** difference in how the query optimizer picks its plan! Maybe, up to 150 rows, doing the bookmark lookups will be faster than the clustered index scan, and then the scale tips. Also having up to date statistics can make a huge difference - try to update your statistics on the tables and indices involved as a first measure

marc_s 2010-03-18 07:19:00

yep, scan on 11M is of course faster than index seeking on 800 rows... (ironic)... as i said at UPD3 yeah - i need some statistics (or something service command else) updating. but cannot google it :-(

zerkms 2010-03-18 07:47:13

Answer 3

+1 A:

A couple things to try:

Update statistics
Add hints to the query about what index to use (in SQL Server you might say WITH (INDEX(myindex)) after specifying a table)

EDIT: You noted that copying the database made it work, which tells me that the index statistics were out of date. You can update them with something like UPDATE STATISTICS mytable on a regular basis.

Use EXEC sp_updatestats to update the whole database.

Gabe 2010-03-18 06:45:16

yep, according to UPD3 at question: some command i need in to solve this ;-)

zerkms 2010-03-18 06:51:53

yep, i will try this tomorrow when come to work. thank you. and may be any same methods to update stats on whole database?

zerkms 2010-03-18 07:48:42

ansaurus

tags:

views:

answers:

Slow query with unexpected index scan

related questions