In this blog post, I am looking for clarification on why SQL Server would choose a particular type of scan:
Let’s assume, for simplicity’s sake, that col1 is unique and ever-increasing in value, that col2 has 1,000 distinct values, that the table contains 10,000,000 rows, that the clustered index consists of col1, and that a nonclustered index exists on col2.
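As a concrete sketch of this setup (table, column, and index names are illustrative, not taken from any particular database):

```sql
-- Hypothetical table matching the scenario above:
-- col1 is unique and ever-increasing, col2 has ~1,000 distinct values.
CREATE TABLE t (
    col1 INT NOT NULL,          -- unique, ever-increasing
    col2 INT NOT NULL,          -- ~1,000 distinct values
    payload VARCHAR(100) NULL   -- stand-in for the remaining columns
);

-- Clustered index on col1; nonclustered index on col2.
CREATE UNIQUE CLUSTERED INDEX cix_t_col1 ON t (col1);
CREATE NONCLUSTERED INDEX ix_t_col2 ON t (col2);
```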
Imagine the query execution plan created for the following initially passed parameters: @P1 = 1 and @P2 = 99.
These values would result in an optimal query plan for the following statement using the substituted parameters:
Select * from t where col1 > 1 or col2 > 99 order by col1;
Now, imagine the query execution plan if the initial parameter values were: @P1 = 6,000,000 and @P2 = 550.
As before, an optimal query plan would be created after substituting the passed parameters:
Select * from t where col1 > 6000000 or col2 > 550 order by col1;
These two identical parameterized SQL statements could produce and cache very different execution plans, depending on the initially passed parameter values. However, since SQL Server caches only one execution plan per query, chances are very high that in the first case the execution plan will use a clustered index scan because of the ‘col1 > 1’ parameter substitution, whereas in the second case a plan using an index seek would most likely be created.
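A minimal sketch of how such a parameterized statement might be submitted via sp_executesql (the table and parameter names are from the scenario above): the first call compiles and caches a plan shaped by its own parameter values, and the second call reuses that cached plan regardless of what it passes.

```sql
-- First execution: the plan is compiled for @P1 = 1, @P2 = 99
-- (likely a clustered index scan, since col1 > 1 matches nearly
-- every row) and is then cached.
EXEC sp_executesql
    N'select * from t where col1 > @P1 or col2 > @P2 order by col1;',
    N'@P1 int, @P2 int',
    @P1 = 1, @P2 = 99;

-- Second execution: the cached plan is reused, even though
-- @P1 = 6000000, @P2 = 550 would have favored index seeks.
EXEC sp_executesql
    N'select * from t where col1 > @P1 or col2 > @P2 order by col1;',
    N'@P1 int, @P2 int',
    @P1 = 6000000, @P2 = 550;
```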
Why would the first query use a clustered index scan, and the second an index seek?