views:

70

answers:

3

I have a table, let's call it History. The primary key (aka Clustered Index) is called HIST_ID. The table has some 2300 rows in the development DB. Now consider the following two queries:

Query 1:

declare @x int
set @x = 14289

select * from History where hist_id=@x

Query 2:

declare @x int
set @x = 14289

select * from History where hist_id=@x or @x is null

The only difference is the or @x is null at the end. However the first query does an index seek, the second - index scan. What gives?

Pre-emptive reply - no, option(recompile) doesn't help.

Added: I'd like some solid argumented facts, not guesses. I can guess a dozen possible reasons for this myself. But what is the real problem here?

A: 

I guess the optimizer determines it is beneficial. The alternative would be to use the same plan as if you had written

select * from History where hist_id=@x
union all
select * from History where @x is null

You can rewrite the query in this way, but I'm pretty sure the optimizer is capable of doing this by itself. How many null values do you have?

Edit: Turns out I misread the question and thought you wanted WHERE (@x = hist_id OR hist_id is null). In fact you want a dynamic criterion. Check out this article. Note that this your kind of query was supposed to work in SQL2k8 if you specify WITH(RECOMPILE), but due to a nasty bug this support was removed.

erikkallen
Ahem, look carefully! The "is null" is done against @x, not the table! It's a constant FALSE expression!
Vilx-
The second SELECT statement in his union will **always** return an empty set. The value of @x is 14289, not null.
Vilx-
The support was added back after Cumulative Update 5 was released. If you get the latest update then it works as advertised.
Chris Lively
+1  A: 

I would suggest that the plan is being produced separate to the parameter being passed in / used, so in essence there is a requirement (depending on the value of @x) to return every row. As such the query plan is dealing with it's worst case scenario of the parameters it can receive.

e.g. If the input for @x was null, then the query would be forced to return every row since every row would satisfy a literal equation / predicate that always returned true. For the query plan to cover every value of @x, it must generate a plan that performs a scan.

Andrew
I've read that MSSQL can do something called "parameter sniffing" which should fix this. Option(recompile) in fact should do just that - but it doesn't.
Vilx-
I'm sure there's a more technical description of this concept that uses the term SARG, but this version is a bit more intelligible.
Philip Kelley
The parameter sniffing feature is trying to turn literal values in a query into parameters, so there are more query plan cache hits. You are already passing in a parameter, not a literal, so sniffing it not the issue / thus a recompile is not going to alter it.
Andrew
Sometimes execution plans can differ immensely based on the supplied parameter values. Like in this case. The optimizer should know this and take the values into account. In the worst case I can specify option(recompile), which, according to the documentation "uses the current values of any local variables in the query".
Vilx-
It does not matter to recompile, eliminate the parameter so there is no sniffing, no recompile, use the literal statement, select * from history where hist_id = 1 or 1 is null - this gives you a scan. The recompile is immaterial, when it has all the information handed to it, the query plan is not short circuiting the logic for you, as much as we know it should be able to, the engine is just not going to take that step.The real 'problem' you are looking for / at is logic short circuits and SQL is not as advanced at them as we might all prefer.
Andrew
A: 

Of course it's an index scan.

A clustered index scan = table scan because you have no sensible predicate for "@x IS NULL".

The parameterised, cached plan does is general and will work for @x = NULL or @x = value. If you don't define @x, you should get the same plan.

If you coded "12345 IS NULL" then this is detected and ignored.

I can't find an blog article on how constants are treated in query plans. The gist is that they are generalised and short circuits don't happen to allow plan reuse.

gbn