views:

76

answers:

4

We're seeing strange behavior when running two versions of a query on SQL Server 2005:

version A:

SELECT otherattributes.* FROM listcontacts JOIN otherattributes
ON listcontacts.contactId = otherattributes.contactId WHERE listcontacts.listid = 1234
ORDER BY name ASC

version B:

DECLARE @Id AS INT;
SET @Id = 1234;
SELECT otherattributes.* FROM listcontacts JOIN otherattributes
ON listcontacts.contactId = otherattributes.contactId  
WHERE listcontacts.listid = @Id
ORDER BY name ASC

Both queries return 1000 rows; version A takes on average 15s; version B on average takes 4s. Could anyone help us understand the difference in execution times of these two versions of SQL?

If we invoke this query via named parameters using NHibernate, we see the following query via SQL Server profiler:

EXEC sp_executesql  N'SELECT otherattributes.* FROM listcontacts JOIN otherattributes ON listcontacts.contactId = otherattributes.contactId WHERE listcontacts.listid = @id ORDER BY name ASC',
     N'@id INT',
     @id=1234;

...and this tends to perform as badly as version A.

+2  A: 

Try take a look at the execution plan for your query. This should give you some more explanation on how your query is executed.

Snake
A: 

It's possible that instead of casting 1234 to be the same type as listcontacts.listid and then doing the comparison with each row, it might be casting the value in each row to be the same as 1234. The first requires just one cast, the second needs a cast per row (and that's probably on far more than 1000 rows, it may be for every row in the table). I'm not sure what type that constant will be interpreted as but it may be 'numeric' rather than 'int'.

If this is the cause, the second version is faster because it's forcing 1234 to be interpreted as an int and thus removing the need to cast the value in every row.

However, as the previous poster suggests, the query plan shown in SQL Server Management Studio may indicate an alternative explanation.

Daniel Renshaw
A: 

The best way to see what is happening is to compare the execution plans, everything else is speculation based on the limited details presented in the question.

To see the execution plan, go into SQL Server Management Studio and run SET SHOWPLAN_XML ON then run query version A, the query will not run but the execution plan will be displayed in XML. Then run query version B and see its execution plan. If you still can't tell the difference or solve the problem, post both execution plans and someone here will explain it.

KM
+1  A: 

I've not seen the execution plans, but I strongly suspect that they are different in these two cases. The issue that you are having is that in case A (the faster query) the optimiser knows the value that you are using for the list id (1234) and using a combination of the distribution statistics and the indexes chooses an optimal plan.

In the second case, the optimiser is not able to sniff the value of the ID and so produces a plan that would be acceptable for any passed in list id. And where I say acceptable I do not mean optimal.

So what can you do to improve the scenario? There are a couple of alternatives here:

1) Create a stored procedure to perform the query as below:

CREATE PROCEDURE Foo @Id INT AS SELECT otherattributes.* FROM listcontacts JOIN otherattributes ON listcontacts.contactId = otherattributes.contactId WHERE listcontacts.listid = @Id ORDER BY name ASC

GO

This will allow the optimiser to sniff the value of the input parameter when passed in and produce an appropriate execution plan for the first execution. Unfortunately it will cache that plan for reuse later so unless the you generally call the sproc with similarly selective values this may not help you too much

2) Create a stored procedure as above, but specify it to be WITH RECOMPILE. This will ensure that the stored procedure is recompiled each time it is executed and hence produce a new plan optimised for this input value

3) Add OPTION (RECOMPILE) to the end of the SQL Statement. Forces recompilation of this statement, and is able to optimise for the input value

4) Add OPTION (OPTIMIZE FOR (@Id = 1234)) to the end of the SQL statement. This will cause the plan that gets cached to be optimised for this specific input value. Great if this is a highly common value, or most common values are similarly selective, but not so great if the distribution of selectivity is more widely spread.

Paul McLoughlin