views:

297

answers:

5

So I have this weird problem with an SQL Server stored procedure. Basically I have this long and complex procedure. Something like this:

SELECT Table1.col1, Table2.col2, col3
FROM Table1 INNER JOIN Table2
     Table2 INNER JOIN Table3
     -----------------------
     -----------------------
     (Lots more joins)
WHERE Table1.Col1 = dbo.fnGetSomeID() AND (More checks)
     -----------------------
     -----------------------
(6-7 More queries like this with the same check)

The problem is that check in the WHERE clause at the end Table1.Col1 = dbo.fnGetSomeID(). The function dbo.fnGetSomeID() returns a simple integer value 1. So when I hardcode the value 1 where the function call should be the SP takes only about 15 seconds. BUT when I replace it with that function call in the WHERE clause it takes around 3.5 minutes.

So I do this:

DECLARE @SomeValue INT
SET @SomeValue = dbo.fnGetSomeID()
--Where clause changed
WHERE Table1.Col1 = @SomeValue

So now the function is only called once. But still the same 3.5 minutes. So I go ahead and do this:

DECLARE @SomeValue INT
--Removed the function, replaced it with 1
SET @SomeValue = 1
--Where clause changed
WHERE Table1.Col1 = @SomeValue

And still it takes 3.5 minutes. Why the performance impact? And how to make it go away?

A: 

Another thing to try. Instead of loading the id into a variable, load it into a table

if object_id('myTable') is not null drop myTable
select dbo.fnGetSomeID() as myID into myTable

and then use

WHERE Table1.Col1 = (select myID from myTable)

in your query.

cindi
+1  A: 

As is mentioned elsewhere, there will be execution plan differences depending on which approach you take. I'd look at both execution plans to see if there's an obvious answer there.

This question described a similar problem, and the answer in that case turned out to involve connection settings.

I've also run into almost the exact same problem as this myself, and what I found out in that case was that using the newer constructs (analytic functions in SQL 2008) was apparently confusing the optimizer. This may not be the case for you, as you're using SQL 2005, but something similar might be going on depending on the rest of your query.

One other thing to look at is whether you have a biased distribution of values for Table1.Col1 -- if the optimizer is using a general execution plan when you use the function or the variable rather than the constant, that might lead it to choose suboptimal joins than when it can clearly see that the value is one specific constant.

If all else fails, and this query is not inside another UDF, you can precalculate the fnGetSomeID() UDF's value like you were doing, then wrap the whole query in dynamic SQL, providing the value as a constant in the SQL string. That should give you the faster performance, at the cost of recompiling the query every time (which should be a good trade in this case).

mwigdahl
+2  A: 

Even with @SomeValue set at 1, when you have

WHERE Table1.Col1 = @SomeValue

SQL Server probably still views @SomeValue as a variable, not as a hardcoded 1, and that would affect the query plan accordingly. And since Table1 is linked to Table2, and Table2 is linked to Table3, etc., the amount of time to run the query is magnified. On the other hand, when you have

WHERE Table1.Col1 = 1

The query plan gets locked in with Table1.Col1 at a constant value of 1. Just because we see

WHERE Table1.Col1 = @SomeValue

as 'hardcoding', doesn't mean SQL sees it the same way. Every possible cartesian product is a candidate and @SomeValue needs to be evaluated for each. So, the standard recommendations apply - check your execution plan, rewrite the query if needed.

Also, are those join columns indexed?

Frank Rustyak
Yes they're indexed.
Daud
A: 

You could try the OPTIMIZE FOR hint to force a plan for a given constant, but it may have inconsistent results; in 2008 you can use OPTIMIZE FOR UNKNOWN

SqlACID
A: 

I think that since the optimizer has no idea how much work the function does, it tries to evaluate them last.

I would try storing the return value of the function in a variable ahead of time, and using that in your where clause.

Also, you might want to try schema binding your function, because apparently sometimes it seriously affects peformance.

You can make your function schema bound like so:

create function fnGetSomeID()
with schema_binding
returns int
... etc.
John Gibb