Hi folks,

I'm rewriting an old stored procedure and I've come across an unexpected performance issue when using a function instead of inline code.

The function is very simple, as follows:

-- Returns the absolute number of day boundaries between the two dates.
ALTER FUNCTION [dbo].[GetDateDifferenceInDays]
(
    @first_date  SMALLDATETIME,
    @second_date SMALLDATETIME
)
RETURNS INT
AS
BEGIN
    RETURN ABS(DATEDIFF(DAY, @first_date, @second_date))
END

So I've got two identical queries, but one uses the function and the other does the calculation in the query itself:

ABS(DATEDIFF(DAY, [mytable].first_date, [mytable].second_date))

The query with the inline code runs 3 times faster than the one using the function.

Can you please explain why?

Thanks,

Giuseppe

+5  A: 

Depending on the usage context, the query optimizer may be able to analyze the inline code and figure out a great index-using query plan, while it doesn't "inline the function" for similarly detailed analysis and so ends up with an inferior query plan when the function is involved. Look at the two query plans, side by side, and you should be able to confirm (or disprove) this hypothesis pretty easily!
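One way to make that side-by-side comparison is sketched below. It assumes the table is `dbo.mytable` with the `first_date` and `second_date` columns from the question, and it selects only the computed value; the real queries are not shown in the question, so substitute them for the two `SELECT` statements.

```sql
-- Sketch only: table name and query shape are assumptions, not the asker's real queries.
SET STATISTICS TIME ON;  -- report CPU and elapsed time for each statement
SET STATISTICS XML ON;   -- return the actual execution plan as XML alongside each result

-- Version 1: calls the scalar UDF once per row
SELECT dbo.GetDateDifferenceInDays(t.first_date, t.second_date) AS diff_days
FROM dbo.mytable AS t;

-- Version 2: same calculation written inline
SELECT ABS(DATEDIFF(DAY, t.first_date, t.second_date)) AS diff_days
FROM dbo.mytable AS t;

SET STATISTICS XML OFF;
SET STATISTICS TIME OFF;
```

Comparing the two plans and the reported CPU times should show where the versions diverge, for example whether only one of them contains parallelism operators.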

Alex Martelli
Thanks for posting. I analysed the two execution plans and they are identical, except that the one that does not use the scalar UDF has "parallelism" before executing Nested Loops (3 occurrences). I know parallelism improves execution time because it takes advantage of multiple processors, but should I assume the whole difference is caused by the lack of parallelism in the plan that uses the function?
Giuseppe R
+6  A: 

What you have is a scalar UDF (it takes 0 to n parameters and returns a scalar value). Unless they are called with constant parameters, such UDFs typically force your query to be processed row by row, with exactly the kind of performance degradation you're seeing.

See here, here and here for detailed explanations of the performance pitfalls of using UDFs.

nagul
Thank you for posting. Your last link is a good empirical analysis of this problem, but it does not explain why this behaviour occurs.
Giuseppe R
+2  A: 

Don't use a slow scalar UDF, use a fast inline one. Examples here:

Reuse Your Code with Table-Valued UDFs

Calculating third Wednesday of the month with inline UDFs

Many nested inline UDFs are very fast

The question is very common: it has been asked and answered hundreds of times before, so it has a few canned answers.
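As a rough illustration of the inline approach applied to the function in the question, here is a minimal sketch. The name `GetDateDifferenceInDaysInline`, the output column `diff_days`, and the table `dbo.mytable` are assumptions made for the example; they do not come from the linked posts.

```sql
-- Hypothetical inline table-valued version of the asker's scalar function.
-- A single RETURN (SELECT ...) with no BEGIN/END lets the optimizer expand
-- the body into the calling query, much like a view.
CREATE FUNCTION dbo.GetDateDifferenceInDaysInline
(
    @first_date  SMALLDATETIME,
    @second_date SMALLDATETIME
)
RETURNS TABLE
AS
RETURN
(
    SELECT ABS(DATEDIFF(DAY, @first_date, @second_date)) AS diff_days
);
GO

-- Usage: CROSS APPLY evaluates the function for each row of the outer table.
SELECT t.first_date, t.second_date, d.diff_days
FROM dbo.mytable AS t
CROSS APPLY dbo.GetDateDifferenceInDaysInline(t.first_date, t.second_date) AS d;
```

Because the function body is merged into the outer query, the resulting plan should resemble the one for the hand-inlined expression, which is worth confirming against the actual execution plan.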

AlexKuznetsov
Whoever downvoted, please provide the reason.
AlexKuznetsov
@Alex: Your post has also been flagged as spam, so my guess would be that someone (not me!) thinks you're spamming links to your blog rather than answering the question.
RichieHindle
If the links answer the question, it doesn't matter if they come from the poster's own blog. Let him get some Google juice. It's not like he's selling magazine subscriptions. It's a good blog; my guess is the flagger didn't even look at it.
Robert Harvey
Thanks for replying and giving links to your posts. (Sorry you were voted down, I've given you my thumbs up!) I'm going to test performance with an inline UDF. It is still not clear to me why scalar functions are so slow. I searched the net, and all the articles refer to performance problems when scalar-valued UDFs reference tables (running a further select for every row of the main query). Your post analyses a scenario similar to mine and proposes an alternative, but it does not explain why scalar functions are slow. When should scalar functions be used, then?
Giuseppe R