I have a rather large query that is needed in several stored procedures, and I'd like to move it into a UDF to make it easier to maintain (a view won't work, since the query takes a number of parameters). However, everyone I've talked to has told me that UDFs are incredibly slow.

While I don't know exactly what makes them slow, I'm willing to believe that they are. But since I'm not using this UDF within a join, only to return a table variable, I think it wouldn't be that bad.

So I guess the question is: should I avoid UDFs at all costs? Can anyone point to concrete evidence that they are slower?

A: 

Is there some reason you don't want to use a stored procedure instead of a UDF?

MasterMax1313
Mainly because you can't do that in SQL Server; you have to insert the result of a sproc into a temp table.
FlySwat
Sorry, can't do what in SQL Server? A stored procedure can call another stored procedure.
John Saunders
Can't do SELECT * FROM EXEC Sproc, like you can in Firebird. Instead you have to create a temp table and insert the results into it. I'd like to avoid that.
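For readers unfamiliar with the workaround being described, here is a minimal sketch of the INSERT ... EXEC pattern in T-SQL. The procedure name `dbo.GetBigQuery`, its parameter, and the column list are hypothetical and only stand in for the real query.

```sql
-- Hypothetical procedure, parameter, and columns, for illustration only.
-- The temp table's columns must match the procedure's result set.
CREATE TABLE #Results (Id INT, Name NVARCHAR(100));

-- Capture the procedure's result set in the temp table.
INSERT INTO #Results (Id, Name)
EXEC dbo.GetBigQuery @CustomerId = 42;

-- Refine the captured rows further, e.g. take the first page.
SELECT TOP (20) Id, Name
FROM #Results
ORDER BY Id;

DROP TABLE #Results;
```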
FlySwat
But, you don't need to do SELECT * FROM EXEC Sproc. Just do EXEC SPROC.
John Saunders
No, this query is further refined depending on the usage (results paged, etc.).
FlySwat
@Flyswat: good information you might have included to begin with ...
John Saunders
+3  A: 

Scalar UDFs are very slow. Inline UDFs are in fact macros, and as such they are very fast. A few articles:

Reuse Your Code with Table-Valued UDFs

Many nested inline UDFs are very fast

More links on slowness of scalar UDFs:

SQL Server Performance patterns of a UDF with datetime parameters

Not all UDFs are bad for performance
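To make the distinction concrete, here is a rough sketch of the same lookup written as an inline table-valued UDF and as a multi-statement one. The table `dbo.Orders`, its columns, and both function names are assumptions made up for this example.

```sql
-- Inline table-valued UDF: the body is a single RETURN (SELECT ...).
-- SQL Server can expand it into the calling query, much like a
-- parameterized view.
CREATE FUNCTION dbo.OrdersForCustomer_Inline (@CustomerId INT)
RETURNS TABLE
AS
RETURN
(
    SELECT o.OrderId, o.OrderDate, o.Total
    FROM dbo.Orders AS o          -- hypothetical table
    WHERE o.CustomerId = @CustomerId
);
GO

-- Multi-statement table-valued UDF: rows are first written into the
-- table variable @Result, which is then returned to the caller.
CREATE FUNCTION dbo.OrdersForCustomer_Multi (@CustomerId INT)
RETURNS @Result TABLE (OrderId INT, OrderDate DATETIME, Total MONEY)
AS
BEGIN
    INSERT INTO @Result (OrderId, OrderDate, Total)
    SELECT o.OrderId, o.OrderDate, o.Total
    FROM dbo.Orders AS o
    WHERE o.CustomerId = @CustomerId;

    RETURN;
END;
GO
```

Both are called the same way, for example `SELECT * FROM dbo.OrdersForCustomer_Inline(42);`. Which one is faster for a given workload is something only a benchmark on the real data can settle.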

AlexKuznetsov
I'm using a multi-statement UDF, not a single table select, due to the complexity of the query. Do you think this is still quick?
FlySwat
In my experience nested inline UDFs can reduce complexity very well. Multi-statement ones are usually (not always) somewhat slower. Only benchmarking can show how much slower in your particular case. I would try nested inline UDFs first.
AlexKuznetsov
@AlexKuznetsov: could you please post a link explaining how and why a scalar UDF is slow?
John Saunders
@ John Saunders: Links added. First, third, and fourth links explain it.
AlexKuznetsov
@AlexKuznetsov: still not sure. Could you check the "big comment" I added to my answer and let me know if you feel there's a difference between the two SELECTs?
John Saunders
The question is "Why avoid Table-Valued User Defined Functions?", not "Why avoid scalar-valued user defined functions". Maybe I missed something when reading through everybody's comments, but could someone answer the question?
An Phu
+2  A: 

Hi. As you pointed out, the results of the (table) UDF will not be joined to anything, so there should not be any impact on performance.

To try to explain a little about why UDFs can be perceived as slow (when in fact they are just used in the wrong way), consider the following example:

We have table A and Table B. Say we had a join like

SELECT A.col1, A.col2, B.ColWhatever FROM A JOIN B ON A.aid = b.fk_aid WHERE B.someCol = @param1 AND A.anotherCol = @param2

In this case, SQL Server will do its best to return the results in the most performant way it knows how. A major factor in this is reducing disk reads. So it will use the conditions in the JOIN and WHERE clauses to evaluate (hopefully with an index) how many rows to return.

Now say we move some of the conditions that restrict the amount of data returned into a UDF. The query optimizer can no longer pull back the minimum number of rows from disk; it can only work with the conditions it is given. In a nutshell, a multi-statement table UDF is evaluated in full and its result is materialized before being handed back to the calling sproc. If other criteria in the original join could have caused fewer disk reads, they are applied only after the data has been pulled into the sproc.

So say we create a UDF to select the rows from table B that match the WHERE clause. If there are 100k rows in table B and 50% of them meet the WHERE criteria, then all 50k of those rows are returned to the sproc for comparison with table A. If only 10% of those have matches in table A, we only want 5% of table B (5k rows), but we have already pulled back 50%, most of which we do not want!

If this comes across as complete gibberish, apologies. Please let me know!
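As a rough illustration of the scenario above, here is what moving the filter on B into a multi-statement UDF might look like. The function name `dbo.FilteredB`, the `dbo` schema, and the column and parameter types are assumptions; only the table and column names come from the join shown earlier.

```sql
-- Hypothetical multi-statement UDF wrapping the filter on table B.
-- Column and parameter types are assumed for the sake of the example.
CREATE FUNCTION dbo.FilteredB (@param1 INT)
RETURNS @Rows TABLE (fk_aid INT, ColWhatever INT)
AS
BEGIN
    -- Every row of B matching @param1 is materialized here, whether or
    -- not it will later find a partner in table A.
    INSERT INTO @Rows (fk_aid, ColWhatever)
    SELECT b.fk_aid, b.ColWhatever
    FROM dbo.B AS b
    WHERE b.someCol = @param1;

    RETURN;
END;
GO

DECLARE @param1 INT = 1, @param2 INT = 2;  -- placeholder values

-- The condition on A is applied only after the UDF has already
-- produced its full result set.
SELECT A.col1, A.col2, fb.ColWhatever
FROM dbo.A AS A
JOIN dbo.FilteredB(@param1) AS fb ON A.aid = fb.fk_aid
WHERE A.anotherCol = @param2;
```

Comparing the actual execution plan and logical reads of this query against the original single-statement join is one way to see whether the difference matters on real data.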

Ok, so you're saying don't use scalar UDFs in the WHERE clause of a JOIN. Any other situations where they slow things down? Also, to be precise, in this case you seem to be saying that it's not that the UDF itself is slow, but rather that, in this case, it prevents the optimizer from optimizing, which slows things down.
John Saunders
A: 

Could you post your code? Generally speaking, if you are using a scalar UDF in the SELECT clause of a query, the statements within the UDF will be executed once per row returned from the query. It would be better to join to a table-valued UDF, or to find some way to perform the UDF's logic using a join in your main SQL statement.
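A small sketch of the two shapes being contrasted here, using made-up tables `dbo.Orders` and `dbo.OrderDetails` and a made-up function name:

```sql
-- Hypothetical scalar UDF: computes one order's total.
CREATE FUNCTION dbo.OrderTotal_Scalar (@OrderId INT)
RETURNS MONEY
AS
BEGIN
    RETURN (SELECT SUM(d.Quantity * d.UnitPrice)
            FROM dbo.OrderDetails AS d
            WHERE d.OrderId = @OrderId);
END;
GO

-- Scalar UDF in the SELECT list: the function body runs once for
-- each row of dbo.Orders.
SELECT o.OrderId, dbo.OrderTotal_Scalar(o.OrderId) AS Total
FROM dbo.Orders AS o;

-- The same result expressed as a join, which the optimizer can plan
-- as a single set-based operation. LEFT JOIN keeps orders that have
-- no detail rows (Total is NULL, as with the scalar version).
SELECT o.OrderId, SUM(d.Quantity * d.UnitPrice) AS Total
FROM dbo.Orders AS o
LEFT JOIN dbo.OrderDetails AS d ON d.OrderId = o.OrderId
GROUP BY o.OrderId;
```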

jn29098
I'm not using this in a join. It's a UDF that returns a table value.
FlySwat