views:

57

answers:

3

Hello,

I have a relatively complex query, with several self joins, which works on a rather large table. For that query to perform faster, I thus need to only work with a subset of the data. Said subset of data can range between 12 000 and 120 000 rows depending on the parameters passed.

More details can be found here: http://stackoverflow.com/questions/3054843/sql-server-cte-referred-in-self-joins-slow

As you can see, I was using a CTE to return the data subset before, which caused some performance problems as SQL Server was re-running the Select statement in the CTE for every join instead of simply being run once and reusing its data set.

The alternative, using temporary tables worked much faster (while testing the query in a separate window outside the UDF body). However, when I tried to implement this in a multi-statement UDF, I was harshly reminded by SQL Server that multi-statement UDFs do not support temporary tables for some reason...

UDFs do allow table variables however, so I tried that, but the performance is absolutely horrible as it takes 1m40 for my query to complete whereas the CTE version only took 40 seconds. I believe the table variables is slow for reasons listed in this thread: http://stackoverflow.com/questions/1643687/table-variable-poor-performance-on-insert-in-sql-server-stored-procedure

Temporary table version takes around 1 seconds, but I can't make it into a function due to the SQL Server restrictions, and I have to return a table back to the caller.

Considering that CTE and table variables are both too slow, and that temporary tables are rejected in UDFs, What are my options in order for my UDF to perform quickly?

Thanks a lot in advance.

A: 

One kludgey work-around I've used involves code like so (psuedo code follows):

CREATE TEMP TABLE #foo

EXECUTE MyStoredProcedure

SELECT *
 from #foo

GO

--  Stored procedure definition
CREATE PROCEDURE MyStoredProcedure
AS

INSERT #foo values (whatever)

RETURN
GO

In short, the stored procedure references and uses a temp table created by the calling procedure (or routine). This will work, but it can be confusing for others to follow what's going on if you don't document it clearly, and you will get recompiles, statistics recalcs, and other oddness that may consume unwanted clock cycles.

Philip Kelley
Thanks, I've tried something like this just now as well... by making my UDF into a stored procedure and then, creating another UDF and doing something like INSERT INTO @variableTable EXEC mySp(@Param1, @Param2, @Param3)in order to get a table back from the UDF... but the problem is that INSERT EXEC is not allowed inside functions.Error = "Invalid use of a side-effecting operator 'INSERT EXEC' within a function."
Kharlos Dominguez
Check out @KM's suggestion. They can be tricky to implement properly, but it may do what you need.
Philip Kelley
+1  A: 

In many such cases all we need to do is to declare primary keys for those table variables, and it is fast again.

AlexKuznetsov
Thanks a lot: you saved me. It has solved my problem. It only takes a few seconds now.Any idea why the temp tables do not require that though, and are most of the time faster?
Kharlos Dominguez