I have a user defined function (e.g. myUDF(a,b)) that returns an integer.

I am trying to ensure this function will be called only once and its results can be used as a condition in the WHERE clause:

SELECT col1, col2, col3, 
       myUDF(col1,col2) AS X
From myTable
WHERE x>0

SQL Server tries to resolve x as a column and fails, because it is really an alias for a computed value.

How can you re-write this query so that the filtering can be done on the computed value without having to execute the UDF more than once?

+4  A: 

try

SELECT col1, col2, col3, dbo.myUDF(col1,col2) AS X 
From myTable 
WHERE dbo.myUDF(col1,col2) >0

Be aware that this will cause a scan, since a predicate on a function result is not SARGable.

Here is another way, using a derived table:

SELECT *
FROM (
    SELECT col1, col2, col3, dbo.myUDF(col1,col2) AS X
    FROM myTable
) AS y
WHERE x > 0
SQLMenace
Yes, but this function is very time-consuming and I am trying to avoid calling it twice.
Gzim
Well, the first case really ought to execute it only once, and the second one, the derived table way (or the common table expression in Baaju's answer), ought definitely to compute it only once.
Rup
Just because you typed in the function twice does not mean it will be called twice. SQL Server will likely recognize that it's the same function in both places and only evaluate it once.
Gabe
+1  A: 

SQL Server does not allow you to reference a column alias in the WHERE clause of the same SELECT, because WHERE is evaluated before the select list. You either have to write out the expression twice:

SELECT  col1, col2, col3, dbo.myUDF(col1,col2) AS X 
FROM    myTable 
WHERE   dbo.myUDF(col1,col2) > 0

Or use a subquery:

SELECT  *
FROM    (
        SELECT col1, col2, col3, dbo.myUDF(col1,col2) AS X 
        FROM myTable 
        ) as subq
WHERE   x > 0
Andomar
+1  A: 

If you are using SQL Server 2005 or later, you can use Cross Apply:

Select T.col1, T.col2, FuncResult.X
From myTable As T
    Cross Apply ( Select dbo.myUDF(T.col1, T.col2) As X ) As FuncResult
Where FuncResult.X > 0
Thomas
What exactly is Cross Apply?
Gzim
@Gzim - Think of Cross Apply as a mix between a derived table and a correlated subquery. In effect, it lets you join to a subquery that references columns from tables listed earlier in the From clause. In my example, the subquery references col1 and col2 from the table T. Here's an article on the subject: http://www.sqlteam.com/article/using-cross-apply-in-sql-server-2005
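
To illustrate the table-valued case, here is a minimal sketch that is not from the thread. The function dbo.myTableUDF is hypothetical, and col1 and col2 are assumed to be int columns of myTable. Cross Apply runs the function once for each row of T and joins the returned rows back to that row.

```sql
-- Hypothetical inline table-valued function, for illustration only.
CREATE FUNCTION dbo.myTableUDF (@a int, @b int)
RETURNS TABLE
AS
RETURN (SELECT @a + @b AS Total, @a * @b AS Product);
GO

-- The function's parameters come from the outer table T.
SELECT T.col1, T.col2, F.Total, F.Product
FROM dbo.myTable AS T
    CROSS APPLY dbo.myTableUDF(T.col1, T.col2) AS F
WHERE F.Total > 0;
```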
Thomas
Oh boy, this is so powerful, especially in complicated cases where you insert into and select from a table at the same time, and also select from a table UDF that accepts parameters from the selected table. Thanks, Thomas.
Gzim
A: 

I'm not 100% sure what you are doing, but since x isn't a column I would remove it from the WHERE clause of your SQL statement, so you have:

SELECT col1, col2, col3, dbo.myUDF(col1,col2) AS X FROM myTable

Then apply the condition in your application code, so that you only use the rows where x > 0.

Kyra
+6  A: 
WITH Tbl AS 
(SELECT col1, col2, col3, dbo.myUDF(col1,col2) AS X  
        FROM myTable  )

SELECT * FROM Tbl WHERE X > 0
Baaju
Excellent answer for SQL Server 2005+ !
p.campbell
+1  A: 

Depending on the UDF and how useful or frequently used it is, you may consider adding it to the table as a computed column. You could then filter on the column as normal and not have to write out the function at all in queries.
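
A minimal sketch of that idea, under some assumptions: the function lives in the dbo schema, it is deterministic, and it was created WITH SCHEMABINDING, which SQL Server requires before a computed column based on a UDF can be marked PERSISTED. The column name X is taken from the question.

```sql
-- Assumption: dbo.myUDF is deterministic and schema-bound.
-- PERSISTED stores the result in the row, so it is calculated on insert/update.
ALTER TABLE dbo.myTable
    ADD X AS dbo.myUDF(col1, col2) PERSISTED;
GO

-- The query filters on the stored column; the function does not appear in it.
SELECT col1, col2, col3, X
FROM dbo.myTable
WHERE X > 0;
```

Without PERSISTED the column is still valid, but its value is computed when it is read, so it would not save any calls to the function.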

K Richard
A: 

Your question is best answered by the "WITH" clause (a CTE, I think, in SQL Server).

Really, the better question is: should I store this computed value, or recalculate it for every row, each and every time I query the table?

Are there 10 rows in the table and always 10 rows?

Are rows being added constantly?

Do you have a purge strategy in place or just let it grow?

Do you query that table only once a month?

If this is a "long running" function (even after you've optimized the hell out of it), why do you want to execute it more than once, ever?

You asked for once, but you are really asking for once per row, per query.

Storing the answer in an index or "virtual column"

Pros: Calculated exactly once per row. Query times don't grow linearly.

Cons: Increases insert/update time.

Calculating every time

Pros: Insert/update time is optimized.

Cons: Query time grows with row count (not scalable).
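
As a rough sketch of the "store it" option in SQL Server, assuming dbo.myUDF is deterministic and schema-bound (both are required before its result can be persisted and indexed): the index name IX_myTable_X is made up for illustration, and the column name X is borrowed from the question.

```sql
-- Returns 1 when SQL Server considers the function deterministic.
SELECT OBJECTPROPERTY(OBJECT_ID('dbo.myUDF'), 'IsDeterministic') AS IsDeterministic;

-- Pay the cost once per row, at insert/update time.
ALTER TABLE dbo.myTable
    ADD X AS dbo.myUDF(col1, col2) PERSISTED;
GO

-- Hypothetical index name; with it, a filter on X can seek instead of scanning.
CREATE NONCLUSTERED INDEX IX_myTable_X
    ON dbo.myTable (X);
GO

SELECT col1, col2, col3, X
FROM dbo.myTable
WHERE X > 0;
```

Whether the optimizer actually uses the index depends on how selective X > 0 is, so it is worth checking the execution plan rather than assuming.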

If you're querying once a month, why do you care how bad the performance is? Go tune something that actually has a big impact on your operations (very slightly facetious).

If you're not inserting a bunch of rows per second (how many counts as a bunch depends on your hardware), is spending that time up front going to make a big difference?

Stephanie Page