views:

438

answers:

3

I have a table-valued, inline UDF. I want to filter the results of that UDF to get one particular value. When I specify the filter using a constant parameter, everything is great and performance is almost instantaneous. When I specify the filter using a variable parameter, it takes a significantly larger chunk of time, on the order of 500x more logical reads and 20x greater duration.

The execution plan shows that in the variable parameter case the filter is not applied until very late in the process, causing multiple index scans rather than the seeks that are performed in the constant case.

I guess my questions are: Why, since I'm specifying a single filter parameter that is going to be highly selective against an indexed field, does my performance go into the weeds when that parameter is in a variable? Is there anything I can do about this?

Does it have something to do with the analytic function in the query?

Here are my queries:

CREATE FUNCTION fn_test()
RETURNS TABLE
WITH SCHEMABINDING
AS
    RETURN
    SELECT DISTINCT GCN_SEQNO, Drug_package_version_ID
    FROM
    (
     SELECT COALESCE(ndctbla.GCN_SEQNO, ndctblb.GCN_SEQNO) AS GCN_SEQNO,
      dpv.Drug_package_version_ID, ROW_NUMBER() OVER (PARTITION BY dpv.Drug_package_version_id ORDER BY 
       ndctbla.GCN_SEQNO DESC) AS Predicate
     FROM dbo.Drug_Package_Version dpv
      LEFT JOIN dbo.NDC ndctbla ON ndctbla.NDC = dpv.Sp_package_code
      LEFT JOIN dbo.NDC ndctblb ON ndctblb.SPC_NDC = dpv.Sp_package_code
    ) iq
    WHERE Predicate = 1
GO

GRANT SELECT ON fn_test TO public
GO

-- very fast
SELECT GCN_SEQNO
FROM dbo.fn_test()
WHERE Drug_package_version_id = 10000

GO

-- comparatively slow
DECLARE @dpvid int
SET @dpvid = 10000
SELECT GCN_SEQNO
FROM dbo.fn_test()
WHERE Drug_package_version_id = @dpvid
+1  A: 

Once you create a new projection through a UDF, it can't be expected that your indexes will still apply on the columns that are indexed on the original table and included in the projection. When you filter on the projection (and not in the UDF against the original table with the indexes) the indexes no longer apply.

What you want to do is parameterize the function to take in the parameter.

If you find that you have too many fields that you want to set parameters on, then you might want to take a look at indexed views, as you can create your projection and index it as well and then run queries against that.

casperOne
A couple points -- if my indexes weren't applying, wouldn't that cause poor performance in both the constant and the variable case? Also, the indexed view is a good idea but the ranking function rules that out as the query stands currently.
mwigdahl
@mwigdahl: I've update my post to indicate why the indexes won't apply when you filter on the projection (opposed to directly on the table). Also, can you create a UDF which uses the ranking function, then call that from the indexed view? Then create the query which filters on the indexed view?
casperOne
+1  A: 

Simply, the constant is easy to evaluate in the plan. The local variable is not. Especially with the ranking function and filter Predicate = 1

Paraphrasing casparOne, you need to push the filter as far inwards as possible so that you filter on dpv.Drug_package_version_id inside the iq derived table.

If you do that, then you also have no need for the PARTITION BY because you have only a single dpv.Drug_package_version_id. Then you can do a cleaner ...TOP 1 ... ORDER BY ndctbla.GCN_SEQNO DESC.

gbn
A: 

The responses I got were good, and I learned from them, but I think I've found an answer that satisfies me.

I do think it's the use of the PARTITION BY clause that is causing the problem here. I reformulated the UDF using a variant of the self-join idiom:

SELECT t1.A, t1.B, t1.C
FROM T t1
    INNER JOIN
    (
        SELECT A, MAX(C) AS C
        FROM T
        GROUP BY A
    ) t2 ON t1.A = t2.A AND t1.C = t2.C

Ironically, this is more performant than using the SQL 2008-specific query, and also the optimizer doesn't have a problem with joining this version of the query using variables rather than constants. At this point, I'm concluding that the optimizer just doesn't handle the more recent SQL extensions as well as the older stuff. As a bonus, I can make use of the UDF now, in my pre-upgraded SQL 2000 platforms.

Thanks for your help, everyone!

mwigdahl
There's no reason to think that new features to the query language will result in better performance. They've spent a lot more time tuning the classical expressions.
le dorfier