My system does some pretty heavy processing, and I've been attacking its performance so that I can get more test runs done in less time.
I have quite a few cases where a UDF has to be called on, say, 5 million rows (and I had pretty much assumed there was no way around it).
Well, it turns out there is a way to work around it, and it gives huge performance improvements whenever the UDF is called over a set of distinct parameter values that is significantly smaller than the total number of rows.
Consider a UDF that takes a set of inputs and returns a result based on complex logic. Across 5 million rows the inputs may have only, say, 100,000 distinct combinations, so the UDF can only ever produce 100,000 distinct result tuples (my particular cases range from interest rates to complex code assignments, but they are all discrete). The fundamental point of the technique is that you can tell whether the trick will work simply by running the SELECT DISTINCT.
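For example, a quick pair of counts (using the same illustrative table and column names as the queries below) tells you whether the distinct set is small enough to bother:
-- Total rows the UDF would be called on
SELECT COUNT(*) AS total_rows FROM big_table

-- Distinct parameter combinations the UDF actually needs to see
SELECT COUNT(*) AS distinct_combinations
FROM (
    SELECT DISTINCT param1, param2 FROM big_table
) AS d
If distinct_combinations comes back as a small fraction of total_rows, the pre-calculation below pays off.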
I found that by doing something like this:
INSERT INTO PreCalcs
SELECT param1
    ,param2
    ,dbo.udf_result(param1, param2) AS result
FROM (
    SELECT DISTINCT param1, param2 FROM big_table
) AS distinct_params
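For completeness, here is a rough sketch of what PreCalcs itself might look like; the column types are assumptions, so substitute whatever your UDF actually takes and returns:
-- Assumed shape of the lookup table (types are illustrative)
CREATE TABLE PreCalcs (
    param1 INT NULL
    ,param2 INT NULL
    ,result DECIMAL(18, 6) NULL
)

-- "Suitably indexed": a unique index on the parameter columns keeps the join cheap
CREATE UNIQUE CLUSTERED INDEX IX_PreCalcs_params ON PreCalcs (param1, param2)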
When PreCalcs is suitably indexed, the combination of that with:
SELECT big_table.param1
    ,big_table.param2
    ,PreCalcs.result
FROM big_table
INNER JOIN PreCalcs
    ON PreCalcs.param1 = big_table.param1
    AND PreCalcs.param2 = big_table.param2
You get a HUGE boost in performance. Apparently, just because a function is deterministic doesn't mean SQL Server caches the results of past calls and re-uses them, as one might expect.
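As a side note, you can check how SQL Server itself classifies the function; this is just a metadata query against the illustrative function name used above (a scalar UDF is only reported as deterministic if it was created WITH SCHEMABINDING):
-- Returns 1 if SQL Server treats the function as deterministic, 0 otherwise
SELECT OBJECTPROPERTY(OBJECT_ID('dbo.udf_result'), 'IsDeterministic') AS is_deterministic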
The only thing you have to watch out for is where NULLs are allowed in the parameters; then you need to fix up your joins carefully:
SELECT big_table.param1
    ,big_table.param2
    ,PreCalcs.result
FROM big_table
INNER JOIN PreCalcs
    ON (
        PreCalcs.param1 = big_table.param1
        OR COALESCE(PreCalcs.param1, big_table.param1) IS NULL
    )
    AND (
        PreCalcs.param2 = big_table.param2
        OR COALESCE(PreCalcs.param2, big_table.param2) IS NULL
    )
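A cheap sanity check for the NULL handling is to compare row counts: assuming PreCalcs really does hold one row per distinct parameter combination, the NULL-aware join should return exactly one row per row in big_table.
-- These two counts should match; a shortfall means rows were dropped by the join
SELECT COUNT(*) AS source_rows FROM big_table

SELECT COUNT(*) AS joined_rows
FROM big_table
INNER JOIN PreCalcs
    ON (
        PreCalcs.param1 = big_table.param1
        OR COALESCE(PreCalcs.param1, big_table.param1) IS NULL
    )
    AND (
        PreCalcs.param2 = big_table.param2
        OR COALESCE(PreCalcs.param2, big_table.param2) IS NULL
    )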
Hope this helps; any similar tricks with UDFs, or ways of refactoring queries for performance, are welcome.
I guess the question is, why is manual caching like this necessary - isn't that the point of the server knowing that the function is deterministic? And if it makes such a big difference, and if UDFs are so expensive, why doesn't the optimizer just do it in the execution plan?