My system does some pretty heavy processing, and I've been attacking its performance so that I can get more test runs done in less time.
I have quite a few cases where a UDF has to be called on, say, 5 million rows (and I had pretty much assumed there was no way around it).
Well, it turns out there is a way to work around it, and it gives huge performance improvements whenever the UDF is called over a set of distinct parameter values that is significantly smaller than the total number of rows.
Consider a UDF that takes a set of inputs and returns a result based on complex logic. Across 5 million rows the inputs may have only, say, 100,000 distinct combinations, so the UDF can only ever produce 100,000 distinct result tuples (my particular cases range from interest rates to complex code assignments, but they are all discrete). The fundamental point of the technique is that you can tell whether the trick will work simply by running the SELECT DISTINCT.
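For example, a quick pair of counts (using the same illustrative table and column names as the queries below) tells you whether the distinct set is small enough to bother:
-- Total rows the UDF would be called on
SELECT COUNT(*) AS total_rows FROM big_table

-- Distinct parameter combinations the UDF actually needs to see
SELECT COUNT(*) AS distinct_combinations
FROM (
    SELECT DISTINCT param1, param2 FROM big_table
) AS d
If distinct_combinations comes back as a small fraction of total_rows, the pre-calculation below pays off.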
I found that by doing something like this:
INSERT INTO PreCalcs
SELECT param1
    ,param2
    ,dbo.udf_result(param1, param2) AS result
FROM (
    SELECT DISTINCT param1, param2 FROM big_table
) AS distinct_params
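For completeness, here is a rough sketch of what PreCalcs itself might look like; the column types are assumptions, so substitute whatever your UDF actually takes and returns:
-- Assumed shape of the lookup table (types are illustrative)
CREATE TABLE PreCalcs (
    param1 INT NULL
    ,param2 INT NULL
    ,result DECIMAL(18, 6) NULL
)

-- "Suitably indexed": a unique index on the parameter columns keeps the join cheap
CREATE UNIQUE CLUSTERED INDEX IX_PreCalcs_params ON PreCalcs (param1, param2)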
When PreCalcs is suitably indexed, the combination of that with:
SELECT big_table.param1
    ,big_table.param2
    ,PreCalcs.result
FROM big_table
INNER JOIN PreCalcs
    ON PreCalcs.param1 = big_table.param1
    AND PreCalcs.param2 = big_table.param2
You get a HUGE boost in performance. Apparently, just because a function is deterministic doesn't mean SQL Server caches the results of past calls and re-uses them, as one might expect.
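As a side note, you can check how SQL Server itself classifies the function; this is just a metadata query against the illustrative function name used above (a scalar UDF is only reported as deterministic if it was created WITH SCHEMABINDING):
-- Returns 1 if SQL Server treats the function as deterministic, 0 otherwise
SELECT OBJECTPROPERTY(OBJECT_ID('dbo.udf_result'), 'IsDeterministic') AS is_deterministic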
The only thing you have to watch out for is where NULLs are allowed in the parameters; then you need to fix up your joins carefully:
SELECT big_table.param1
    ,big_table.param2
    ,PreCalcs.result
FROM big_table
INNER JOIN PreCalcs
    ON (
        PreCalcs.param1 = big_table.param1
        OR COALESCE(PreCalcs.param1, big_table.param1) IS NULL
    )
    AND (
        PreCalcs.param2 = big_table.param2
        OR COALESCE(PreCalcs.param2, big_table.param2) IS NULL
    )
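A cheap sanity check for the NULL handling is to compare row counts: assuming PreCalcs really does hold one row per distinct parameter combination, the NULL-aware join should return exactly one row per row in big_table.
-- These two counts should match; a shortfall means rows were dropped by the join
SELECT COUNT(*) AS source_rows FROM big_table

SELECT COUNT(*) AS joined_rows
FROM big_table
INNER JOIN PreCalcs
    ON (
        PreCalcs.param1 = big_table.param1
        OR COALESCE(PreCalcs.param1, big_table.param1) IS NULL
    )
    AND (
        PreCalcs.param2 = big_table.param2
        OR COALESCE(PreCalcs.param2, big_table.param2) IS NULL
    )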
Hope this helps; any similar tricks with UDFs, or ways of refactoring queries for performance, are welcome.
I guess the question is, why is manual caching like this necessary - isn't that the point of the server knowing that the function is deterministic? And if it makes such a big difference, and if UDFs are so expensive, why doesn't the optimizer just do it in the execution plan?