tags:

views:

54

answers:

3

For fun I been playing with the built-in Optimizer for Oracle in Toad today. One of the optimizations it suggests is the following

AND emp.pay_type = NVL('FT', UID)

Instead of

AND emp.pay_type = 'FT'

I'm confused on what is happening here and because of it, confused on also on why this would improve performance. Since FT is a string literal within the SQL query and therefore never NULL, why would this make any difference? I'm guessing it has something to do with the existing Index on the field, but am unable to find anything in the Oracle docs.

+3  A: 

That is weird advice. The NVL function works like this:

NVL(exp1, val1)

If 'exp1' is not null, it is returned; otherwise 'val1' is returned.

Since 'FT' in the example cannot be NULL, there is no benefit to using the NVL function, and a small performance penalty (at least for the optimizer to work out that NVL is redundant; maybe an execution penalty if the optimizer does not work out that NVL is redundant).

If the condition read:

AND emp.pay_type = NVL("FT", UID)

then there might be a benefit; here we have a delimited identifier (column name enclosed in double quotes) and the column value could perhaps be NULL; the NVL call ensures that NULL is returned only if "FT" is NULL and UID is NULL. UID is, of course, a regular identifier.

It might make sense if the condition read:

AND emp.pay_type = NVL(UID, 'FT')

Now if the UID value is NULL, then a default value 'FT' is used as the corresponding pay_type.

Jonathan Leffler
+1: The use of [NVL in this context (besides the fact of never triggering the latter portion) is bad practice because it makes the query non-sargable](http://stackoverflow.com/questions/799584/what-makes-a-sql-statement-sargable)
OMG Ponies
+1  A: 

What does Oracle's Explain Plan tell you about the two queries?

DCookie
+2  A: 

I'd take toad's "optimization suggestions" with huge grains of salt. I call them the "shotgun" method of optimization - shoot a whole lot of different bits at the target and see which one of them hits, somehow :)

Anyway, the difference between

AND emp.pay_type = NVL('FT', UID)

and

AND emp.pay_type = 'FT'

Is that in the second case, the optimizer may use the histogram statistics on the column (if they've been gathered) to obtain a more accurate estimate of the number of matching rows. When the NVL is used, however, the optimizer will (I believe) not detect that it can ignore the NVL, and will therefore not check the histogram for the value.

This is not an optimization method I would generally use. There are better ways to control the execution path of a query (hints, for instance). In particular, this method would fail if they improved the CBO to look for and optimize redundant code like NVL([literal value],[anything]) to [literal value].

Jeffrey Kemp