views:

339

answers:

3

SQL Server 2008. I have this very large query which has a high cost associated with it. TOAD has Query Tuning functionality in it and the only change made was the following:

Before:

LEFT OUTER JOIN (SELECT RIN_EXT.rejected, 
                               RIN_EXT.scar, 
                               RIN.fcreceiver, 
                               RIN.fcitemno 
                          FROM RCINSP_EXT RIN_EXT 
                               INNER JOIN dbo.rcinsp RIN 
                                  ON RIN_EXT.fkey_id = RIN.identity_column) RIN1 
         ON RCI.freceiver = RIN1.fcreceiver 
            AND RCI.fitemno = RIN1.fcitemno 
 WHERE RED.[YEAR] = '2009'

After:

LEFT OUTER JOIN (SELECT RIN_EXT.rejected, 
                               RIN_EXT.scar, 
                               RIN.fcreceiver, 
                               RIN.fcitemno 
                          FROM dbo.rcinsp RIN 
                               INNER JOIN RCINSP_EXT RIN_EXT 
                                  ON RIN.identity_column = COALESCE (RIN_EXT.fkey_id , RIN_EXT.fkey_id)) RIN1 
         ON RCI.freceiver = RIN1.fcreceiver 
            AND RCI.fitemno >= RIN1.fcitemno  -- ***** RIGHT HERE 
            AND RCI.fitemno <= RIN1.fcitemno
 WHERE RED.[YEAR] = '2009'

The field is a char(3) field and this is SQL Server 2008.

Any idea why theirs is so much faster than mine?

+2  A: 

Looks like an ascending index search argument thing - since it added a >= - We're not seeing enough about the rest of your query, but obviously there is further data about RCI.fitemno which it was able to deduce from the rest of your query.

It's odd that this:

        AND RCI.fitemno >= RIN1.fcitemno  -- ***** RIGHT HERE 
        AND RCI.fitemno <= RIN1.fcitemno

was not turned into this:

        AND RCI.fitemno = RIN1.fcitemno

Since they are equivalent.

Cade Roux
+2  A: 

You didn't show the ON condition in the "Before" query, so I don't know what TOAD changed. However, I'll take a guess about what happened.

The SQL Server query optimizer uses cost estimates to choose the query plan. The cost estimates are based on rowcount estimates. If the rowcount estimates are not accurate, the optimizer might not choose the best plan.

Some rowcount estimates are typically accurate, like those of the form (column = value) for a column with statistics. However, some rowcount estimates can only be guessed at, like (column = othercolumn) if the columns aren't related by a foreign key constraint, or (expression = value), where the expression isn't trivial or involves more than one column.

When statistics don't guide a rowcount estimate, SQL Server uses generic estimates. If you compare the rowcount estimates in an estimated plan to the actual rowcounts in an actual plan, you can sometimes see the problem and "trick" the optimizer into changing its rowcount estimate.

If you add predicates with AND that don't actually restrict the results, you may lower the rowcount estimate if the optimizer can't recognize that they are superfluous. Similarly, if you add predicates with OR that don't actually yield additional results, you may raise the rowcount estimate.

Perhaps here the rowcount estimate was too high, and the extra predicates are correcting them, resulting in better cost estimates for the plans being considered and a better plan choice in the end.

Steve Kass
+2  A: 

Adding larger-than and smaller-than in a query is an old trick which sometimes nudges the query optimizer to use an index on that column. So this trick:

AND RCI.fitemno >= RIN1.fcitemno
AND RCI.fitemno <= RIN1.fcitemno

forces the database to use indexes on RIN1 and RCI fitemno columns if present. I'm not sure if temporary indexes get created on the fly when you do this.

I used to do these tricks with a DB2 database, and they worked nicely.

Rolf