ansaurus

Question

Answer 1

+5 A:

If you want to make that query really fast, you need to:

turn the LEFT JOIN into an INNER JOIN

make sure InvoiceDetail.AdjustDetailId and InvoiceDetail.InvoiceDetailId are indexed

SELECT 
  d.InvoiceDetailId, a.Fee, a.FeeTax
FROM 
  dbo.InvoiceDetail d
INNER JOIN 
  dbo.InvoiceDetail a ON a.AdjustDetailId = d.InvoiceDetailId
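
As a rough sketch of the indexing advice, the two columns could be indexed as follows. The index names are made up for illustration, and the first statement is probably unnecessary if InvoiceDetailId is already the table's primary key (an assumption; the table definition is not shown here).

```sql
-- Hypothetical index names; adjust to your own naming convention.
-- Skip this one if InvoiceDetailId is already the primary key,
-- since the key's own index covers it.
CREATE NONCLUSTERED INDEX IX_InvoiceDetail_InvoiceDetailId
    ON dbo.InvoiceDetail (InvoiceDetailId);

-- Supports the join predicate a.AdjustDetailId = d.InvoiceDetailId.
CREATE NONCLUSTERED INDEX IX_InvoiceDetail_AdjustDetailId
    ON dbo.InvoiceDetail (AdjustDetailId);
```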

Next, you need to make sure your statistics are up to date, so that the cost-based query optimizer can work properly.

To update the statistics, use the UPDATE STATISTICS (table) command; see the MSDN documentation on UPDATE STATISTICS.
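
For example, a minimal sketch against the table from the query (which option is appropriate depends on table size and your maintenance window):

```sql
-- Refresh all statistics on the table using the default sampling rate.
UPDATE STATISTICS dbo.InvoiceDetail;

-- Alternatively, read every row instead of sampling:
-- slower to run, but gives the optimizer more accurate numbers.
UPDATE STATISTICS dbo.InvoiceDetail WITH FULLSCAN;

-- Or refresh statistics for all tables in the current database.
EXEC sp_updatestats;
```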

marc_s 2009-12-22 17:13:12

The problem with that is that the inner join will remove all of the InvoiceDetail rows with no adjustment. AdjustDetailId is nullable by design. That would probably make me do an inner join with a union where AdjustDetailId is null.

Jose 2009-12-22 17:16:43

True, but an outer join is generally slower than an INNER JOIN. If you really need it, then yes, use a LEFT OUTER JOIN, but the other recommendations still apply.

marc_s 2009-12-22 17:17:35

How do you bring your statistics up to date?

Jose 2009-12-23 13:44:28

Answer 2

+2 A:

I would have guessed that they would be the same (with the same execution plan), since a predicate like a.AdjustDetailId = d.InvoiceDetailId can never be true if one side is null, so adding the IS NOT NULL condition is redundant. But maybe the query processor is executing additional unnecessary steps with that extra predicate in there.
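
A small illustration of why a null can never satisfy an equality predicate. The variable name @adjustId is hypothetical and only stands in for a null AdjustDetailId value:

```sql
-- @adjustId is declared but never assigned, so it holds NULL.
DECLARE @adjustId int;

-- Comparing NULL with = never evaluates to true,
-- so the CASE expression falls through to the ELSE branch.
SELECT CASE WHEN @adjustId = 1 THEN 'match' ELSE 'no match' END AS ComparisonResult;
```

The same rule applies to the join predicate: a row whose AdjustDetailId is null cannot match any InvoiceDetailId, with or without an explicit IS NOT NULL check.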

But what the other answer mentions is more important. Do you really need to output all the rows where there is no matching record (invoices without an adjusting invoice)? If not, change it to an inner join and it will speed up a lot.

If you really need them, however, you might try a union:

  SELECT d.InvoiceDetailId, a.Fee, a.FeeTax
  FROM InvoiceDetail d
     JOIN InvoiceDetail a
         ON a.AdjustDetailId = d.InvoiceDetailId
  UNION
  SELECT InvoiceDetailId, NULL, NULL
  FROM InvoiceDetail
  WHERE AdjustDetailId IS NULL

This is meant to do the same thing without using an outer join. (It is debatable whether two queries with a union will run faster than the single outer join query.)

Charles Bretana 2009-12-22 17:15:45

That seems to be my only alternative; I will check it out.

Jose 2009-12-22 17:41:31

UNION can be really slow because it has to remove duplicates. UNION ALL skips that step and is usually much faster.
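
A possible UNION ALL version, sketched on the assumption that the goal is to return exactly the rows a LEFT JOIN on a.AdjustDetailId = d.InvoiceDetailId would return. Because UNION ALL does not remove duplicates, the second branch here uses NOT EXISTS so that only rows with no adjustment pointing at them are added; a plain "AdjustDetailId IS NULL" filter could return a row that already appears in the first branch.

```sql
-- Rows that have at least one adjustment (one output row per adjustment).
SELECT d.InvoiceDetailId, a.Fee, a.FeeTax
FROM dbo.InvoiceDetail d
INNER JOIN dbo.InvoiceDetail a
    ON a.AdjustDetailId = d.InvoiceDetailId

UNION ALL

-- Rows that no adjustment points at: these are the ones the
-- LEFT JOIN would pad with NULLs.
SELECT d.InvoiceDetailId, NULL, NULL
FROM dbo.InvoiceDetail d
WHERE NOT EXISTS (
    SELECT 1
    FROM dbo.InvoiceDetail a
    WHERE a.AdjustDetailId = d.InvoiceDetailId
);
```

Whether this beats the single outer join still has to be measured on the real data.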

Lluis Martinez 2009-12-22 19:59:05

Answer 3

+1 A:

You only have 1 table in this query, right?

If you use

select InvoiceDetailId, Fee, FeeTax from InvoiceDetail

That WILL get all the rows, not just the adjusted ones.

Assuming you are doing a self-join, and doing it for a good reason, I would index InvoiceDetailId and AdjustDetailId and see which index(es) the execution plan uses.

You could also try to INCLUDE the Fee and FeeTax columns in your index; this will help a lot if the table is really wide.
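
Something along these lines, as a sketch (the index name is made up; INCLUDE requires SQL Server 2005 or later):

```sql
-- Covering index: AdjustDetailId is the key used by the join,
-- while Fee and FeeTax are stored at the leaf level so the query
-- can read them without going back to the base table.
CREATE NONCLUSTERED INDEX IX_InvoiceDetail_AdjustDetailId_Covering
    ON dbo.InvoiceDetail (AdjustDetailId)
    INCLUDE (Fee, FeeTax);
```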

Brad 2009-12-22 19:44:49

Answer 4

+1 A:

For your queries, I can think of 3 different reasonable execution plans:

LOOP JOIN OUTER [a.AdjustDetailId = d.InvoiceDetailId]
    TABLE SCAN InvoiceDetail d
    TABLE SCAN InvoiceDetail a

HASH JOIN OUTER [a.AdjustDetailId = d.InvoiceDetailId]
    TABLE SCAN InvoiceDetail d
    TABLE SCAN InvoiceDetail a

LOOP JOIN OUTER
    HASH JOIN OUTER [x.AdjustDetailId = d.InvoiceDetailId] AS y
        TABLE SCAN InvoiceDetail d
        INDEX SEEK [InvoiceDetail, AdjustDetailId IS NOT NULL] x
    InvoiceDetail a [a.AdjustDetailId = y.AdjustDetailId]

Perhaps adding the IS NOT NULL condition makes the optimizer choose a different one of these plans; it's hard to say.
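
One way to find out which plan is actually chosen is to ask SQL Server for the estimated plan of each variant and compare them. A sketch, assuming the query is the LEFT JOIN form of the self-join discussed in this thread and that it is run from a client that understands the GO batch separator (SSMS or sqlcmd):

```sql
-- While SHOWPLAN_TEXT is ON, statements are not executed;
-- the server returns the estimated plan as text instead.
SET SHOWPLAN_TEXT ON;
GO

SELECT d.InvoiceDetailId, a.Fee, a.FeeTax
FROM dbo.InvoiceDetail d
LEFT JOIN dbo.InvoiceDetail a
    ON a.AdjustDetailId = d.InvoiceDetailId;
GO

SET SHOWPLAN_TEXT OFF;
GO
```

Running the same batch with the extra IS NOT NULL predicate added to the ON clause shows whether the optimizer really picks a different plan.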

erikkallen 2009-12-23 14:11:57


Sql query optimization
