views:

103

answers:

3

hi, I have two tables with following definition

    TableA                                          TableB
ID1   ID2   ID3  Value1 Value                      ID1 Value1
 C1    P1   S1                                     S1
 C1    P1   S2                                     S2
 C1    P1   S3                                     S3
 C1    P1   S5                                     S4
                                                   S5

The values are just examples in the table. TableA has a clustered primary key ID1, ID2 & ID3 and TableB has p.k. ID1 I need to create a table that has the missing records in TableA based on TableB The select query I am trying to create should give the following output

C1  P1  S4

To do this, I have the following SQL query

    SELECT
    DISTINCT TableA.ID1, TableA.ID2, TableB.ID1
    FROM TableA a, TableB b 
    WHERE TableB.ID1 NOT IN 
    ( 
      SELECT DISTINCT [ID3] 
      FROM TableA aa
      WHERE a.ID1 == aa.ID1 
        AND a.ID2 == aa.ID2 
    )

Though this query works, it performs poorly and my final TableA may have upto 1M records. is there a way to rewrite this more efficiently.

Thanks for any help, Javid

A: 

Assuming TableA is a joiner table, you'd need to generate the list of possible values with a cross join that could kill you anyway...

select TableC.ID1 as C, TableP.ID1 as P, TableB.ID1 as B
from TableC 
    cross join TableP
    cross join TableB

(This is a big assumption that these are relatively small tables)

Then you can join on this query like so

select ID1, ID2, ID3
from TableA A
    left outer join 
        (select TableC.ID1 as C, TableP.ID1 as P, TableB.ID1 as B
         from TableC 
             cross join TableP
             cross join TableB) X on A.ID1 = X.C and A.ID2 = X.P and A.ID3 = X.B
where X.ID3 is null
Austin Salonen
+1  A: 

Try this:

SELECT T1.ID1, T1.ID2, T2.ID1 FROM (
    SELECT ID1, ID2
    FROM TableA
    GROUP BY ID1, ID2
) T1
CROSS JOIN TableB T2
LEFT JOIN TableA T3
ON T1.ID1 = T3.ID1 AND T1.ID2 = T3.ID2 AND T2.ID1 = T3.ID3
WHERE T3.ID1 IS NULL
Mark Byers
A: 

Thanks to everyone for sharing your thoughts. I removed the DISTINCT keyword from the INNER select query and was able to get some performance. Also, I modified the query more and got some more performance. I thought would share my solution with everyone

Select c.*
FROM TAbleA a
RIGHT JOIN
( SELECT DISTINCT ID1, ID2, b.ID1 
  FROM TableA, 
  ( SELECT ID1 FROM TableB ) b 
) c
ON a.[ID1] = c.[ID1] AND a.[ID2] = c.[ID2] AND a.[ID3] = c.[ID3]
WHERE a.[ID3] IS NULL