Write a SQL query to find those values from column A in table X that are NOT present in column B of table Y.

Update: the query is taking too long with NOT IN (more than 5 minutes; I didn't wait for it to complete). Column B in table Y is a primary key.

Update: I'm using Oracle. Both tables have millions of rows. Of course, I'm using another WHERE clause, which means I'm comparing about 500,000 rows of table X with millions of rows in table Y.

A: 

This should work:

SELECT A FROM X WHERE A NOT IN (SELECT B FROM Y)
Ioannis Karadimas
+5  A: 

You can't rely on NOT IN if you have any NULLs in Y.B:

SELECT A FROM X WHERE NOT EXISTS (SELECT * FROM Y WHERE Y.B = X.A)

The safest way in general is NOT EXISTS. An OUTER JOIN may give more rows if, say, Y is a child of X.
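To see the NULL problem in action, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for Oracle here, and the tiny tables and values are made up for illustration. Note that Y.B is deliberately nullable in this sketch, which would not be the case if B is a primary key.

```python
import sqlite3

# In-memory SQLite database; a stand-in for Oracle, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE X (A INTEGER);
    CREATE TABLE Y (B INTEGER);  -- nullable on purpose (assumption for the demo)
    INSERT INTO X (A) VALUES (1), (2), (3);
    INSERT INTO Y (B) VALUES (1), (NULL);
""")

# NOT IN: the NULL in Y.B makes the predicate evaluate to unknown for
# every non-matching row, so those rows are filtered out.
not_in = conn.execute(
    "SELECT A FROM X WHERE A NOT IN (SELECT B FROM Y) ORDER BY A"
).fetchall()

# NOT EXISTS: only checks whether a matching row exists, so NULLs in Y.B
# do not affect the outcome.
not_exists = conn.execute(
    "SELECT A FROM X WHERE NOT EXISTS (SELECT 1 FROM Y WHERE Y.B = X.A) ORDER BY A"
).fetchall()

print(not_in)      # []
print(not_exists)  # [(2,), (3,)]
```

With a single NULL in Y.B, NOT IN returns nothing at all, while NOT EXISTS still returns the values 2 and 3.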

gbn
+2  A: 

There are 4 ways to do this that I can think of.

  • Not In (careful of NULLs)
  • OUTER JOIN and filter on NULL (may need DISTINCT added)
  • NOT EXISTS
  • EXCEPT (following the update in the question: Oracle uses the non-standard MINUS operator for this)

This last one appears to be the preferred way of doing it in Oracle though I can't vouch for that myself.
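The four options above can be compared side by side with a small sketch. It runs on SQLite through Python's sqlite3 module purely so that it is self-contained; the sample rows are invented, and on Oracle the last query would use MINUS instead of EXCEPT. It says nothing about performance on tables with millions of rows.

```python
import sqlite3

# In-memory SQLite database; a stand-in for Oracle, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE X (A INTEGER);
    CREATE TABLE Y (B INTEGER PRIMARY KEY);
    INSERT INTO X (A) VALUES (1), (2), (2), (3);  -- note the duplicate 2
    INSERT INTO Y (B) VALUES (1);
""")

queries = {
    "NOT IN": "SELECT A FROM X WHERE A NOT IN (SELECT B FROM Y) ORDER BY A",
    "OUTER JOIN": (
        "SELECT X.A FROM X LEFT OUTER JOIN Y ON Y.B = X.A "
        "WHERE Y.B IS NULL ORDER BY X.A"
    ),
    "NOT EXISTS": (
        "SELECT A FROM X WHERE NOT EXISTS "
        "(SELECT 1 FROM Y WHERE Y.B = X.A) ORDER BY A"
    ),
    # Oracle spells this set operator MINUS rather than EXCEPT.
    "EXCEPT": "SELECT A FROM X EXCEPT SELECT B FROM Y ORDER BY 1",
}

# Run each variant and collect the A values it returns.
results = {
    name: [row[0] for row in conn.execute(sql)]
    for name, sql in queries.items()
}

for name, values in results.items():
    print(name, values)
# NOT IN [2, 2, 3]
# OUTER JOIN [2, 2, 3]
# NOT EXISTS [2, 2, 3]
# EXCEPT [2, 3]
```

In this sample the first three forms keep the duplicate value from X, while the set operator returns each value once, so add DISTINCT to the others if you want distinct values only.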

Martin Smith
Worth noting that you may need DISTINCT with the OUTER JOIN...?
gbn
Note: EXCEPT only exists in SQL Server 2005 upwards.
Ardman
@Ardman EXCEPT is standard SQL, though. The OP hasn't specified any particular flavour. A quick Google says it is supported in Sybase http://dcx.sybase.com/1100en/dbusage_en11/set-operations-sorting.html and PostgreSQL http://www.postgresql.org/files/documentation/books/aw_pgsql/node80.html
Martin Smith