Hi,

I've got the following two SQL tables (in MySQL):

Phone_book
+----+------+--------------+
| id | name | phone_number |
+----+------+--------------+
| 1  | John | 111111111111 |
+----+------+--------------+
| 2  | Jane | 222222222222 |
+----+------+--------------+

Call
+----+------+--------------+
| id | date | phone_number |
+----+------+--------------+
| 1  | 0945 | 111111111111 |
+----+------+--------------+
| 2  | 0950 | 222222222222 |
+----+------+--------------+
| 3  | 1045 | 333333333333 |
+----+------+--------------+

How do I find out which calls were made by people whose phone_number is not in the Phone_book? The desired output would be:

Call
+----+------+--------------+
| id | date | phone_number |
+----+------+--------------+
| 3  | 1045 | 333333333333 |
+----+------+--------------+

Any help would be much appreciated.

+5  A: 

There are several ways of doing this, with varying efficiency, depending on how good your query optimiser is and on the relative sizes of your two tables:

This is the shortest statement, and may be quickest if your phone book is very short:

SELECT  *
FROM    Call
WHERE   phone_number NOT IN (SELECT phone_number FROM Phone_book)

alternatively (thanks to Alterlife)

SELECT *
FROM   Call
WHERE  NOT EXISTS
  (SELECT *
   FROM   Phone_book
   WHERE  Phone_book.phone_number = Call.phone_number)

or (thanks to Kieran)

SELECT *
FROM   Call
LEFT OUTER JOIN Phone_book
  ON (Call.phone_number = Phone_book.phone_number)
WHERE  Phone_book.phone_number IS NULL

(ignoring that, as others have said, it's normally best to select just the columns you want, not '*')
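
As a quick sanity check, the three queries above can be run against the sample data from the question. This is only a minimal sketch: it assumes Python's built-in sqlite3 module with an in-memory SQLite database standing in for MySQL, and the column types are guesses. The results should agree on this data, but query plans and timings will differ between engines.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Phone_book (id INTEGER PRIMARY KEY, name TEXT, phone_number TEXT);
    CREATE TABLE Call (id INTEGER PRIMARY KEY, date TEXT, phone_number TEXT);
    INSERT INTO Phone_book VALUES (1, 'John', '111111111111'),
                                  (2, 'Jane', '222222222222');
    INSERT INTO Call VALUES (1, '0945', '111111111111'),
                            (2, '0950', '222222222222'),
                            (3, '1045', '333333333333');
""")

# The three approaches from this answer, selecting only the Call columns.
queries = {
    "NOT IN": """
        SELECT Call.id, Call.date, Call.phone_number
        FROM   Call
        WHERE  phone_number NOT IN (SELECT phone_number FROM Phone_book)
    """,
    "NOT EXISTS": """
        SELECT Call.id, Call.date, Call.phone_number
        FROM   Call
        WHERE  NOT EXISTS
          (SELECT 'x'
           FROM   Phone_book
           WHERE  Phone_book.phone_number = Call.phone_number)
    """,
    "LEFT JOIN": """
        SELECT Call.id, Call.date, Call.phone_number
        FROM   Call
        LEFT OUTER JOIN Phone_book
          ON Call.phone_number = Phone_book.phone_number
        WHERE  Phone_book.phone_number IS NULL
    """,
}

results = {name: conn.execute(sql).fetchall() for name, sql in queries.items()}
for name, rows in results.items():
    print(name, rows)  # each prints [(3, '1045', '333333333333')]
```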

Alnitak
avoid IN, use EXISTS - the hint is in the question title
annakata
The left outer join is probably quickest in the general case as it prevents repeated execution of the subquery.
WOPR
Not to be picky, but the subquery in my suggestion uses "SELECT 'x'", not "SELECT *"
Alterlife
yes - MySQL manual suggests that this is normal for an 'EXISTS' query
Alnitak
+5  A: 

The code below may be a bit more efficient than the other answers presented when dealing with larger datasets:

SELECT *
FROM   Call
WHERE  NOT EXISTS
  (SELECT 'x'
   FROM   Phone_book
   WHERE  Phone_book.phone_number = Call.phone_number)
Alterlife
As always, it's worth profiling the performance of the queries against your target dataset to choose the one with the best performance. SQL optimisers are good enough these days that the performance results are often surprising.
Greg Hewgill
This definitely is a more optimized query +1
Dheer
EXISTS is considerably better performing than IN +1
annakata
NOT EXISTS <> EXISTS
WOPR
+6  A: 
SELECT Call.id, Call.date, Call.phone_number
FROM   Call
LEFT OUTER JOIN Phone_book
  ON (Call.phone_number = Phone_book.phone_number)
WHERE  Phone_book.phone_number IS NULL

This removes the subquery, allowing the query optimiser to work its magic.

Also, avoid "SELECT *" because it can break your code if someone alters the underlying tables or views (and it's inefficient).
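
To illustrate the point about "SELECT *": on a join it returns the columns of both tables, so the shape of the result depends on every table involved. The sketch below assumes Python's built-in sqlite3 module and an in-memory SQLite database standing in for MySQL; the schema is a guess based on the tables in the question.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Phone_book (id INTEGER PRIMARY KEY, name TEXT, phone_number TEXT);
    CREATE TABLE Call (id INTEGER PRIMARY KEY, date TEXT, phone_number TEXT);
    INSERT INTO Phone_book VALUES (1, 'John', '111111111111'),
                                  (2, 'Jane', '222222222222');
    INSERT INTO Call VALUES (1, '0945', '111111111111'),
                            (2, '0950', '222222222222'),
                            (3, '1045', '333333333333');
""")

# Shared FROM/JOIN/WHERE part of the anti-join.
join = """
    FROM   Call
    LEFT OUTER JOIN Phone_book
      ON Call.phone_number = Phone_book.phone_number
    WHERE  Phone_book.phone_number IS NULL
"""

# SELECT * drags in the Phone_book columns too (all NULL for unmatched calls).
star_cursor = conn.execute("SELECT *" + join)
star_columns = [col[0] for col in star_cursor.description]
star_rows = star_cursor.fetchall()

# An explicit column list returns only what the caller asked for.
explicit_cursor = conn.execute("SELECT Call.id, Call.date, Call.phone_number" + join)
explicit_columns = [col[0] for col in explicit_cursor.description]
explicit_rows = explicit_cursor.fetchall()

print(len(star_columns), star_rows)          # 6 columns, three of them NULL
print(len(explicit_columns), explicit_rows)  # 3 columns, matching the desired output
```

Adding a column to either table would change the width of the first result but not the second, which is why code that reads columns by position tends to break with "SELECT *".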

WOPR
This is generally the most efficient method as it does not perform multiple passes on the second table ... hope some people are reading the comments.
Nerdfest
I would rather hope that people profile: unless you're a top SQL performance guru, telling in advance what will be the fastest is quite difficult (and depends on the DBMS engine you use).
bortzmeyer
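
The profiling suggested in the comments can be sketched roughly as follows. Everything here is an assumption made for illustration: SQLite (via Python's built-in sqlite3 module) stands in for MySQL, the data is synthetic, and an index on Phone_book.phone_number is assumed to exist. The numbers it prints say nothing about MySQL; the point is only the method of timing each variant against your own data.

```python
import sqlite3
import timeit

# Assumption: in-memory SQLite with synthetic data, not the real MySQL tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Phone_book (id INTEGER PRIMARY KEY, name TEXT, phone_number TEXT);
    CREATE TABLE Call (id INTEGER PRIMARY KEY, date TEXT, phone_number TEXT);
    CREATE INDEX idx_phone_book_number ON Phone_book (phone_number);  -- assumed index
""")


def call_number(i):
    # Every fifth call comes from a number that is not in the phone book.
    return f"{900000 + i:012d}" if i % 5 == 0 else f"{i % 500:012d}"


conn.executemany(
    "INSERT INTO Phone_book (name, phone_number) VALUES (?, ?)",
    [(f"person{i}", f"{i:012d}") for i in range(500)],
)
conn.executemany(
    "INSERT INTO Call (date, phone_number) VALUES (?, ?)",
    [("0945", call_number(i)) for i in range(2000)],
)

queries = {
    "NOT IN": """
        SELECT Call.id FROM Call
        WHERE  phone_number NOT IN (SELECT phone_number FROM Phone_book)
    """,
    "NOT EXISTS": """
        SELECT Call.id FROM Call
        WHERE  NOT EXISTS
          (SELECT 'x' FROM Phone_book
           WHERE  Phone_book.phone_number = Call.phone_number)
    """,
    "LEFT JOIN": """
        SELECT Call.id FROM Call
        LEFT OUTER JOIN Phone_book
          ON Call.phone_number = Phone_book.phone_number
        WHERE  Phone_book.phone_number IS NULL
    """,
}

# Check that the variants agree before comparing their speed.
counts = {name: len(conn.execute(sql).fetchall()) for name, sql in queries.items()}

# Time 20 executions of each variant.
timings = {
    name: timeit.timeit(lambda sql=sql: conn.execute(sql).fetchall(), number=20)
    for name, sql in queries.items()
}

for name in sorted(timings, key=timings.get):
    print(f"{name:10s} rows={counts[name]} time={timings[name]:.4f}s")
```

On MySQL itself, prefixing each query with EXPLAIN shows the plan the optimiser chose, which is usually more informative than timings alone.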