views:

486

answers:

3
+4  Q: 

self join query

Consider the following table:

mysql> select * from phone_numbers;
+-------------+------+-----------+
| number      | type | person_id |
+-------------+------+-----------+
| 17182225465 | home |         1 |
| 19172225465 | cell |         1 |
| 12129876543 | home |         2 |
| 13049876543 | cell |         2 |
| 15064223454 | home |         3 |
| 15064223454 | cell |         3 |
| 18724356798 | home |         4 |
| 19174335465 | cell |         5 |
+-------------+------+-----------+

I'm trying to find those people who have home phones but not cells.

This query works:

mysql> select h.*
    -> from phone_numbers h
    -> left join phone_numbers c
    -> on h.person_id = c.person_id
    -> and c.type = 'cell'
    -> where h.type = 'home'
    -> and c.number is null;
+-------------+------+-----------+
| number      | type | person_id |
+-------------+------+-----------+
| 18724356798 | home |         4 |
+-------------+------+-----------+

but this one doesn't:

mysql> select h.*
    -> from phone_numbers h
    -> left join phone_numbers c
    -> on h.person_id = c.person_id
    -> and h.type = 'home'
    -> and c.type = 'cell'
    -> where c.number is null;
+-------------+------+-----------+
| number      | type | person_id |
+-------------+------+-----------+
| 19172225465 | cell |         1 |
| 13049876543 | cell |         2 |
| 15064223454 | cell |         3 |
| 18724356798 | home |         4 |
| 19174335465 | cell |         5 |
+-------------+------+-----------+

The only difference between the two is the location of the h.type = 'home' condition - in the first it's in the where clause and in the second it's part of the on clause.

Why doesn't the second query return the same result as the first?

A: 

I don't know if this will fix things or not, but...

The statements starting with "and" should be part of the WHERE clause, not part of the ON clause. The ON clause should only have statements involving which columns are used to join the tables.

R. Bemrose
you can add adicional conditions on the join. This will work as a where before de join takes place
Sergio
It's bad style to do so, and it causes things to be executed in the wrong order, as pointed out in Tony Andrews's answer.
R. Bemrose
+7  A: 

In the second SQL, the condition h.type = 'home' is part of the outer join conditions, and is not a filter on the results. For all records where h.type='cell', the condition h.type = 'home' is FALSE and so no "matching" c row is found - so c.number is null, which is your only filtering (WHERE) condition.

In pseudo-code your 2nd SQL works like this:

for each row in phone_numbers h /* Note this is ALL home AND cell phones */
   select c.number from phone_numbers c
   where h.person_id = c.person_id
   and h.type = 'home'
   and c.type = 'cell';
   if c.number is null (i.e. no row found)
     display h.*
   end if
end loop;
Tony Andrews
Great explanation. Thanks.
MCS
+2  A: 

When doing left joins I approach things this way. In the join you need to specify anny fields that actually link the two tables together and any filtering condition from the right side (2nd table in the join) of the join (with one exception, I'll get to shortly). Filtering conditions from the left side of the join(1st table) should be in the where clause or they will wrongly affect the join as you saw (and as Tony so nicely explained). The only time the right side of the join should be in the where clause is if you are looking for null values in that table (i.e., the records which are in the first table but not the second).

HLGEM