views:

61

answers:

4

When using a SQL join, is it possible to keep only rows that have a single row for the left table?

For example:

select * from A, B where A.id = B.a_id;

a1 b1
a2 b1
a2 b2

In this case, I want to remove all except the first row, where a single row from A matched exactly 1 row from B.

I'm using MySQL.

+4  A: 

This should work in MySQL:

select * from A, B where A.id = B.a_id GROUP BY A.id HAVING COUNT(*) = 1;

For those of you not using MySQL, you will need to use aggregate functions (like min() or max()) on all the columns (except A.id) so your database engine doesn't complain.

Jack
It doesn't work correctly if B has more than 1 matching row. You need to remove the HAVING clause and then it should be fine (although I definitely recommend always using aggregate functions for any non-GROUPed columns).
dportas
@Jack, apologies. I seem to have misread what was wanted. It looks like your query is good and mine not, but just in case it helps I'll let mine stand for now :)
dportas
A: 

It helps if you specify the keys of your tables when asking a question such as this. It isn't obvious from your example what the key of B might be (assuming it has one).

Here's a possible solution assuming that ID is a candidate key of table B.

SELECT *
FROM A, B
WHERE B.id =
 (SELECT MIN(B.id)
  FROM B
  WHERE A.id = B.a_id);
dportas
A: 

First, I would recommend using the JOIN syntax instead of the outdated syntax of separating tables by commas. Second, if A.id is the primary key of the table A, then you need only inspect table B for duplicates:

Select ...
From A
    Join B
        On B.a_id = A.id
Where Exists    (
                Select 1
                From B B2
                Where B2.a_id = A.id
                Having Count(*) = 1
                )
Thomas
A: 

This avoids the cost of counting matching rows, which can be expensive for large tables.

As usual, when comparing various possible solutions, benchmarking / comparing the execution plans is suggested.

select
  *
from
  A
  join B on A.id = B.a_id
where
  not exists (
    select
      1
    from
      B B2
    where 
      A.id = b2.a_id 
      and b2.id != b.id
  )
Donnie