I would like to use an outer join to find duplicates in a table.

If the table has a primary key ID (giftID here), the following outer join finds the rows with duplicate names:

mysql> select * from gifts;
+--------+------------+-----------------+---------------------+
| giftID | name       | filename        | effectiveTime       |
+--------+------------+-----------------+---------------------+
|      2 | teddy bear | bear.jpg        | 2010-04-24 04:36:03 |
|      3 | coffee     | coffee123.jpg   | 2010-04-24 05:10:43 |
|      6 | beer       | beer_glass.png  | 2010-04-24 05:18:12 |
|     10 | heart      | heart_shape.jpg | 2010-04-24 05:11:29 |
|     11 | ice tea    | icetea.jpg      | 2010-04-24 05:19:53 |
|     12 | cash       | cash.png        | 2010-04-24 05:27:44 |
|     13 | chocolate  | choco.jpg       | 2010-04-25 04:04:31 |
|     14 | coffee     | latte.jpg       | 2010-04-27 05:49:52 |
|     15 | coffee     | espresso.jpg    | 2010-04-27 06:03:03 |
+--------+------------+-----------------+---------------------+
9 rows in set (0.00 sec)

mysql> select * from gifts g1 LEFT JOIN (select * from gifts group by name) g2 
         on g1.giftID = g2.giftID where g2.giftID IS NULL;
+--------+--------+--------------+---------------------+--------+------+----------+---------------+
| giftID | name   | filename     | effectiveTime       | giftID | name | filename | effectiveTime |
+--------+--------+--------------+---------------------+--------+------+----------+---------------+
|     14 | coffee | latte.jpg    | 2010-04-27 05:49:52 |   NULL | NULL | NULL     | NULL          |
|     15 | coffee | espresso.jpg | 2010-04-27 06:03:03 |   NULL | NULL | NULL     | NULL          |
+--------+--------+--------------+---------------------+--------+------+----------+---------------+
2 rows in set (0.00 sec)

But if the table doesn't have a primary key ID, can an outer join still be used to find the duplicates?

P.S. Non-outer-join solutions are welcome as well, but if possible I'd like to know whether it can be done with an outer join when there is no primary key ID. Thanks for helping.

+2  A: 

Use EXISTS clause:

SELECT  *
FROM    gifts go
WHERE   EXISTS
        (
        SELECT  NULL
        FROM    gifts gi
        WHERE   gi.name = go.name
        LIMIT 1, 1
        )
Quassnoi
How does that give you duplicates? The way I see it, it will return all records in the gifts table.
Miky Dinescu
@Miky: note `LIMIT 1, 1` in the `EXISTS` subquery. This means "check if *second* record with this name exists"
Quassnoi
My mistake. I thought the LIMIT offset started at 1 (i.e. take only one record, starting with the first). There's your +1 :)
Miky Dinescu
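For anyone else tripped up by the offset, here is a small sketch of how `LIMIT offset, row_count` behaves against the sample gifts table from the question. The `ORDER BY giftID` is an addition for this illustration only, so that "first" and "second" are deterministic; it is not part of the answer's query.

```sql
-- LIMIT offset, row_count: the offset is zero-based.

-- Offset 0: the first 'coffee' row (giftID 3).
SELECT giftID, name FROM gifts WHERE name = 'coffee' ORDER BY giftID LIMIT 0, 1;

-- Offset 1: skips one row, so this is the second 'coffee' row (giftID 14).
SELECT giftID, name FROM gifts WHERE name = 'coffee' ORDER BY giftID LIMIT 1, 1;

-- Only one 'beer' row exists, so skipping one leaves an empty set.
-- This is the case where the EXISTS test above is false.
SELECT giftID, name FROM gifts WHERE name = 'beer' ORDER BY giftID LIMIT 1, 1;

-- Equivalent spelling of LIMIT 1, 1.
SELECT giftID, name FROM gifts WHERE name = 'coffee' ORDER BY giftID LIMIT 1 OFFSET 1;
```

So the `EXISTS` check passes only for names that have at least two rows, and the outer query returns every row carrying such a name (all three coffee rows in the sample data).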
+2  A: 

You can use grouping to find the duplicate values:

select name
from gifts
group by name
having count(*) > 1
Guffa
This will return the duplicate names, not records. To get records, you need an extra join.
Quassnoi
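A minimal sketch of the extra join mentioned in this comment, assuming the gifts table from the question. The alias `d` is just an illustrative name for the derived table of duplicated names.

```sql
-- Join every row back to the set of names that occur more than once.
SELECT  g.*
FROM    gifts g
JOIN    (
        SELECT  name
        FROM    gifts
        GROUP BY name
        HAVING  COUNT(*) > 1
        ) d
ON      d.name = g.name;
```

With the sample data this returns all three coffee rows (giftID 3, 14 and 15), including the first occurrence. It does not rely on giftID, so it also works on a table without a primary key.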
@Quassnoi: Yes, that's what I said it does. I offered this solution as the OP is not clear about whether it's the duplicate values or the duplicate records that is the goal, and this is simpler than getting the duplicate records.
Guffa
@Guffa: the OP's original query already has a `GROUP BY` subquery similar to yours. The point of the joins he used was to return the duplicate records.
Quassnoi
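As for the original question (an outer join without a primary key ID), one possible sketch is to join on the name plus some other column that tells the rows apart. The version below assumes that (name, effectiveTime) is unique in gifts, which the thread does not guarantee; `firstTime` is an illustrative alias.

```sql
-- g2 holds one "first" row per name: the earliest effectiveTime.
-- Rows in g1 that find no match in g2 are the later duplicates.
-- Assumption: no two rows share both name and effectiveTime.
SELECT  g1.*
FROM    gifts g1
LEFT JOIN
        (
        SELECT  name, MIN(effectiveTime) AS firstTime
        FROM    gifts
        GROUP BY name
        ) g2
ON      g1.name = g2.name
        AND g1.effectiveTime = g2.firstTime
WHERE   g2.name IS NULL;
```

With the sample data this returns the latte.jpg and espresso.jpg coffee rows, the same two rows as the giftID-based query in the question. If two rows tie on both name and the earliest effectiveTime, both count as "first" and neither is reported, so this is only as reliable as the uniqueness assumption.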