I have a table with a varchar column, and I would like to find all the records that have duplicate values in this column. What is the best query I can use to find the duplicates?

+2  A: 

Do a SELECT with a GROUP BY clause. Let's say name is the column you want to find duplicates in:

SELECT name, COUNT(*) c FROM table GROUP BY name WHERE c > 1;

This will return a result with the name value in the first column, and a count of how many times that value appears in the second.

levik
Perfect - thanks.
Jon Tackabury
You should get an error if you write a query in which the WHERE clause follows the GROUP BY clause; a filter on the count belongs in a HAVING clause.
HorusKol
+2  A: 
SELECT varchar_col
FROM table
GROUP BY varchar_col
HAVING count(*) > 1;
maxyfc
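
To see the GROUP BY / HAVING approach run end to end, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The table name `people`, the column `name`, and the sample rows are assumptions made up for illustration; the count column is added only to show how many times each value occurs.

```python
import sqlite3

# In-memory database with a hypothetical table and sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name VARCHAR(50))")
conn.executemany(
    "INSERT INTO people (name) VALUES (?)",
    [("alice",), ("bob",), ("alice",), ("carol",), ("bob",), ("alice",)],
)

# Group by the column and keep only the groups with more than one row.
duplicates = conn.execute(
    """
    SELECT name, COUNT(*) AS c
    FROM people
    GROUP BY name
    HAVING COUNT(*) > 1
    ORDER BY name
    """
).fetchall()

print(duplicates)  # [('alice', 3), ('bob', 2)]
```

Values that occur only once (`carol` here) are filtered out by the HAVING clause.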
A: 

SELECT ColumnA, COUNT( * ) FROM Table GROUP BY ColumnA HAVING COUNT( * ) > 1

Scott Ferguson
A: 

Assuming your table is named TableABC, the column you want to check is Col, and the primary key of TableABC is Key:

SELECT a.Key, b.Key, a.Col 
FROM TableABC a, TableABC b
WHERE a.Col = b.Col 
AND a.Key <> b.Key

The advantage of this approach over the answers above is that it returns the keys of the duplicate rows.

StartClass0830
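
One thing to be aware of with a self-join on `a.Key <> b.Key` is that every duplicate pair comes back twice, once as (a, b) and once as (b, a). A possible variant, sketched below with Python's sqlite3 module, compares the keys with `<` so each pair is listed once. The sample rows are made up, and `"Key"` is quoted only because KEY is a keyword in some SQL dialects.

```python
import sqlite3

# In-memory database using the names from the answer above; the data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE TableABC ("Key" INTEGER PRIMARY KEY, Col VARCHAR(50))')
conn.executemany(
    'INSERT INTO TableABC ("Key", Col) VALUES (?, ?)',
    [(1, "x"), (2, "y"), (3, "x"), (4, "x"), (5, "z")],
)

# Self-join on the column; a."Key" < b."Key" keeps one row per duplicate pair.
pairs = conn.execute(
    """
    SELECT a."Key", b."Key", a.Col
    FROM TableABC a
    JOIN TableABC b
      ON a.Col = b.Col
     AND a."Key" < b."Key"
    ORDER BY a."Key", b."Key"
    """
).fetchall()

print(pairs)  # [(1, 3, 'x'), (1, 4, 'x'), (3, 4, 'x')]
```

Note that a value appearing n times still produces n(n-1)/2 pairs, so this form is most useful when duplicates are rare.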
+1  A: 
SELECT  *
FROM    mytable mto
WHERE   EXISTS
        (
        SELECT  1
        FROM    mytable mti
        WHERE   mti.varchar_column = mto.varchar_column
        LIMIT 1, 1
        )

This query returns complete records, not just the distinct varchar_column values.

This query doesn't use COUNT(*). If there are many duplicates, COUNT(*) is expensive, and you don't need the full count; you only need to know whether there are two rows with the same value.

Having an index on varchar_column will, of course, speed up this query greatly.

Quassnoi
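
The `LIMIT 1, 1` form above is MySQL syntax. As a rough sketch of the same "stop at the first other match" idea in a more portable shape, the version below replaces the LIMIT with a key comparison inside the EXISTS subquery. It assumes the table has a primary key called `id` (a hypothetical name, not from the answer) and runs on SQLite through Python's sqlite3 module.

```python
import sqlite3

# In-memory database; the id column and the sample rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER PRIMARY KEY, varchar_column VARCHAR(50))"
)
conn.executemany(
    "INSERT INTO mytable (id, varchar_column) VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "a"), (4, "c")],
)

# Keep a row if some other row (different id) has the same value.
rows = conn.execute(
    """
    SELECT *
    FROM mytable mto
    WHERE EXISTS (
        SELECT 1
        FROM mytable mti
        WHERE mti.varchar_column = mto.varchar_column
          AND mti.id <> mto.id
    )
    ORDER BY mto.id
    """
).fetchall()

print(rows)  # [(1, 'a'), (3, 'a')]
```

As with the original query, complete rows are returned and no count is computed for any group.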