tags:

views:

181

answers:

3

It's easy to find duplicates with one field

SELECT name, 
 COUNT(email) 
FROM users
GROUP BY email
HAVING ( COUNT(email) > 1 )

So if we have a table

ID   NAME   EMAIL
1    John   [email protected]
2    Sam    [email protected]
3    Tom    [email protected]
4    Bob    [email protected]
5    Tom    [email protected]

This query will give us John, Sam, Tom, Tom because they all have the same e-mails.

But what I want, is to get duplicates with the same e-mails and names.

I want to get Tom, Tom.

I made a mistake, and allowed to insert duplicate name and e-mail values. Now I need to remove/change the duplicates. But I need to find them first.

+8  A: 
SELECT
    name, email, COUNT(*)
FROM
    users
GROUP BY
    name, email
HAVING 
    COUNT(*) > 1

Simply group on the name/email combination?

gbn
Thank you!So easy
Alex
+4  A: 

Try this:

SELECT name, email
FROM users
GROUP BY name, email
HAVING ( COUNT(*) > 1 )
Chris Pebble
+2  A: 

try this:

declare @YourTable table (id int, name varchar(10), email varchar(50))

INSERT @YourTable VALUES (1,'John','John-email')
INSERT @YourTable VALUES (2,'John','John-email')
INSERT @YourTable VALUES (3,'fred','John-email')
INSERT @YourTable VALUES (4,'fred','fred-email')
INSERT @YourTable VALUES (5,'sam','sam-email')
INSERT @YourTable VALUES (6,'sam','sam-email')

SELECT
    name,email, COUNT(*) AS CountOf
    FROM @YourTable
    GROUP BY name,email
    HAVING COUNT(*)>1

OUTPUT:

name       email       CountOf
---------- ----------- -----------
John       John-email  2
sam        sam-email   2

(2 row(s) affected)

if you want the IDs of the dups use this:

SELECT
    y.id,y.name,y.email
    FROM @YourTable y
        INNER JOIN (SELECT
                        name,email, COUNT(*) AS CountOf
                        FROM @YourTable
                        GROUP BY name,email
                        HAVING COUNT(*)>1
                    ) dt ON y.name=dt.name and y.email=dt.email

OUTPUT:

id          name       email
----------- ---------- ------------
1           John       John-email
2           John       John-email
5           sam        sam-email
6           sam        sam-email

(4 row(s) affected)

to delete the duplicates try:

DELETE d
    FROM @YourTable d
        INNER JOIN (SELECT
                        y.id,y.name,y.email,ROW_NUMBER() OVER(PARTITION BY y.name,y.email ORDER BY y.name,y.email,y.id) AS RowRank
                        FROM @YourTable y
                            INNER JOIN (SELECT
                                            name,email, COUNT(*) AS CountOf
                                            FROM @YourTable
                                            GROUP BY name,email
                                            HAVING COUNT(*)>1
                                        ) dt ON y.name=dt.name and y.email=dt.email
                   ) dt2 ON d.id=dt2.id
        WHERE dt2.RowRank!=1
select * FROM @YourTable

OUTPUT:

id          name       email
----------- ---------- --------------
1           John       John-email
3           fred       John-email
4           fred       fred-email
5           sam        sam-email

(4 row(s) affected)
KM