tags:

views:

713

answers:

6

What is the best way to determine duplicate records in a SQL Server table?

For instance, I want to find the last duplicate email received in a table (table has primary key, receiveddate and email fields).

Sample data:

1 01/01/2008 [email protected]

2 02/01/2008 [email protected]

3 01/12/2008 [email protected]

A: 

Couldn't you join the list on the e-mail field and then see what nulls you get in your result?

Or better yet, count the instances of each e-mail address? And only return the ones with count > 1

Or even take the email and id fields. And return the entries where the e-mail is the same, and the IDs are different. (To avoid duplicates don't use != but rather either < or >.)

Rob Rolnick
+8  A: 

something like this

select email ,max(receiveddate) as MaxDate
from YourTable
group by email 
having count(email) > 1
SQLMenace
A: 

select * from table a, table b where a.email = b.email

Michael Sharek
A: 
SELECT [id], [receivedate], [email]
FROM [mytable]
WHERE [email] IN ( SELECT [email]
    FROM [myTable]
    GROUP BY [email]
    HAVING COUNT([email]) > 1 )
palehorse
A: 

Do you want a list of the last items? If so you could use:

SELECT [info] FROM [table] t WHERE NOT EXISTS (SELECT * FROM [table] tCheck WHERE t.date > tCheck.date)

If you want a list of all duplicate email address use GROUP BY to collect similar data, then a HAVING clause to make sure the quantity is more than 1:

SELECT [info] FROM [table] GROUP BY [email] HAVING Count(*) > 1 DESC

If you want the last duplicate e-mail (a single result) you simply add a "TOP 1" and "ORDER BY":

SELECT TOP 1 [info] FROM [table] GROUP BY [email] HAVING Count(*) > 1 ORDER BY Date DESC
enigmatic
A: 

If you have surrogate key, it is relatively easy to use the group by syntax mentioned in SQLMenance's post. Essentially, group by all the fields that make two or more rows 'the same'.

Example pseudo-code to delete duplicate records.

Create table people (ID(PK), Name, Address, DOB)

Delete from people where id not in (
Select min(ID) from people group by name, address, dob
)
Brian