ansaurus

Question

Answer 1

+2 A:

To find which lastnames have duplicates:

  SELECT lastname, COUNT(lastname) AS rowcount 
    FROM table 
GROUP BY lastname 
  HAVING rowcount > 1

To delete one row from each group of duplicated last names, run the following repeatedly until it no longer affects any rows. Not very graceful.

DELETE FROM table 
 WHERE id IN (SELECT id 
                FROM (SELECT * FROM table) AS t 
            GROUP BY lastname 
              HAVING COUNT(lastname) > 1)
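A single-pass variant is possible if the goal is to keep the row with the lowest `id` for each last name. The sketch below is only an illustration: it uses Python's built-in `sqlite3` module as a stand-in for MySQL so it can be run anywhere, and the table name `names`, the sample rows, and the helper `delete_duplicate_lastnames` are assumptions, not part of the original answer. In MySQL the inner `SELECT` would likely still need the derived-table wrapper shown above, because MySQL refuses to select directly from the table being deleted from.

```python
import sqlite3


def delete_duplicate_lastnames(conn):
    """Delete every row except the lowest-id row for each lastname.

    Returns the number of rows deleted. Hypothetical helper for illustration.
    """
    cur = conn.execute(
        "DELETE FROM names "
        "WHERE id NOT IN (SELECT MIN(id) FROM names GROUP BY lastname)"
    )
    conn.commit()
    return cur.rowcount


# Throwaway in-memory database with made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (id INTEGER PRIMARY KEY, lastname TEXT)")
conn.executemany(
    "INSERT INTO names (lastname) VALUES (?)",
    [("Smith",), ("Jones",), ("Smith",), ("Smith",), ("Jones",), ("Brown",)],
)

deleted = delete_duplicate_lastnames(conn)
remaining = conn.execute("SELECT id, lastname FROM names ORDER BY id").fetchall()
print(deleted, remaining)
```

Because the statement keeps exactly one `id` per last name, a second run should find nothing left to delete.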

scompt.com 2010-06-09 13:58:32

Now write that as a delete please. :)

Josh K 2010-06-09 14:42:16

I'm tempted to downvote simply because of the crappy second query. Surely there must be a simpler way than re-running a query until it stops.

Josh K 2010-06-09 15:38:47

Answer 2

A:

dup http://stackoverflow.com/questions/18932/sql-how-can-i-remove-duplicate-rows

DELETE names
FROM names
LEFT OUTER JOIN (
   SELECT MIN(RowId) as RowId, lastname 
   FROM names
   GROUP BY lastname 
) as KeepRows ON
   names.RowId = KeepRows.RowId
WHERE
   KeepRows.RowId IS NULL

assumption: you have a RowId column

Glennular 2010-06-09 14:00:47

I have a `id` column.

Josh K 2010-06-09 14:31:28

Answer 3

A:

SELECT lastname, COUNT(*) AS mycountvar FROM names GROUP BY lastname HAVING mycountvar > 1;

and then

DELETE FROM names WHERE lastname = '$mylastnamevar' LIMIT $mycountvar-1

but: why don't you just flag the field "lastname" as unique, so that duplicates can't get in?
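The two-step approach above (find each duplicated last name and its count, then delete count minus one rows) could be driven from application code roughly as follows. This is a sketch under assumptions: it uses Python's `sqlite3` as a stand-in for MySQL, the helper name `delete_extra_rows` and the sample rows are made up, and it binds values as parameters instead of interpolating `$mylastnamevar` into the SQL string. SQLite builds usually lack `DELETE ... LIMIT`, so the limit is applied in a subquery here. In MySQL the `DELETE ... LIMIT` form shown above would be used directly.

```python
import sqlite3


def delete_extra_rows(conn):
    """For each duplicated lastname, delete all but the lowest-id row.

    Returns the total number of rows deleted. Hypothetical helper.
    """
    dupes = conn.execute(
        "SELECT lastname, COUNT(*) AS mycountvar FROM names "
        "GROUP BY lastname HAVING COUNT(*) > 1"
    ).fetchall()

    total = 0
    for lastname, count in dupes:
        # Remove the (count - 1) highest ids, which keeps the lowest one.
        cur = conn.execute(
            "DELETE FROM names WHERE id IN ("
            "  SELECT id FROM names WHERE lastname = ? "
            "  ORDER BY id DESC LIMIT ?)",
            (lastname, count - 1),
        )
        total += cur.rowcount
    conn.commit()
    return total


# Throwaway in-memory database with made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (id INTEGER PRIMARY KEY, lastname TEXT)")
conn.executemany(
    "INSERT INTO names (lastname) VALUES (?)",
    [("Smith",), ("Jones",), ("Smith",), ("Smith",), ("Jones",), ("Brown",)],
)

deleted = delete_extra_rows(conn)
remaining = conn.execute("SELECT id, lastname FROM names ORDER BY id").fetchall()
print(deleted, remaining)
```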

oezi 2010-06-09 14:01:21

Because duplicates are already in the table. I'm trying to add `lastname` as a `UNIQUE INDEX`.

Josh K 2010-06-09 14:30:25

Answer 4

+2 A:

The fastest and easiest way to delete duplicate records is by issuing a very simple command.

ALTER IGNORE TABLE [TABLENAME] ADD UNIQUE INDEX UNIQUE_INDEX ([FIELDNAME])

This will lock the table. If that is an issue, try:

DELETE t1 FROM table1 t1, table1 t2
WHERE t1.duplicate_field = t2.duplicate_field
  -- add more conditions if needed, e.g. AND t1.duplicate_field2 = t2.duplicate_field2
  AND t1.unique_field > t2.unique_field
  -- break the delete up into ranges of unique_field to run faster
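One caveat on the first command: `ALTER IGNORE TABLE` was removed in MySQL 5.7.4, so on newer servers the same effect has to be reached another way. A common substitute is to copy the rows into a new table that already carries the unique index and let the insert skip duplicate-key rows (`INSERT IGNORE ... SELECT` in MySQL). The sketch below shows the idea with Python's `sqlite3` as a stand-in, where the equivalent spelling is `INSERT OR IGNORE`. The table names `names` and `names_unique` and the sample rows are assumptions for illustration only.

```python
import sqlite3

# Throwaway in-memory database with made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (id INTEGER PRIMARY KEY, lastname TEXT)")
conn.executemany(
    "INSERT INTO names (lastname) VALUES (?)",
    [("Smith",), ("Jones",), ("Smith",), ("Brown",), ("Jones",)],
)

# Hypothetical replacement table that enforces uniqueness on lastname.
conn.execute(
    "CREATE TABLE names_unique (id INTEGER PRIMARY KEY, lastname TEXT UNIQUE)"
)

# Copy rows in id order; rows that would violate the unique constraint
# are skipped, so the lowest id for each lastname is the one kept.
conn.execute(
    "INSERT OR IGNORE INTO names_unique (id, lastname) "
    "SELECT id, lastname FROM names ORDER BY id"
)
conn.commit()

kept = conn.execute("SELECT id, lastname FROM names_unique ORDER BY id").fetchall()

# A later plain insert of an existing lastname should now be rejected.
try:
    conn.execute("INSERT INTO names_unique (lastname) VALUES ('Smith')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(kept, duplicate_rejected)
```

After the copy, the old table would be dropped and the new one renamed into its place. That step is left out here because it depends on foreign keys, triggers, and permissions in the real schema.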

Gary 2010-06-09 14:20:10

Locking the table isn't an issue. The issue is that there are already duplicate rows.

Josh K 2010-06-09 14:38:57

If locking is not an issue, then executing ALTER IGNORE TABLE [TABLENAME] ADD UNIQUE INDEX UNIQUE_INDEX ([FIELDNAME]) will rebuild the table and remove the duplicate records.

Gary 2010-06-09 15:07:28

You can't apply a constraint if the data doesn't satisfy it - your suggestion would not work.

OMG Ponies 2010-06-09 15:47:43

+1 and accepted. Locked the table temporarily and went to work. No duplicates and no more will be added.

Josh K 2010-06-09 15:58:24

OMG, it does work. The IGNORE is the key part of what you are missing.

Gary 2010-06-09 16:00:03
