views: 303
answers: 4

I have a script that accidentally inserted each row into the database twice, with identical data. Is there a good way to delete the duplicate rows (without scanning through the table, loading every record into an array, and then deleting the duplicate array entries)?

+10  A: 
DELETE
FROM t
WHERE ID IN (
    SELECT MAX(ID)
    FROM t
    GROUP BY {Your Group Criteria Here}
    HAVING COUNT(*) > 1
)
Cade Roux
I tried: DELETE FROM Resources WHERE Id IN (SELECT MAX(Id) FROM Resources GROUP BY Filename HAVING COUNT(*) > 1) — MySQL said: #1093 - You can't specify target table 'Resources' for update in FROM clause
Issac Kelly
I think this would work if I referenced the table twice, as in Bill's answer.
Issac Kelly
Sorry it doesn't work on MySQL - I'm pretty sure this one's ANSI.
Cade Roux
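The error in the comment above is MySQL refusing to delete from a table that the same statement also reads in a subquery. A common workaround is to wrap the subquery in a derived table so the id list is materialized first. Here is a minimal sketch, run against SQLite (via Python's stdlib sqlite3) purely to illustrate the query shape — the Resources/Filename names come from the comment thread, and the sample data is made up. Note that selecting MAX(Id) removes only one duplicate per group per pass, so with three or more copies of a row the statement has to be rerun.

```python
import sqlite3

# Hypothetical Resources table mirroring the comment thread above;
# SQLite is used here only to demonstrate the query shape.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Resources (Id INTEGER PRIMARY KEY, Filename TEXT)")
conn.executemany("INSERT INTO Resources (Filename) VALUES (?)",
                 [("a.txt",), ("a.txt",), ("b.txt",), ("b.txt",), ("c.txt",)])

# MySQL rejects a subquery that reads the table being deleted from
# (error 1093); the usual workaround is a derived table, which forces
# the list of ids to be computed before the delete runs.
conn.execute("""
    DELETE FROM Resources WHERE Id IN (
        SELECT Id FROM (
            SELECT MAX(Id) AS Id
            FROM Resources
            GROUP BY Filename
            HAVING COUNT(*) > 1
        ) AS dupes
    )
""")

rows = conn.execute(
    "SELECT Filename, COUNT(*) FROM Resources GROUP BY Filename ORDER BY Filename"
).fetchall()
print(rows)  # each Filename is down to a single row
```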
+2  A: 

MySQL supports multi-table DELETE which is really cool and can help here. You can do a self-join on the equality of all columns except the id, and then delete the matching row with the greater id.

DELETE t2
 FROM mytable t1 JOIN mytable t2
  USING (column1, column2, column3) -- this is an equi-join
WHERE t1.id < t2.id;
Bill Karwin
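The multi-table DELETE above is MySQL-specific, so a runnable sketch in SQLite (via Python's stdlib sqlite3) has to express the same idea differently: keep the row with the smallest id in each duplicate group by deleting everything NOT IN a grouped MIN(id). The mytable/column1..3 names are the placeholders from the answer; the data is invented. Unlike the grouped-MAX approach in the first answer, a single pass handles any number of copies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE mytable (
    id INTEGER PRIMARY KEY, column1 TEXT, column2 TEXT, column3 TEXT)""")
conn.executemany(
    "INSERT INTO mytable (column1, column2, column3) VALUES (?, ?, ?)",
    [("a", "b", "c")] * 3 + [("x", "y", "z")])

# SQLite has no multi-table DELETE, so keep the lowest id per duplicate
# group explicitly -- the same rows Bill's self-join delete would keep.
conn.execute("""
    DELETE FROM mytable WHERE id NOT IN (
        SELECT MIN(id) FROM mytable GROUP BY column1, column2, column3
    )
""")

remaining = conn.execute(
    "SELECT id, column1 FROM mytable ORDER BY id").fetchall()
print(remaining)  # [(1, 'a'), (4, 'x')]
```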
+1  A: 

Or the old simple way; I'd be surprised if it isn't the fastest, especially compared with matching against a GROUP BY aggregate.

DELETE FROM mytable m1
WHERE EXISTS
    ( SELECT 1
        FROM mytable m2              -- second reference to the same table
       WHERE m2.fields = m1.fields   -- "fields" = the columns that define a duplicate
         AND m2.id < m1.id )

le dorfier
Doesn't this assume that duplicate records always come in pairs?
cLFlaVA
No, it works fine if there are more than two copies, or fewer than two (i.e. one). The record for which no record has a lower id is kept.
le dorfier
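A runnable sketch of this EXISTS form in SQLite (via Python's stdlib sqlite3), with the placeholder fields collapsed to a single hypothetical val column and made-up data. It shows what the comment above claims: with three copies of a row, only the one with the minimum id survives, and a row with no duplicate is untouched. (MySQL would raise the same #1093 error quoted earlier for this shape, since the subquery reads the delete target.)

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, val TEXT)")
conn.executemany("INSERT INTO mytable (val) VALUES (?)",
                 [("dup",)] * 3 + [("single",)])

# Delete any row for which a lower-id row with the same values exists;
# the sole survivor in each group is the row with the minimum id.
conn.execute("""
    DELETE FROM mytable
    WHERE EXISTS (
        SELECT 1 FROM mytable m2
        WHERE m2.val = mytable.val AND m2.id < mytable.id
    )
""")

survivors = conn.execute(
    "SELECT id, val FROM mytable ORDER BY id").fetchall()
print(survivors)  # [(1, 'dup'), (4, 'single')]
```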
A: 

If nothing references into the table by key right now, I'd dump it with mysqldump --complete-insert, strip the primary keys, change the table definition to enforce a unique key that would catch your duplicates, change the INSERTs to REPLACEs, and load the data back in. That gets you a nice clean table with no holes in the PK sequence and no deleted rows.

chaos
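The pivotal step in this recipe is that, once a unique key is in place, reloading the dump with REPLACE collapses the duplicates on conflict. A minimal sketch of just that step in SQLite (whose REPLACE INTO serves the same purpose as MySQL's here, via Python's stdlib sqlite3); the table and column names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rebuilt table: the unique key chaos suggests adding,
# so reloaded duplicates collapse instead of piling up.
conn.execute("CREATE TABLE Resources (Filename TEXT UNIQUE, Size INTEGER)")

# Rows as they might come out of the edited dump (primary keys
# stripped, INSERT rewritten to REPLACE).
dumped = [("a.txt", 100), ("a.txt", 100), ("b.txt", 200)]
conn.executemany("REPLACE INTO Resources (Filename, Size) VALUES (?, ?)",
                 dumped)

rows = conn.execute(
    "SELECT Filename, Size FROM Resources ORDER BY Filename").fetchall()
print(rows)  # [('a.txt', 100), ('b.txt', 200)]
```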