views:

56

answers:

2

I would like to optimise my Doublon checker if anyone knows how it could be faster.

$doublonchecker="delete bad_rows.* from eMail as good_rows inner join eMail as bad_rows on bad_rows.EMAIL = good_rows.EMAIL and bad_rows.EMAIL_ID > good_rows.EMAIL_ID";
$resultdoublon = mysql_query($doublonchecker);
        if (!$resultdoublon) {
        die('Error : ' . mysql_error());
        }
+2  A: 

That query would use an index on (EMAIL) or on (EMAIL, EMAIL_ID). However, you shouldn't have to run that query more than once, so its performance shouldn't worry you too much. Once you have removed your "Doublons" just replace the index on (EMAIL) with a UNIQUE KEY on (EMAIL) and you'll never have duplicate emails ever again.

Josh Davis
+2  A: 

I usually do it the following way:

Delete from Email where ID not in 
(
Select Min(em.ID) minID
from eMail em  group by em.Email
)
Heiko Hatzfeld