views: 117 · answers: 3
Hi,

I have quite a large table with 19,000,000 records, and I have a problem with duplicate rows. There are a lot of similar questions here on SO, but none of them gives me a satisfactory answer. Some points to consider:

  • Row uniqueness is determined by two columns, location_id and datetime.
  • I'd like to keep the execution time as short as possible (< 1 hour).
  • Copying tables is not very feasible as the table is several gigabytes in size.
  • No need to worry about relations.

As said, each location_id should appear only once per datetime, and I would like to remove all the duplicate instances. It does not matter which one of them survives, as the duplicate rows are identical.

Any ideas?
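
For concreteness, a minimal hypothetical schema matching the description (the column names come from the question; the table name readings and the types are made up):

CREATE TABLE readings (
    location_id INT NOT NULL,
    datetime    DATETIME NOT NULL
    -- ...plus further payload columns, identical across duplicate rows
);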

+1  A: 
-- lists every (location_id, datetime) pair that occurs more than once
SELECT location_id, datetime, COUNT(*) AS Count
FROM table
GROUP BY location_id, datetime
HAVING Count > 1
Sjoerd
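
Sjoerd's query only reports which (location_id, datetime) pairs are duplicated; it deletes nothing, and with fully identical rows you need some extra column to tell the copies apart. A minimal deletion sketch building on the same grouping idea, assuming (the question does not say so) that the table has an auto-increment primary key id and is named readings:

-- delete every row for which an older copy (lower id) of the same
-- (location_id, datetime) pair exists, so exactly one copy survives
DELETE t1
FROM readings AS t1
JOIN readings AS t2
    ON t2.location_id = t1.location_id
    AND t2.datetime = t1.datetime
    AND t2.id < t1.id;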
A: 
-- note: MySQL rejects subqueries that read the table being updated;
-- wrapping the subquery in a derived table is a common workaround
UPDATE table SET datetime = NULL
WHERE location_id IN (
    SELECT location_id
    FROM table AS tableBis
    WHERE tableBis.location_id = table.location_id
    AND table.datetime > tableBis.datetime
);

CREATE TABLE tableCopyWithNoDuplicate AS
SELECT * FROM table WHERE datetime IS NOT NULL;

DROP TABLE table;

RENAME TABLE tableCopyWithNoDuplicate TO table;

So you keep the row with the lowest datetime for each location. I'm not sure about the performance; it depends on your table, your server, etc.

remi bourgarel
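
For reference, the same copy-and-swap idea can be written without the correlated UPDATE, and in a form that removes exact duplicates rather than all later datetimes: copy the rows through a unique key and let INSERT IGNORE drop the collisions. A minimal sketch, again using the hypothetical table name readings; it needs enough free disk for a second copy of the table:

-- empty clone with the same structure
CREATE TABLE readings_dedup LIKE readings;
-- the key that defines what counts as a duplicate
ALTER TABLE readings_dedup ADD UNIQUE KEY uniq_loc_dt (location_id, datetime);
-- INSERT IGNORE silently skips rows that would violate the unique key
INSERT IGNORE INTO readings_dedup SELECT * FROM readings;
-- atomic swap, then discard the old data
RENAME TABLE readings TO readings_old, readings_dedup TO readings;
DROP TABLE readings_old;

The RENAME TABLE swap is atomic, so the table never disappears from a reader's point of view.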
+5  A: 

I think you can use this statement to delete the duplicate records from the table:

-- duplicate (location_id, datetime) rows are silently dropped while the unique index is built
ALTER IGNORE TABLE table_name ADD UNIQUE (location_id, datetime)

Before doing this, test it with some sample data first, and then try it on the real table (see the sketch below for one way to do that).

Vinodkumar ChandraSekar
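
One way to do the "test with some sample data" step is to rehearse on a throwaway copy of part of the table. A sketch with the same hypothetical table name readings (note that ALTER IGNORE was later removed in MySQL 5.7, so this approach only applies to older servers):

-- structure-only clone, then a slice of real rows to practise on
CREATE TABLE readings_test LIKE readings;
INSERT INTO readings_test SELECT * FROM readings LIMIT 100000;
-- the deduplicating ALTER, rehearsed on the copy
ALTER IGNORE TABLE readings_test ADD UNIQUE (location_id, datetime);
-- row count before vs. after tells you how many duplicates were dropped
SELECT COUNT(*) FROM readings_test;
DROP TABLE readings_test;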
This looks promising; I hadn't heard about this feature before. Trying it now, I'll let you know how it turns out. And welcome to SO :)
Tatu Ulmanen
This worked, thank you. Took 31 minutes to go through 16 982 040 rows with 1 589 908 duplicates. I can't believe it could be this simple, with no additional tables or complex queries. :)
Tatu Ulmanen