I'm trying to isolate duplicates in a 500MB database and have tried two ways to do it. The first creates a new table and groups by title:

CREATE TABLE test_table as
SELECT * FROM items WHERE 1 GROUP BY title;

But it's been running for an hour and in MySQL Admin it says the status is Locked.

The other way I tried was to delete duplicates with this:

 DELETE bad_rows.*
        from items as bad_rows
        inner join (
                select title, MIN(id) as min_id
                from items
                group by title
                having count(*) > 1
        ) as good_rows on good_rows.title = bad_rows.title
                and bad_rows.id <> good_rows.min_id;

...and this has been running for 24 hours now, with Admin telling me it's "Sending data".

Do you think either of these queries is actually still running? How can I find out if one has hung? (I'm on Apple OS X 10.5.7.)

+1  A: 

You can do this:

alter ignore table items add unique index(title); 

This adds a unique index and removes any duplicates at the same time; the index then prevents future duplicates from being inserted. Make sure you take a backup before running this command.
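As a rough sketch of how the whole thing could be run and watched, assuming the table is items with an integer id column and a title column short enough to index directly (these are taken from the question, not verified): preview the duplicates first, run the ALTER, and check on it from a second connection. Note that ALTER IGNORE TABLE only exists in older MySQL versions; it was removed in MySQL 5.7.4, so this applies to servers before that.

```sql
-- 1. Preview which titles are duplicated and how many rows each has.
--    Nothing is modified by this step.
SELECT title, COUNT(*) AS copies
FROM items
GROUP BY title
HAVING COUNT(*) > 1
ORDER BY copies DESC;

-- 2. Add the unique index. With IGNORE, rows that would violate the
--    new index are dropped instead of aborting the statement.
--    (Pre-5.7.4 servers only; take a backup first.)
ALTER IGNORE TABLE items ADD UNIQUE INDEX (title);

-- 3. From a SECOND connection, see what the server is doing.
--    The Time column is the number of seconds spent in the current
--    State; State and Info show the step and the statement text.
SHOW FULL PROCESSLIST;

-- 4. If a statement really has to be stopped, kill it by the Id
--    shown in the process list (12345 is a placeholder, not a real Id).
-- KILL 12345;
```

A state of "Locked" in the process list generally means the statement is waiting on a table lock held by another connection, so it is worth looking for an older long-running statement on the same table before assuming the query has hung.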

RedFilter
Nice solution! Have to change any TEXT or BLOB fields to VARCHAR before indexing though. Thanks!
Matt Jarvis