ansaurus

Question

Answer 1

A:

I have this query snippet for SQL Server, but I think it can be used in other DBMSs with small changes:

DELETE
FROM Table
WHERE Table.idTable IN  (  
    SELECT MAX(idTable)
    FROM Table
    GROUP BY field1, field2, field3
    HAVING COUNT(*) > 1)

I forgot to mention that this query deletes only the row with the highest id in each group of duplicates; it never removes the row with the lowest id. If that works for you, try this query:

DELETE
FROM jobs
WHERE jobs.id IN  (  
    SELECT MAX(id)
    FROM jobs
    GROUP BY site_id, company, title, location
    HAVING COUNT(*) > 1)

eiefai 2010-07-22 18:22:08

That won't work if there's more than two duplicates of a group.

OMG Ponies 2010-07-22 18:23:42
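
The point in the comment above can be checked with a small sketch. It is not the answer author's code: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, the sample rows are invented, and the `NOT IN (SELECT MIN(id) ...)` variant is only one possible alternative. With three copies of the same row, the `MAX(id)` delete removes a single row per run and leaves two duplicates behind, while the alternative keeps exactly one row per group. On MySQL the alternative would still hit the same-table restriction raised in the next comment.

```python
import sqlite3

# Hypothetical sample data: ids 1-3 are duplicates of each other, id 4 is unique.
ROWS = [
    (1, 1, "Acme", "Dev", "NYC"),
    (2, 1, "Acme", "Dev", "NYC"),
    (3, 1, "Acme", "Dev", "NYC"),
    (4, 2, "Initech", "QA", "Austin"),
]

# The delete from the answer: removes only the highest id of each duplicated group.
MAX_DELETE = """
DELETE FROM jobs
WHERE id IN (
    SELECT MAX(id) FROM jobs
    GROUP BY site_id, company, title, location
    HAVING COUNT(*) > 1)
"""

# An alternative (assumption, not from the answer): keep the lowest id per group,
# delete everything else in one pass.
KEEP_MIN_DELETE = """
DELETE FROM jobs
WHERE id NOT IN (
    SELECT MIN(id) FROM jobs
    GROUP BY site_id, company, title, location)
"""


def remaining_ids(delete_sql):
    """Build a fresh in-memory table, run one delete, return the surviving ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs (id INTEGER PRIMARY KEY, site_id INTEGER, "
        "company TEXT, title TEXT, location TEXT)"
    )
    conn.executemany("INSERT INTO jobs VALUES (?, ?, ?, ?, ?)", ROWS)
    conn.execute(delete_sql)
    ids = [row[0] for row in conn.execute("SELECT id FROM jobs ORDER BY id")]
    conn.close()
    return ids


max_result = remaining_ids(MAX_DELETE)            # ids 1 and 2 are still duplicates
keep_min_result = remaining_ids(KEEP_MIN_DELETE)  # one row per group survives
print(max_result)
print(keep_min_result)
```

Running the `MAX(id)` delete repeatedly until it affects no rows would also work, at the cost of one pass per extra duplicate.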

Unfortunately, MySQL does not allow you to select from the table you are deleting from: `ERROR 1093: You can't specify target table 'Table' for update in FROM clause`

Andomar 2010-07-22 18:29:06

OMG Ponies, I know that; this is just a snippet that I use sometimes and it seemed to fit the question, which is why I said it needed to be changed. Thanks for the comment. Andomar, I didn't know that. Thanks to you too.

eiefai 2010-07-22 18:43:45

Answer 2

+2 A:

A really easy way to do this is to add a UNIQUE index on the 3 columns. When you write the ALTER statement, include the IGNORE keyword. Like so:

ALTER IGNORE TABLE jobs ADD UNIQUE INDEX idx_name (site_id, title, company );

This will drop all the duplicate rows. As an added benefit, future INSERTs that are duplicates will error out. As always, you may want to take a backup before running something like this...

Chris Henry 2010-07-22 18:24:05
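
The "future INSERTs that are duplicates will error out" part can be illustrated with a minimal sketch. It assumes SQLite (through Python's sqlite3 module) in place of MySQL, so it does not exercise `ALTER IGNORE` itself, which SQLite does not have; it only shows a unique index on the same three columns rejecting a second identical row. The sample values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs (id INTEGER PRIMARY KEY, site_id INTEGER, "
    "title TEXT, company TEXT)"
)
# Same column list as the index in the answer, created here on an empty table.
conn.execute("CREATE UNIQUE INDEX idx_name ON jobs (site_id, title, company)")

conn.execute("INSERT INTO jobs (site_id, title, company) VALUES (1, 'Dev', 'Acme')")

try:
    # Identical (site_id, title, company): the unique index should reject it.
    conn.execute("INSERT INTO jobs (site_id, title, company) VALUES (1, 'Dev', 'Acme')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM jobs").fetchone()[0]
print(duplicate_rejected, row_count)
conn.close()
```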

[Interesting](http://dev.mysql.com/doc/refman/5.1/en/alter-table.html), but the assumptions the IGNORE clause makes when removing those duplicates are a concern and might not match your needs. Does truncating incorrect values to the closest acceptable match sound good to you?

OMG Ponies 2010-07-22 18:32:34

In this particular case, that's definitely true. The collation of the title and company columns definitely matters. What, exactly, does "incorrect values" mean? I smell another question...

Chris Henry 2010-07-22 19:08:21

this did the job, thanks a lot!

Chetan 2010-07-22 19:26:19

Answer 3

+2 A:

MySQL has restrictions about referring to the table you are deleting from. You can work around that with a temporary table, like:

create temporary table tmpTable (id int);

insert  tmpTable
        (id)
select  id
from    YourTable yt
where   exists
        (
        select  *
        from    YourTable yt2
        where   yt2.title = yt.title
                and yt2.company = yt.company
                and yt2.site_id = yt.site_id
                and yt2.id > yt.id
        );

delete  
from    YourTable
where   ID in (select id from tmpTable);

Andomar 2010-07-22 18:26:48
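
The temporary-table steps above can be traced end to end with a small sketch. It assumes SQLite (through Python's sqlite3 module) rather than MySQL, a hypothetical `jobs` table, and invented rows, so it shows the logic of the approach and not MySQL's behaviour around error 1093. The ids collected are those that have a later twin, which means the row with the highest id in each group is the one that survives.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs (id INTEGER PRIMARY KEY, site_id INTEGER, "
    "company TEXT, title TEXT)"
)
# Hypothetical sample data: ids 1-3 are duplicates, id 4 is unique.
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?, ?, ?)",
    [
        (1, 1, "Acme", "Dev"),
        (2, 1, "Acme", "Dev"),
        (3, 1, "Acme", "Dev"),
        (4, 2, "Initech", "QA"),
    ],
)

# Step 1: collect the ids to delete in a temporary table.
conn.execute("CREATE TEMPORARY TABLE tmp_ids (id INTEGER)")
conn.execute(
    """
    INSERT INTO tmp_ids (id)
    SELECT j.id
    FROM jobs AS j
    WHERE EXISTS (
        SELECT 1
        FROM jobs AS j2
        WHERE j2.title = j.title
          AND j2.company = j.company
          AND j2.site_id = j.site_id
          AND j2.id > j.id)
    """
)

# Step 2: delete by id, reading only from the temporary table.
conn.execute("DELETE FROM jobs WHERE id IN (SELECT id FROM tmp_ids)")

remaining = [row[0] for row in conn.execute("SELECT id FROM jobs ORDER BY id")]
print(remaining)
conn.close()
```

Changing `j2.id > j.id` to `j2.id < j.id` would keep the lowest id per group instead.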

+1: Your MySQL-fu is better than mine

OMG Ponies 2010-07-22 18:35:22

Remove duplicate rows in MySQL