views:

50

answers:

3

I have a table with some ids + titles. I want to make the title column unique, but it has over 600k records already, some of which are duplicates (sometimes several dozen times over).

How do I remove all duplicates, except one, so I can add a UNIQUE key to the title column after?

+3  A: 

Create a new table with just the distinct rows of the original table. There may be other ways but I find this the cleanest.

CREATE TABLE tmp_table AS SELECT DISTINCT [....] FROM main_table
nc3b
A: 

This shows how to do it in SQL2000. I'm not completely familiar with MySQL syntax but I'm sure there's something comparable

create table #titles (iid int identity (1, 1), title varchar(200))

-- Repeat this step many times to create duplicates
insert into #titles(title) values ('bob')
insert into #titles(title) values ('bob1')
insert into #titles(title) values ('bob2')
insert into #titles(title) values ('bob3')
insert into #titles(title) values ('bob4')


DELETE T  FROM 
#titles T left join 
(
  select title, min(iid) as minid from #titles group by title
) D on T.title = D.title and T.iid = D.minid
WHERE D.minid is null

Select * FROM #titles
souLTower
+5  A: 

This command adds a unique key, and drops all rows that generate errors (due to the unique key). This removes duplicates.

ALTER IGNORE TABLE table ADD UNIQUE KEY idx1(title); 
unutbu
This is a great method!
nc3b
This is very clever.
Yegor