I have a table in my database which has duplicate records that I want to delete. I don't want to create a new table with distinct entries for this. What I want is to delete duplicate entries from the existing table without the creation of any new table. Is there any way to do this?

id action L1_name L1_data L2_name L2_data L3_name L3_data L4_name L4_data L5_name L5_data L6_name L6_data L7_name L7_data L8_name L8_data L9_name L9_data L10_name L10_data L11_name L11_data L12_name L12_data L13_name L13_data L14_name L14_data L15_name L15_data

These are all my fields. id is unique for every row. L11_data should be unique within its action value: L11_data holds company names, while action holds the names of the industries. So in my data I have duplicate company names in L11_data for their respective industries. What I want is a single entry, with its other data, for each company in the particular industry stored in action. I hope I have stated my problem in a way that you can understand.

A: 

This article should help you out.

James
You should really add a summary of the article to your answer. Then, if the article is moved (or removed), your answer would still be useful.
tvanfosson
That's what I don't want. I don't want to create a temporary table with distinct data; I want to alter my existing table by deleting the duplicate records.
developer
+10  A: 

Yes, assuming you have a unique ID field, you can delete all records that are the same except for the ID, but don't have "the minimum ID" for their group of values.

Example query:

DELETE FROM Table
WHERE ID NOT IN
(
SELECT MIN(ID)
FROM Table
GROUP BY Field1, Field2, Field3, ...
)

Notes:

  • I freely chose "Table" and "ID" as representative names
  • The list of fields ("Field1, Field2, ...") should include all fields except for the ID
  • This may be a slow query depending on the number of fields and rows; however, I expect it would be okay compared to the alternatives

EDIT: In case you don't have a unique index, my recommendation is to simply add an auto-incremental unique index. Mainly because it's good design, but also because it will allow you to run the query above.
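As a rough, runnable sketch of the query above applied to the asker's case, the snippet below runs it through Python's built-in sqlite3 module. The table name `companies`, the cut-down column set (only `id`, `action` and `L11_data`) and the sample rows are all made up for illustration, and duplicates are assumed to be defined by the (`action`, `L11_data`) pair rather than by every non-ID column. Other engines may behave differently: MySQL, for one, rejects a DELETE whose subquery selects from the same table unless the subquery is wrapped in a derived table.

```python
import sqlite3

# Hypothetical miniature of the asker's table: only id, action (industry)
# and L11_data (company name) are kept; the other columns are omitted.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE companies (id INTEGER PRIMARY KEY, action TEXT, L11_data TEXT)"
)
conn.executemany(
    "INSERT INTO companies (action, L11_data) VALUES (?, ?)",
    [
        ("Banking", "Acme"),
        ("Banking", "Acme"),    # duplicate within the same industry
        ("Banking", "Globex"),
        ("Retail", "Acme"),     # same name, different industry: not a duplicate
        ("Retail", "Acme"),     # duplicate within the same industry
    ],
)

# Keep the row with the smallest id in each (action, L11_data) group
# and delete every other row in place, without a second table.
conn.execute(
    """
    DELETE FROM companies
    WHERE id NOT IN (
        SELECT MIN(id)
        FROM companies
        GROUP BY action, L11_data
    )
    """
)
conn.commit()

remaining = conn.execute(
    "SELECT id, action, L11_data FROM companies ORDER BY id"
).fetchall()
print(remaining)
```

Running it on a copy of the data first, or inside a transaction that can be rolled back, is a sensible precaution before deleting from the real table.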

Roee Adler
This is cool as long as ID is numeric
Svetlozar Angelov
IDs are usually numeric, so it should not be a problem. In fact, it will work as long as "MIN" is defined on the ID type. If it's defined on strings, and the field is unique, it will work great.
Roee Adler
I like your solution.. just wanted to clarify... it will be a problem if the table doesn't have a unique index too, it's good to have multiple options for a problem ..
Svetlozar Angelov
@Svetilo: You just gave me an idea for how to deal with no unique index...
Roee Adler
+2  A: 
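ALTER IGNORE TABLE `table` ADD UNIQUE INDEX (your cols);

With IGNORE, MySQL keeps the first row of each group of duplicates and drops the others while building the index, so no separate delete is needed. Note that ALTER IGNORE TABLE was removed in MySQL 5.7.4, so this only works on older versions.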

Svetlozar Angelov
A: 

delete from table_x a
where a.rowid < any (select b.rowid from table_x b where a.someField = b.someField and a.someOtherField = b.someOtherField)
and (a.someField, a.someOtherField) in (select c.someField, c.someOtherField from table_x c group by c.someField, c.someOtherField having count(*) > 1)

In the query above, the combination of someField and someOtherField must uniquely identify each group of duplicates.
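The same idea can be tried out in SQLite, which also exposes an implicit rowid. This is only an analogue, not the Oracle statement itself: SQLite has no `ANY` operator, so the sketch uses a correlated `EXISTS` instead, and the `payload` column and sample rows are invented for the demonstration. As with `rowid < any (...)`, the row with the highest rowid in each group is the one that survives.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# payload is a hypothetical extra column, used only to tell the rows apart.
conn.execute(
    "CREATE TABLE table_x (someField TEXT, someOtherField TEXT, payload TEXT)"
)
conn.executemany(
    "INSERT INTO table_x VALUES (?, ?, ?)",
    [
        ("a", "1", "first"),
        ("a", "1", "second"),
        ("b", "2", "only"),
        ("a", "1", "third"),
    ],
)

# Delete a row when another row with the same key pair has a larger rowid,
# so only the row with the highest rowid in each group is kept.
conn.execute(
    """
    DELETE FROM table_x
    WHERE EXISTS (
        SELECT 1
        FROM table_x AS b
        WHERE b.someField = table_x.someField
          AND b.someOtherField = table_x.someOtherField
          AND b.rowid > table_x.rowid
    )
    """
)
conn.commit()

survivors = conn.execute(
    "SELECT someField, someOtherField, payload FROM table_x ORDER BY rowid"
).fetchall()
print(survivors)
```

Flipping the comparison to `b.rowid < table_x.rowid` would keep the earliest row of each group instead.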

Priyank