ansaurus

Question

How to keep only one row of a table, removing duplicate rows?

Answer 1

A:

It would probably be easier to select the unique ones into a new table, drop the old table, then rename the temp table to replace it.

-- MySQL syntax: create a table with the same schema as members
CREATE TABLE tmp (...);

-- insert the unique records
INSERT INTO tmp SELECT * FROM members GROUP BY name;

-- swap it in
RENAME TABLE members TO members_old, tmp TO members;

-- drop the old one
DROP TABLE members_old;
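
For SQLite, which has no RENAME TABLE statement, the same copy-and-swap idea might look roughly like the sketch below, driven from Python's sqlite3 module. The two-column schema (id, name), the sample rows, and the helper name dedupe_by_copy are assumptions made up for illustration, not the real members table. The sketch also picks MIN(id) explicitly instead of relying on SELECT * with GROUP BY.

```python
import sqlite3


def dedupe_by_copy(conn):
    """Rebuild `members`, keeping one row per name (the lowest id wins)."""
    with conn:  # commits on success, rolls back the open transaction on error
        # Hypothetical schema: adjust the column list to the real table.
        conn.execute("CREATE TABLE tmp (id INTEGER PRIMARY KEY, name TEXT)")
        conn.execute(
            "INSERT INTO tmp (id, name) "
            "SELECT MIN(id), name FROM members GROUP BY name"
        )
        conn.execute("DROP TABLE members")
        # SQLite's equivalent of the rename step.
        conn.execute("ALTER TABLE tmp RENAME TO members")


# Small in-memory demo with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO members (name) VALUES (?)",
    [("ann",), ("bob",), ("ann",), ("cy",), ("bob",)],
)
dedupe_by_copy(conn)
rows = conn.execute("SELECT id, name FROM members ORDER BY id").fetchall()
print(rows)
```

Note that indexes and triggers defined on the old table are dropped along with it, so they would have to be recreated on the new one.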

Paul Dixon 2009-08-17 08:53:01

Thanks Paul. For those interested:

CREATE TEMP TABLE tmp_members (name VARCHAR);
INSERT INTO tmp_members SELECT name FROM members GROUP BY name;
SELECT COUNT(name) FROM tmp_members;
DELETE FROM members;
VACUUM;
SELECT COUNT(name) FROM members;
INSERT INTO members (name) SELECT * FROM tmp_members;
SELECT COUNT(name) FROM members;
SELECT COUNT(DISTINCT name) FROM members;
SELECT name FROM members LIMIT 10;
DROP TABLE tmp_members;

OverTheRainbow 2009-08-17 09:11:01

Sorry, I missed that you were using SQLite!

Paul Dixon 2009-08-17 09:14:11

Answer 2

+2 A:

See the following question: Deleting duplicate rows from a table.

The adapted accepted answer from there (which is my answer, so no "theft" here...):

You can do it simply, assuming you have a unique ID field: among records that are identical except for the ID, delete every one that doesn't have the minimum ID for its name.

Example query:

DELETE FROM members
WHERE ID NOT IN
(
    SELECT MIN(ID)
    FROM members
    GROUP BY name
)

In case you don't have a unique index, my recommendation is to simply add an auto-incremental unique index. Mainly because it's good design, but also because it will allow you to run the query above.
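
In SQLite specifically, ordinary tables (anything not declared WITHOUT ROWID) already carry an implicit rowid, so the query above can be tried without adding a column first. Here is a minimal sketch through Python's sqlite3 module; the one-column schema and the sample rows are assumptions for illustration only.

```python
import sqlite3

# In-memory demo: a members table with no explicit ID column (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (name TEXT)")
conn.executemany(
    "INSERT INTO members (name) VALUES (?)",
    [("ann",), ("bob",), ("ann",), ("cy",), ("bob",)],
)

# Same shape as the query above, with SQLite's implicit rowid standing in for ID.
cur = conn.execute(
    """
    DELETE FROM members
    WHERE rowid NOT IN (
        SELECT MIN(rowid)
        FROM members
        GROUP BY name
    )
    """
)
deleted = cur.rowcount  # number of duplicate rows removed
conn.commit()

names = [row[0] for row in conn.execute("SELECT name FROM members ORDER BY rowid")]
print(deleted, names)
```

An explicit INTEGER PRIMARY KEY column is still the more robust design, since implicit rowids are not guaranteed to survive a VACUUM unchanged.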

Roee Adler 2009-08-17 09:01:09

Here's how I understand the above: for each name, it groups the rows (a group of one if the name is unique, several rows in one group if there are duplicates), selects the smallest ID from each group, and then deletes any row whose ID isn't among those smallest IDs. Brilliant :) Thanks much, Rax.

OverTheRainbow 2009-08-17 09:16:52

You got it exactly :)

Roee Adler 2009-08-17 09:19:32

Answer 3

A:

We have a huge database where deleting duplicates is part of the regular maintenance process. We use DISTINCT to select the unique records and write them into a TEMPORARY TABLE. After a TRUNCATE of the original table, we write the temporary data back into it.

That is one way of doing it and works as a STORED PROCEDURE.
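
As a rough sketch of that routine for SQLite (which has no TRUNCATE statement, so a plain DELETE FROM stands in for it), wrapped in a Python function rather than a stored procedure. The single name column, the sample rows, and the helper name dedupe_via_temp are assumptions for illustration.

```python
import sqlite3


def dedupe_via_temp(conn):
    """Copy distinct names to a temp table, empty `members`, write them back."""
    with conn:  # commits on success, rolls back the open transaction on error
        conn.execute(
            "CREATE TEMP TABLE tmp_members AS SELECT DISTINCT name FROM members"
        )
        # SQLite has no TRUNCATE; DELETE without WHERE empties the table.
        conn.execute("DELETE FROM members")
        conn.execute("INSERT INTO members (name) SELECT name FROM tmp_members")
        conn.execute("DROP TABLE tmp_members")
    return conn.execute("SELECT COUNT(*) FROM members").fetchone()[0]


# Small in-memory demo with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (name TEXT)")
conn.executemany(
    "INSERT INTO members (name) VALUES (?)",
    [("ann",), ("bob",), ("ann",), ("cy",), ("bob",)],
)
remaining = dedupe_via_temp(conn)
names = sorted(row[0] for row in conn.execute("SELECT name FROM members"))
print(remaining, names)
```

With more than one column, DISTINCT only collapses rows that match on every selected column, so this fits best when whole rows are duplicated.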

G Berdal 2009-08-17 09:06:30

I have to admit Rax Olgud's answer is much, much more sophisticated and probably runs 100 times quicker! :) I'm thinking about adopting that solution. Deserves +1!

G Berdal 2009-08-17 13:00:12
