ansaurus

Question

How do I delete duplicate rows in SQL Server 2008?

Answer 1

A:

Assuming you have a primary key called id, the other columns are col2 ...coln, and by "duplicate" rows you mean rows whose values match in every column except the PK:

delete from A where id not in
(select min(id) from A
group by col2, col3, ...coln)

i.e. group on all non-PK columns

davek 2009-10-31 19:32:52

It's not working, because I don't have a primary key. How can I do it without a primary key?

Gold 2009-10-31 20:40:55

Answer 2

+1 A:

Add a primary key. Seriously, every table should have one. It can be an identity and you can ignore it, but make sure that every single table has a primary key defined.
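As a minimal sketch of that advice, an identity column can be added to an existing table and then declared as the primary key. The table name dbo.MyTable and the constraint name PK_MyTable are hypothetical placeholders, not names from the question.

```sql
-- Hypothetical names: dbo.MyTable and PK_MyTable stand in for your own.
-- Adding an identity column assigns a value to every existing row.
alter table dbo.MyTable
    add id int identity(1,1) not null;

-- Declare the new column as the primary key.
alter table dbo.MyTable
    add constraint PK_MyTable primary key (id);
```

Once the column exists, duplicates that were previously indistinguishable can be told apart by id.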

Imagine that you have a table like:

create table T (
    id int identity primary key,  -- surrogate key, as recommended above
    colA varchar(30) not null,
    colB varchar(30) not null
)

Then you can say something like:

delete t1
from T t1
where exists
(select null from T t2
where t2.colA = t1.colA
and t2.colB = t1.colB
and t2.id < t1.id)  -- "<" rather than "<>", so the lowest id in each group is kept

Another trick is to select out the distinct records with the minimum id, and keep those:

delete T
where id not in
(select min(id) from T
group by colA, colB)

(Sorry, I haven't tested these, but one of these ideas could lead you to your solution.)
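Since the statements above are untested, it may help to preview the duplicates first and run the delete inside a transaction that can be rolled back. This is only a sketch against the example table T with columns colA and colB.

```sql
-- List each (colA, colB) combination that appears more than once.
select colA, colB, count(*) as copies
from T
group by colA, colB
having count(*) > 1;

begin transaction;

-- Keep the row with the lowest id in each group, delete the rest.
delete T
where id not in
    (select min(id) from T
     group by colA, colB);

-- Re-run the select above here; it should now return no rows.

rollback transaction;  -- switch to "commit transaction" once the result looks right
```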

Note that if you don't have a primary key, the only other way to do this is to leverage a pseudo-column like ROWID, but I'm not sure whether SQL Server 2008 offers anything like that.
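For what it's worth, SQL Server has no ROWID pseudo-column, but one commonly used substitute is to number the rows in each duplicate group with ROW_NUMBER() inside a common table expression and delete through it. The sketch below assumes the example table T with colA and colB and no key at all; the CTE name numbered is just an illustrative choice.

```sql
-- Sketch: remove duplicates from a table that has no primary key.
-- If this is not the first statement in the batch, the preceding
-- statement must end with a semicolon before "with".
with numbered as (
    select row_number() over (
               partition by colA, colB   -- rows equal on these columns form one group
               order by (select null)    -- no preferred survivor, so an arbitrary row is kept
           ) as rn
    from T
)
delete from numbered
where rn > 1;  -- everything after the first row in each group is a duplicate
```

Deleting from the CTE removes the underlying rows of T, so exactly one row per (colA, colB) combination remains.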

AWhitford 2009-10-31 22:07:02
