Hi,

How can I delete duplicate rows in SQL Server 2008?

Thanks in advance.

A: 

Assuming you have a primary key called id, the other columns are col2 ...coln, and by "duplicate" rows you mean rows whose values match in every column except the PK:

delete from A where id not in
(select min(id) from A
group by col2, col3, ...coln)

That is, group on all non-PK columns; the row with the lowest id in each group is kept.

davek
It's not working because I don't have a primary key. How can I do it without a primary key?
Gold
A: 

Add a primary key. Seriously, every table should have one. It can be an identity and you can ignore it, but make sure that every single table has a primary key defined.
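For a table that already exists without a key, a minimal sketch of adding one might look like this. The table name dbo.MyTable and the constraint name PK_MyTable are made up for illustration; SQL Server fills in identity values for the existing rows when the column is added.

```sql
-- Hypothetical names: replace dbo.MyTable and PK_MyTable with your own.
-- Adds an auto-numbered id column and makes it the primary key.
alter table dbo.MyTable
    add id int identity(1,1) not null
        constraint PK_MyTable primary key;
```

Once the column is there, otherwise identical rows can be told apart by id.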

Imagine that you have a table like:

create table T (
    id int identity primary key,
    colA varchar(30) not null,
    colB varchar(30) not null
)

Then you can say something like:

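-- Delete a row only if an identical row with a lower id exists,
-- so the lowest id in each group of duplicates is kept.
delete t1
from T t1
where exists
(select null from T t2
where t2.colA = t1.colA
and t2.colB = t1.colB
and t2.id < t1.id)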

Another trick is to select out the distinct records with the minimum id, and keep those:

delete T
where id not in
(select min(id) from T
group by colA, colB)

(Sorry, I haven't tested these, but one of these ideas could lead you to your solution.)
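As a quick check of the min(id) approach, here is a small sketch with made-up sample rows. It assumes T has just been created as above and is empty, so the identity values start at 1.

```sql
-- Made-up sample data; the ids in the comments assume a fresh, empty T.
insert into T (colA, colB) values ('a', 'b');  -- id 1
insert into T (colA, colB) values ('a', 'b');  -- id 2, duplicate of id 1
insert into T (colA, colB) values ('c', 'd');  -- id 3
insert into T (colA, colB) values ('a', 'b');  -- id 4, duplicate of id 1

-- Preview: list each duplicated combination and how many copies exist.
select colA, colB, count(*) as copies
from T
group by colA, colB
having count(*) > 1;

-- Keep the lowest id per (colA, colB) group and delete the rest.
delete T
where id not in
(select min(id) from T
group by colA, colB);

-- Inspect what is left.
select id, colA, colB from T order by id;
```

Under those assumptions the final select should return only ids 1 and 3. Running the preview query first is a cheap way to see what the delete is about to touch.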

Note that if you don't have a primary key, you need some other way to tell otherwise identical rows apart, such as a pseudo-column like ROWID, but I'm not sure whether SQL Server 2008 offers one.
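SQL Server has no ROWID pseudo-column, but row_number() (available since SQL Server 2005) can play a similar role. The sketch below assumes a hypothetical table T_nokey with the same colA and colB columns and no key at all; it numbers the rows inside each group of duplicates and deletes everything after the first.

```sql
-- T_nokey is a hypothetical keyless table with columns colA and colB.
-- A statement before "with" must be terminated with a semicolon.
with numbered as (
    select row_number() over (
               partition by colA, colB   -- list every column of the table here
               order by (select null)    -- any order will do: the rows are identical
           ) as rn
    from T_nokey
)
delete from numbered
where rn > 1;   -- rn = 1 is the one copy that stays
```

Deleting through the CTE removes the underlying rows from T_nokey, because the CTE reads from a single base table. Which physical copy survives is arbitrary, which does not matter when the copies are identical in every column.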

AWhitford