tags:

views:

46

answers:

2

I want to get redundant records from the database. Is my query correct for this?

select (fields) 
from DB 
group by name, city
having count(*) > 1

If wrong please let me know how can I correct this.

Also if I want to delete duplicate record will it work?

delete from tbl_name 
where row_id in 
   (select row_id from tbl_name group by name, city having count(*) > 1)

so i can make the above query like this

DELETE FROM tb_name where row_id not in(select min(row_id) from tb_name groupBy(name, city) having count(*)>1)
+1  A: 

You have the syntax wrong:

select name, city, count(*) from table group by name, city having count(*) > 1

If you are not interested in the actual count, remove ", count(*)" from the query

Vinko Vrsalovic
how else can i make query to get duplicate record?
NoviceToDotNet
Why do you need an alternative?
Vinko Vrsalovic
i am seeking for some other way to creat it too
NoviceToDotNet
no i just looking for logic, syntax is not an issue but still i am looking for some other option to make it. Can u provide me some where i can learn all good query practice of this kind?
NoviceToDotNet
@NoviceToDotNet: Microsoft itself recommends this: http://support.microsoft.com/kb/139444 Now, why do you need an alternative? Is it too slow? Use nolock: http://www.knowdotnet.com/articles/sqlforduplicates.html
Vinko Vrsalovic
+1  A: 

Your DELETE syntax is definitely totally wrong - that won't work ever. What it'll do is delete all rows that have more than one occurence - not leaving any data around...

What you can do in SQL Server 2005 and up is use a CTE (Common Table Expression) and the ROW_NUMBER() ranking function:

;WITH Duplicates AS
(
    SELECT 
      Name, City, 
      ROW_NUMBER() OVER (PARTITION BY Name, City ORDER BY City) AS 'RowNum'
)
DELETE FROM dbo.YourTable
WHERE RowNum > 1

You basically create "partitions" of your data by the (name, city) combo - each of those pairs will get sequential numbers from 1 on up.

Those that have more than one occurence will also have entries in that CTE with a RowNum > 1 - just delete all of those and your duplicates are done!

Read about Using Common Table Expressions in SQL Server 2005 and about Ranking Functions and Performance in SQL Server 2005 (or consult the MSDN docs on those topics)

marc_s
so i can make it like this DELETE FROM tb_name where row_id not in(select min(row_id) from tb_name groupBy(name, city) having count(*)>1)
NoviceToDotNet
@NoviceToDotNet: you can't make it that way - you need to use the CTE approach I showed in my answer - that will delete the duplicates from your table
marc_s