ansaurus

Question

Answer 1

A:

First use a SELECT... INTO:

SELECT DISTINCT ProductID, ProductName, Description, Category
    INTO tblProductClean
    FROM tblProduct

Then drop the original table.
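A minimal sketch of the full sequence, assuming SQL Server and assuming you want the cleaned table to end up under the original name. The rename step via `sp_rename` is an addition for illustration and is not part of the answer as posted:

```sql
-- Sketch only: tblProductClean is the staging name used in the answer above.
BEGIN TRANSACTION;

-- Copy one instance of each fully identical row into a new table.
SELECT DISTINCT ProductID, ProductName, Description, Category
    INTO tblProductClean
    FROM tblProduct;

-- Remove the original table that still holds the duplicates.
DROP TABLE tblProduct;

-- Assumption: the cleaned table should take over the original name.
EXEC sp_rename 'tblProductClean', 'tblProduct';

COMMIT TRANSACTION;
```

Keep in mind that SELECT ... INTO copies only the data: indexes, constraints and triggers on the original table are not carried over, and the DROP TABLE fails if a foreign key references the table.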

eykanal 2010-07-27 15:37:37

From the OP: "Can it be done without creating temp tables? Just in one single query?"

dcp 2010-07-27 15:38:32

Answer 2

+4 A:

DELETE tblProduct 
FROM tblProduct 
LEFT OUTER JOIN (
   SELECT MIN(ProductId) as ProductId, ProductName, Description, Category
   FROM tblProduct 
   GROUP BY ProductName, Description, Category
) as KeepRows ON
   tblProduct.ProductId= KeepRows.ProductId
WHERE
   KeepRows.ProductId IS NULL

Stolen from http://stackoverflow.com/questions/18932/sql-how-can-i-remove-duplicate-rows

UPDATE:

This will only work if ProductId is a Primary Key (which it is not). You are better off using @marc_s' method, but I'll leave this up in case someone using a PK comes across this post.

Abe Miessler 2010-07-27 15:40:16

@Abe: `rowid` was the primary key for the table; I thought this was Oracle syntax for a moment till I saw the link.

OMG Ponies 2010-07-27 15:43:17

I was assuming that ProductId was a primary key in his table. I've updated it with his column names to help avoid any confusion.

Abe Miessler 2010-07-27 16:01:55

Nice Abe Miessler. Voted

Ankit Rathod 2010-07-27 16:06:43

@Abe Miessler, I thought it would work, but it sounded confusing to me. So I tested it in Management Studio, and indeed it does not work. It says 0 rows deleted. Can you fix the query?

Ankit Rathod 2010-07-27 16:23:39

@Nitesh, I assumed (mistakenly) that ProductId would be a unique identifier. Since it is not, you are better off using @marc_s' method. Sorry for the confusion!

Abe Miessler 2010-07-27 16:29:05

No problem Abe Miessler.

Ankit Rathod 2010-07-27 16:32:01

Answer 3

+1 A:

I had to do this a few weeks back... what version of SQL Server are you using? In SQL Server 2005 and up, you can use Row_Number as part of your select, and only select where Row_Number is 1. I forget the exact syntax, but it's well documented... something along the lines of:

Select t0.ProductID, 
       t0.ProductName, 
       t0.Description, 
       t0.Category
Into   tblCleanData
From   (
    Select ProductID, 
           ProductName, 
           Description, 
           Category, 
           Row_Number() Over (
               Partition By ProductID, 
                            ProductName, 
                            Description, 
                            Category
               Order By     ProductID,
                            ProductName,
                            Description,
                            Category
           ) As RowNumber
    From   tblProduct
) As t0
Where t0.RowNumber = 1

Check out http://msdn.microsoft.com/en-us/library/ms186734.aspx, that should get you going in the right direction.

BenAlabaster 2010-07-27 15:44:12

True, but the OP needs a DELETE statement...

OMG Ponies 2010-07-27 15:46:16

@OMG Ponies - Er, good point.

BenAlabaster 2010-07-27 15:54:42

+1 Ben though..

Ankit Rathod 2010-07-27 16:05:36

Answer 4

+12 A:

Try this - it will delete all duplicates from your table:

;WITH duplicates AS
(
    SELECT 
       ProductID, ProductName, Description, Category,
       ROW_NUMBER() OVER (PARTITION BY ProductID, ProductName
                          ORDER BY ProductID) 'RowNum'
    FROM dbo.tblProduct
)
DELETE FROM duplicates
WHERE RowNum > 1
GO

SELECT * FROM dbo.tblProduct
GO

Your duplicates should be gone now. The output is:

ProductID   ProductName   DESCRIPTION        Category
   1          Cinthol         cosmetic soap      soap
   1          Lux             cosmetic soap      soap
   1          Crowning Glory  cosmetic soap      soap
   2          Cinthol         nice soap          soap
   3          Lux             nice soap          soap
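
The query above treats rows as duplicates when ProductID and ProductName match. If only rows that are identical in every column should count as duplicates, a variant that partitions by all four columns might look like the following. This is a sketch under that assumption, not the query as originally posted:

```sql
-- Sketch: number the rows within each group of fully identical rows,
-- then delete every row except the first one in each group.
;WITH duplicates AS
(
    SELECT
       ProductID, ProductName, Description, Category,
       ROW_NUMBER() OVER (PARTITION BY ProductID, ProductName, Description, Category
                          ORDER BY ProductID) AS RowNum
    FROM dbo.tblProduct
)
DELETE FROM duplicates
WHERE RowNum > 1;
```

To preview which rows would be removed, replace the DELETE statement with `SELECT * FROM duplicates WHERE RowNum > 1` and run that first.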

marc_s 2010-07-27 15:51:55

+1: Drats - beaten

OMG Ponies 2010-07-27 15:55:53

Nice Marc_s, is this a CTE query? If so, is it not necessary in CTE query to have a `union` clause?

Ankit Rathod 2010-07-27 15:58:02

@Nitesh Panchal: yes, CTE's are one of the underused features of SQL Server - as is the OVER() clause :-)

marc_s 2010-07-27 15:59:29

+1: I wasn't sure that you could issue a delete against a CTE like that, and before I could test it you had your answer posted :)

Tom H. 2010-07-27 16:08:26

@Tom H. Even I wasn't sure that a DELETE could be issued against a CTE. I was under the impression that CTEs are only used for recursive queries.

Ankit Rathod 2010-07-27 16:14:45

How to delete completely duplicate rows