ansaurus

Question

How to Delete Duplicate Rows in SQL 2000?

Answer 1


The trick is using the Primary Key column (you do have one, correct?) and simply finding the first PK value that matches the criteria you want. If for some crazy reason you do not have a primary key column, then add an Identity column and make it the primary key and then do the delete.
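
For the no-primary-key case, adding the identity column might look something like the sketch below. The names `ScoreId` and `PK_Scores` are made up for illustration and are not from the original table; SQL Server fills in the identity values for the existing rows when the column is added.

```sql
-- Sketch only: ScoreId and PK_Scores are hypothetical names.
-- Adds an auto-numbered column and makes it the primary key in one statement.
Alter Table Scores
    Add ScoreId Int Identity(1, 1) Not Null
        Constraint PK_Scores Primary Key
```

After that, the new column can stand in for `PrimaryKeyColumn` in the delete.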

EDIT: Revised to make it more generic. If you remove the final filter on ScoreTest, it will remove all duplicates based on ScoreStudentId, ScoreAdvisor and ScoreCorrect.

Delete Scores
Where Exists    (
                Select 1
                From Scores As S2
                Where S2.ScoreStudentId = Scores.ScoreStudentId
                        And S2.ScoreAdvisor = Scores.ScoreAdvisor
                        And S2.ScoreCorrect = Scores.ScoreCorrect
                Group By S2.ScoreStudentId, S2.ScoreAdvisor, S2.ScoreCorrect
                Having Count(*) > 1
                    And Min(S2.PrimaryKeyColumn) = Scores.PrimaryKeyColumn
                )
    And Scores.ScoreTest = 3284
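
Note that each run of the statement above removes one row per duplicate group (the one with the lowest key), so a group with three or more copies needs repeated runs until 0 rows are affected. If the goal is to keep exactly one row per group in a single pass, a variant along these lines should do it. This is a sketch that assumes you want to keep the row with the lowest key; `PrimaryKeyColumn` is the same placeholder name used above.

```sql
-- Sketch: delete every row that has a matching row with a lower key,
-- which leaves only the lowest-key row in each duplicate group.
Delete Scores
Where Exists    (
                Select 1
                From Scores As S2
                Where S2.ScoreStudentId = Scores.ScoreStudentId
                        And S2.ScoreAdvisor = Scores.ScoreAdvisor
                        And S2.ScoreCorrect = Scores.ScoreCorrect
                        And S2.PrimaryKeyColumn < Scores.PrimaryKeyColumn
                )
    And Scores.ScoreTest = 3284 -- optional filter, as in the original query
```

Switching `<` to `>` would keep the highest key instead. As with the original, rows where any of the three compared columns is NULL are never treated as duplicates, because `=` does not match NULLs.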

Thomas 2010-04-21 21:49:11

Thanks Thomas! This works great. So this selects 1 from the table joined to itself, does the group by and the count greater than 1. So this would work with any number of duplicates and will always be left with 1 unique record? Can someone explain the 'MIN'?

Mikecancook 2010-04-22 13:44:30

@Mikecancook - RE: Min, we need to choose one of the duplicates to remove. In this case, I'm choosing the duplicate with the lowest PK number. I could have just as easily used Max and removed the duplicate with the highest PK.

Thomas 2010-04-22 14:49:00

I must be missing something because I still can't get this to work. It just says 0 rows affected. I'll edit my post to reflect my changes.

Mikecancook 2010-04-22 16:14:19

@Mikecancook - Change the `Delete Scores` part of the query to `Select * From Scores` and run it; given what you said, you should get no rows. Now comment out the `And Min(S2.PrimaryKeyColumn) = Scores.PrimaryKeyColumn` line and see if you get anything. If not, then the question is whether there are still duplicates on those three columns.
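
One way to check for remaining duplicates directly is to group on the three columns and count. A rough sketch, assuming the same table and column names as the delete:

```sql
-- Lists each combination that still occurs more than once, with its count.
Select ScoreStudentId, ScoreAdvisor, ScoreCorrect, Count(*) As DuplicateCount
From Scores
Where ScoreTest = 3284 -- drop this filter to check the whole table
Group By ScoreStudentId, ScoreAdvisor, ScoreCorrect
Having Count(*) > 1
```

If this returns no rows, there is nothing left for the delete to remove.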

Thomas 2010-04-22 17:09:27

@Thomas - Ok, so I think I figured out what the problem is. Apparently, NewScoreID is a varchar datatype and not the INT PK I thought it was. I messed up the table moving it from a production environment to a testing environment. As usual, it was pilot error.
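
For anyone hitting the same thing, the column types can be confirmed before running the delete. A small sketch using the standard `INFORMATION_SCHEMA.COLUMNS` view, assuming the table is named `Scores`:

```sql
-- Shows each column of the table with its declared data type.
Select COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH
From INFORMATION_SCHEMA.COLUMNS
Where TABLE_NAME = 'Scores'
Order By ORDINAL_POSITION
```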

Mikecancook 2010-04-22 17:50:48
