views:

852

answers:

3

MS Access has a button to generate sql code for finding duplicated rows. I don't know if SQL Server 2005/2008 Managment Studio has this.

  1. If it has, please point where

  2. If it has not, please tell me how can I have a T-SQL helper for creating code like this.

+6  A: 

Well, if you have entire rows as duplicates in your table, you've at least not got a primary key set up for that table, otherwise at least the primary key value would be different.

However, here's how to build a SQL to get duplicates over a set of columns:

SELECT col1, col2, col3, col4
FROM table
GROUP BY col1, col2, col3, col4
HAVING COUNT(*) > 1

This will find rows which, for columns col1-col4, has the same combination of values, more than once.

For instance, in the following table, rows 2+3 would be duplicates:

PK    col1    col2    col3    col4    col5
1       1       2       3       4      6
2       1       3       4       7      7
3       1       3       4       7      10
4       2       3       1       4      5

The two rows share common values in columns col1-col4, and thus, by that SQL, is considered duplicates. Expand the list of columns to contain all the columns you wish to analyze this for.

Lasse V. Karlsen
You has a point, because the code is not as difficult as I expected. In other SQL languages it can be hard to code manually
Jader Dias
Shouldn't be, this is standard SQL, nothing specific to T-SQL. It should be the same for MySQL, SQLite, Oracle, Sybase, DB2, etc.
Lasse V. Karlsen
You're right. Lack of syntax highlighting and noisy code made me believe that the MS Access generated code was difficult to understand and I didn't even try before.
Jader Dias
A: 

AFAIK, it doesn't. Just make a select statement grouping by all the fields of a table, and filtering using a having clause where the count is greater than 1.

If your rows are duplicated except by the key, then don't include the key in the select fields.

eKek0
+2  A: 

If you're using SQL Server 2005+, you can use the following code to see all the rows along with other columns:

SELECT *, ROW_NUMBER() OVER (PARTITION BY col1, col2, col3, col4 ORDER BY (SELECT 0)) AS DuplicateRowNumber
FROM table

Youd can also delete (or otherwise work with) duplicates using this technique:

WITH cte AS
(SELECT *, ROW_NUMBER() OVER (PARTITION BY col1, col2, col3, col4 ORDER BY (SELECT 0)) AS DuplicateRowNumber
    FROM table
)
DELETE FROM cte WHERE DuplicateRowNumber > 1

ROW_NUMBER is extremely powerful - there is much you can do with it - see the BOL article on it at http://msdn.microsoft.com/en-us/library/ms186734.aspx

Mike DeFehr