Hi! I have a table (TableA) with contents like this:

Col1
-----
 A
 B
 B
 B
 C 
 C
 D

I want to remove just the duplicate values, without using a temporary table, in Microsoft SQL Server. Can anyone help me? The final table should look like this:

Col1
-----
 A
 B
 C 
 D

Thanks :)

+2  A: 

Can you use the row_number() function (http://msdn.microsoft.com/en-us/library/ms186734.aspx) to partition by the columns you're looking for dupes on, and delete where row number isn't 1?
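
A minimal sketch of that idea, assuming the question's TableA with its single column Col1. The CTE name Numbered and the alias rn are made up for illustration. SQL Server allows a DELETE to target a CTE that reads from a single table, so the numbered rows can be removed directly:

```sql
-- Sketch only: assumes TableA(Col1) from the question.
-- Numbered and rn are hypothetical names.
WITH Numbered AS (
    SELECT Col1,
           -- restart the numbering at 1 for every distinct Col1 value
           ROW_NUMBER() OVER (PARTITION BY Col1 ORDER BY Col1) AS rn
    FROM TableA
)
-- every row after the first in its partition is a duplicate
DELETE FROM Numbered
WHERE rn > 1;
```

Because the row number is computed per physical row, identical values still get different numbers, so one copy of each value survives.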

brydgesk
Doesn't work - if you do that, you only have the 'A', 'B' and 'C' values to work with - when you delete all 'B' with row_number > 1, it will delete **all** instances of 'B' since there's no way to distinguish them if there are no other columns in the data set...
marc_s
"The world is moving so fast these days that the man who says it can't be done is generally interrupted by someone doing it." brydgesk's could have been the accepted answer if he had not been discouraged from putting his idea into code. brydgesk could have tried doing it.
Hao
Nah, I wasn't discouraged. I just didn't feel this was a situation where the OP needed to be given code. All he needed was a concept. I've used row_number() for this exact purpose so I knew it was usable. Thanks for the support Hao.
brydgesk
A: 

I completely agree that having a unique identifier will save you a lot of time.

But if you can't use one (or if this is purely hypothetical), here's an alternative: Determine the number of rows to delete (the count of each distinct value -1), then loop through and delete top X for each distinct value.

Note that I'm not responsible for the number of kittens that are killed every time you use dynamic SQL.

declare @name varchar(50)
declare @sql varchar(max)
declare @numberToDelete int
declare List cursor for
    -- one row per distinct value, with the number of surplus copies
    select name, COUNT(name)-1 from #names group by name
OPEN List
FETCH NEXT FROM List 
INTO @name,@numberToDelete
WHILE @@FETCH_STATUS = 0
BEGIN
  IF @numberToDelete > 0
  BEGIN
    -- double any embedded single quotes so the value cannot break out of the string literal
    set @sql = 'delete top(' + CAST(@numberToDelete AS varchar(10)) + ') from #names where name=''' + REPLACE(@name, '''', '''''') + ''''
    print @sql
    exec(@sql)
  END
  FETCH NEXT FROM List INTO @name,@numberToDelete
END
CLOSE List
DEALLOCATE List
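
If dynamic SQL is a concern, the same loop can probably be written without it: since SQL Server 2005, TOP accepts a parenthesized expression, including a variable. A sketch under that assumption, reusing the #names table and variable names from the code above:

```sql
-- Sketch only: same #names(name) table as above, no dynamic SQL.
DECLARE @name varchar(50);
DECLARE @numberToDelete int;

DECLARE List CURSOR FOR
    -- one row per distinct value, with the number of surplus copies
    SELECT name, COUNT(name) - 1 FROM #names GROUP BY name;
OPEN List;
FETCH NEXT FROM List INTO @name, @numberToDelete;
WHILE @@FETCH_STATUS = 0
BEGIN
    IF @numberToDelete > 0
        -- TOP takes a parenthesized expression, so the count can be a variable
        DELETE TOP (@numberToDelete) FROM #names WHERE name = @name;
    FETCH NEXT FROM List INTO @name, @numberToDelete;
END
CLOSE List;
DEALLOCATE List;
```

Passing @name as an ordinary variable also avoids the quoting problems that come with concatenating it into a string.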

Another alternative would be to create a view with a generated identity. That way you could map the values to a unique identifier (allowing a conventional delete) without making a permanent addition to your table.

seraphym
A: 

Select the grouped data into a temp table, truncate the original table, then move the data back into the original.
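
Roughly, assuming the question's TableA(Col1), that no foreign keys reference the table (TRUNCATE TABLE would fail otherwise), and a temp table name, #Deduped, chosen only for illustration:

```sql
-- Sketch only: #Deduped is a hypothetical temp table name.
BEGIN TRANSACTION;

-- keep one row per distinct value
SELECT Col1 INTO #Deduped FROM TableA GROUP BY Col1;

-- TRUNCATE TABLE can be rolled back in SQL Server when run inside a transaction
TRUNCATE TABLE TableA;

INSERT INTO TableA (Col1)
SELECT Col1 FROM #Deduped;

COMMIT TRANSACTION;

DROP TABLE #Deduped;
```
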

Second solution: I am not sure whether it will work, but you can try opening the table directly in SQL Server Management Studio and using CTRL + DEL on the selected rows to delete them. That is going to be extremely slow, because you need to delete every single row by hand.

adopilot
+3  A: 
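-- Give every row a unique number so that duplicates can be told apart
WITH TableWithKey AS (
    SELECT ROW_NUMBER() OVER (ORDER BY Col1) As id, Col1 As val
    FROM TableA
)
-- Keep the lowest-numbered row of each value and delete the rest
DELETE FROM TableWithKey WHERE id NOT IN
(
    SELECT MIN(id) FROM TableWithKey
    GROUP BY val
)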
ewwwyn
This is a CTE, not a temp table, if that's acceptable?
ewwwyn
+1 interesting approach! And it works indeed. But still - why anyone would want to do this to himself is beyond me.... :-)
marc_s
A: 

You can remove duplicate rows using a cursor and DELETE .. WHERE CURRENT OF.

-- Sample data: 'Bob' appears twice
CREATE TABLE Client ([name] varchar(100))
INSERT Client VALUES('Bob')
INSERT Client VALUES('Alice')
INSERT Client VALUES('Bob')
GO
-- @history records every name already seen during the scan
DECLARE @history TABLE (name varchar(100) not null)
DECLARE @cursor CURSOR, @name varchar(100)
SET @cursor = CURSOR FOR SELECT name FROM Client
OPEN @cursor
FETCH NEXT FROM @cursor INTO @name
WHILE @@FETCH_STATUS = 0
BEGIN
    IF @name IN (SELECT name FROM @history)
        -- seen before: delete the row the cursor is positioned on
        DELETE Client WHERE CURRENT OF @cursor
    ELSE
        INSERT @history VALUES (@name)

    FETCH NEXT FROM @cursor INTO @name
END
CLOSE @cursor
DEALLOCATE @cursor
Anthony Faull