ansaurus

Question

delete all but minimal values, based on two columns in SQL Server table

Answer 1

A:

Sorry, I misunderstood the question.


SELECT col1, MIN(col2) AS col2
FROM test
GROUP BY col1

That returns the rows you want to keep, of course. Assuming you can't alter the table to add a unique identifier, you would need to do something like this to delete the rest:


DELETE FROM test
WHERE col1 + '|' + col2 NOT IN
(SELECT col1 + '|' + MIN(col2)
FROM test
GROUP BY col1)

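This should work as long as the pipe character never appears in your data and both columns are character types; with numeric columns, T-SQL would try to convert '|' to a number and fail.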
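
If the columns are not character types, the same idea might be sketched with explicit casts. The integer column types and the varchar(20) length here are assumptions for illustration, not something stated in the question:

```sql
-- Sketch only: assumes col1 and col2 are integer columns in a table named test.
-- Casting both sides to varchar keeps '+' as string concatenation.
DELETE FROM test
WHERE CAST(col1 AS varchar(20)) + '|' + CAST(col2 AS varchar(20)) NOT IN
    (SELECT CAST(col1 AS varchar(20)) + '|' + CAST(MIN(col2) AS varchar(20))
     FROM test
     GROUP BY col1);
```

Note that a NULL in col1 would make the concatenated value NULL, and NOT IN against a list containing NULL matches nothing, so this sketch also assumes the columns are not nullable.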

Jason Francis 2009-08-24 13:38:35

Doesn't really answer the question though. OP asked about deleting rows, not selecting them

Simon Nickerson 2009-08-24 13:52:50

Right. My brain isn't in gear yet. I think the correction should work.

Jason Francis 2009-08-24 13:55:34

Answer 2

A:

Ideally, you'd like to be able to say:

DELETE
FROM tbl
WHERE (col1, col2) NOT IN (SELECT col1, MIN(col2) AS col2 FROM tbl GROUP BY col1)

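Unfortunately, T-SQL does not allow row-value comparisons like that, but it does have a proprietary extension: a second FROM clause that lets the DELETE join to other tables. Joining to the per-group minimums and deleting the rows that find no match has the same effect (assuming col1 is never NULL):

DELETE
FROM tbl
FROM tbl
LEFT JOIN (SELECT col1, MIN(col2) AS col2 FROM tbl GROUP BY col1) AS m
    ON tbl.col1 = m.col1 AND tbl.col2 = m.col2
WHERE m.col1 IS NULL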

In general:

DELETE
FROM tbl
WHERE col1 + '|' + col2 NOT IN (SELECT col1 + '|' + MIN(col2) FROM tbl GROUP BY col1)

Or other workarounds.
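
One such workaround, as a rough sketch, is a correlated EXISTS: delete a row whenever another row in the same col1 group has a smaller col2. The alias name "smaller" is made up for this example:

```sql
-- Sketch only: removes every row for which a smaller col2 exists
-- within the same col1 group, leaving the minimum row(s) per group.
DELETE FROM tbl
WHERE EXISTS
    (SELECT 1
     FROM tbl AS smaller
     WHERE smaller.col1 = tbl.col1
       AND smaller.col2 < tbl.col2);
```

This avoids concatenation, so no delimiter character or cast is needed. Rows where col2 is NULL are never deleted, because the comparison is never true for them.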

Cade Roux 2009-08-24 13:46:30

Answer 3

+2 A:

This should work for you:

;
WITH NotMin AS
(
    SELECT Col1, Col2, MIN(Col2) OVER(Partition BY Col1) AS TheMin
    FROM Table1
)

DELETE Table1
--SELECT * 
FROM Table1
INNER JOIN NotMin
ON Table1.Col1 = NotMin.Col1 AND Table1.Col2 = NotMin.Col2 
    AND Table1.Col2 != TheMin

This uses a CTE (like a derived table, but cleaner) and the OVER clause to compute each group's minimum with less code. I also added a commented-out SELECT so you can see the matching rows (verify before deleting). This will work in SQL Server 2005/2008.
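
Another way to verify before deleting, sketched here as a suggestion and not part of the original answer, is to run the same DELETE inside a transaction, look at what survives, and roll back:

```sql
-- Sketch only: same statement as above, wrapped in a transaction for a dry run.
BEGIN TRANSACTION;

WITH NotMin AS
(
    SELECT Col1, Col2, MIN(Col2) OVER(PARTITION BY Col1) AS TheMin
    FROM Table1
)
DELETE Table1
FROM Table1
INNER JOIN NotMin
ON Table1.Col1 = NotMin.Col1 AND Table1.Col2 = NotMin.Col2
    AND Table1.Col2 != TheMin;

-- Inspect the rows that would remain.
SELECT Col1, Col2 FROM Table1 ORDER BY Col1, Col2;

-- Undo the delete; switch to COMMIT TRANSACTION once the result looks right.
ROLLBACK TRANSACTION;
```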

Thanks, Eric

Strommy 2009-08-24 14:24:22

If using large result-sets, this may not be optimal performance-wise. If that's the case, we can work on a better answer.

Strommy 2009-08-24 17:25:04

I like to use row_number() or rank() for this kind of thing personally... but it's still good and should be accepted.

Rob Farley 2009-08-26 00:53:25

Good point. I'd be interested in seeing your solution in that regard. I always like to see novel uses for the over clause. :-)

Strommy 2009-08-26 03:42:19
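
For reference, the ranking approach mentioned in the comments could look roughly like this. It is a sketch, not Rob Farley's actual solution, and the names Ranked and rnk are made up for the example:

```sql
-- Sketch only: rank rows within each Col1 group by Col2, then delete
-- everything that is not ranked first. Deleting through the CTE removes
-- the underlying rows from Table1.
;WITH Ranked AS
(
    SELECT Col1, Col2,
           RANK() OVER (PARTITION BY Col1 ORDER BY Col2) AS rnk
    FROM Table1
)
DELETE FROM Ranked
WHERE rnk > 1;
```

RANK() gives tied minimum rows the same rank, so all of them are kept, which matches the other answers. Swapping in ROW_NUMBER() would instead keep exactly one row per Col1, even when the minimum value is duplicated.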
