I have a table with some rows in it. Every row has a date field. Right now, the same date may appear in more than one row. I need to delete all the duplicates and keep only the row with the highest id for each date. How can I do this with a SQL query?

Now:

date      id
'07/07'   1
'07/07'   2
'07/07'   3
'07/05'   4
'07/05'   5

What I want:

date      id
'07/07'   3
'07/05'   5
+6  A: 
DELETE FROM table WHERE id NOT IN
    (SELECT MAX(id) FROM table GROUP BY date);
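
To check the query against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite as a stand-in for the real database, and the table name `entries` is hypothetical (chosen because `table` itself is a reserved word); other engines may behave differently.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real one.
# "entries" is a hypothetical table name; "table" is a reserved word.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (id INTEGER PRIMARY KEY, date TEXT)")
conn.executemany(
    "INSERT INTO entries (id, date) VALUES (?, ?)",
    [(1, "07/07"), (2, "07/07"), (3, "07/07"), (4, "07/05"), (5, "07/05")],
)

# Keep only the row with the highest id for each date.
conn.execute(
    "DELETE FROM entries WHERE id NOT IN "
    "(SELECT MAX(id) FROM entries GROUP BY date)"
)

remaining = conn.execute("SELECT date, id FROM entries ORDER BY id").fetchall()
print(remaining)  # [('07/07', 3), ('07/05', 5)]
```

The remaining rows match the "What I want" table in the question.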
Georg
Wow, did I go a roundabout way or what? This is definitely the best way to do this.
Eric
I thought your way was a bit too complicated... But honestly, at first I wanted to do it with 3 queries instead of just this one.
Georg
This query is also useful for this answer: SELECT date, COUNT(date) AS NumOccurrences FROM table GROUP BY date HAVING (COUNT(date) > 1)
djangofan
@djangofan: almost, you just have to select id instead of COUNT(date).
Georg
A: 

For MySQL, PostgreSQL, and Oracle, a better way is a self join.

PostgreSQL:
DELETE FROM table t1 USING table t2 WHERE t1.date=t2.date AND t1.id<t2.id;

MySQL:
DELETE FROM table
USING table, table as vtable
WHERE (table.id < vtable.id)
AND (table.date=vtable.date)

SQL aggregate queries (MAX, GROUP BY) are almost always very slow.
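
The self-join idea can also be tried on the sample data from the question. This is only a sketch under assumptions: SQLite (via Python's sqlite3 module) does not accept the `DELETE ... USING` forms above, so a correlated `EXISTS` subquery is swapped in to express the same condition, "delete a row if another row has the same date and a higher id". The table name `entries` and the alias `newer` are hypothetical.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real one.
# "entries" is a hypothetical table name; "table" is a reserved word.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (id INTEGER PRIMARY KEY, date TEXT)")
conn.executemany(
    "INSERT INTO entries (id, date) VALUES (?, ?)",
    [(1, "07/07"), (2, "07/07"), (3, "07/07"), (4, "07/05"), (5, "07/05")],
)

# Swapped-in technique: a correlated EXISTS instead of DELETE ... USING.
# A row is deleted when a row with the same date and a higher id exists.
conn.execute(
    "DELETE FROM entries WHERE EXISTS ("
    "SELECT 1 FROM entries AS newer "
    "WHERE newer.date = entries.date AND newer.id > entries.id)"
)

survivors = conn.execute("SELECT date, id FROM entries ORDER BY id").fetchall()
print(survivors)  # [('07/07', 3), ('07/05', 5)]
```

Whether this is actually faster than the GROUP BY version depends on the engine, the indexes, and the data size, so it is worth measuring both on the real table.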

iddqd