views:

96

answers:

3

In PostgreSQL 8.3, let's say I have a table called widgets with the following:

 id  | type | count
--------------------
  1  | A    |   21
  2  | A    |   29
  3  | C    |    4
  4  | B    |    1
  5  | C    |    4
  6  | C    |    3
  7  | B    |   14

I want to remove duplicates based upon the type column, leaving only those with the highest count column value in the table. The final data would look like this:

 id  | type | count
--------------------
  2  | A    |   29
  3  | C    |    4  /* `id` for this record might be '5' depending on your query */
  7  | B    |   14

I feel like I'm close, but I can't seem to wrap my head around a query that works to get rid of the duplicate columns.

+2  A: 

count is a sql reserve word so it'll have to be escaped somehow. I can't remember the syntax for doing that in Postgres off the top of my head so I just surrounded it with square braces (change it if that isn't correct). In any case, the following should theoretically work (but I didn't actually test it):

delete from widgets where id not in (
select max(w2.id) from widgets as w2 inner join 
(select max(w1.[count]) as [count], type from widgets as w1 group by w1.type) as sq
on sq.[count]=w2.[count] and sq.type=w2.type group by w2.[count]
);
Asaph
Thanks, that did the trick! :) I wasn't using "count" as the actual column name, this is just a dumbed down example that gets to the heart of what I'm trying to do. I appreciate the help!
Matt Huggins
@Matt Huggins: You're welcome :)
Asaph
A: 

According to your requirements, seems to me that this should work:

DELETE 
FROM widgets
WHERE type NOT IN 
(
    SELECT type, MAX(count) 
    FROM widgets 
    GROUP BY type
)
David Elizondo
If I am right, your query don't delete anything!
Erlock
+1  A: 

There is a slightly simpler answer than Asaph's, with EXISTS SQL operator :

DELETE FROM widgets AS a 
WHERE EXISTS 
  (SELECT * FROM widgets AS b 
   WHERE (a.type = b.type AND b.count > a.count) 
          OR (b.id > a.id AND a.type = b.type AND b.count = a.count))

EXISTS operator returns TRUE if the following SQL statement returns at least one record.

Erlock