As the title suggests, I'd like to select the first row of each set of rows grouped with a GROUP BY.

Specifically, if I've got a "purchases" table that looks like this:

> SELECT * FROM purchases;
id | customer | total
 1 | Joe      | 5
 2 | Sally    | 3
 3 | Joe      | 2
 4 | Sally    | 1

I'd like to query for the id of the largest purchase made by each customer. Something like this:

> SELECT FIRST(id), customer, FIRST(total)
. FROM purchases
. GROUP BY customer
. ORDER BY total DESC;
FIRST(id) | customer | FIRST(total)
        1 | Joe      | 5
        2 | Sally    | 3
A: 

Select Top 1 .....

griegs
1) TOP is SQL Server only 2) Only the first record returned, doesn't reproduce the output the OP lists
OMG Ponies
As @OMG said, this will only return the first row. I want the first row in each set of group by'd rows (see the example)
David Wolever
+9  A: 

On Oracle 9i+/SQL Server 2005+/PostgreSQL 8.4+:

WITH summary AS (
    SELECT p.id, 
           p.customer, 
           p.total, 
           ROW_NUMBER() OVER(PARTITION BY p.customer 
                                 ORDER BY p.total DESC) AS rk
      FROM PURCHASES p)
SELECT s.*
  FROM summary s
 WHERE s.rk = 1
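
To try the ROW_NUMBER query on the sample data from the question, here is a minimal sketch. It assumes SQLite 3.25+ (for window functions) driven through Python's standard `sqlite3` module, purely as a stand-in for the databases listed above; the `top_purchases` name and the outer `ORDER BY s.customer` are additions for the sake of a stable result.

```python
import sqlite3

# Sample data copied from the question. SQLite is an assumption made only
# so the sketch is runnable; it is not one of the databases named above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (id INTEGER PRIMARY KEY, customer TEXT, total INTEGER)"
)
conn.executemany(
    "INSERT INTO purchases (id, customer, total) VALUES (?, ?, ?)",
    [(1, "Joe", 5), (2, "Sally", 3), (3, "Joe", 2), (4, "Sally", 1)],
)

# Number each customer's purchases from largest to smallest total,
# then keep only the row numbered 1 in each partition.
query = """
WITH summary AS (
    SELECT p.id,
           p.customer,
           p.total,
           ROW_NUMBER() OVER (PARTITION BY p.customer
                              ORDER BY p.total DESC) AS rk
      FROM purchases p)
SELECT s.id, s.customer, s.total
  FROM summary s
 WHERE s.rk = 1
 ORDER BY s.customer
"""
top_purchases = conn.execute(query).fetchall()
print(top_purchases)  # [(1, 'Joe', 5), (2, 'Sally', 3)]
```

If two purchases by the same customer tie on total, ROW_NUMBER still returns one row, but which one is arbitrary unless a second sort key such as `p.id` is added to the window's ORDER BY.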

Supported by any database, but you need to add logic to break ties:

  SELECT MIN(x.id),  -- change to MAX if you want the highest
         x.customer, 
         x.total
    FROM PURCHASES x
    JOIN (SELECT p.customer,
                 MAX(total) AS max_total
            FROM PURCHASES p
        GROUP BY p.customer) y ON y.customer = x.customer
                              AND y.max_total = x.total
GROUP BY x.customer, x.total
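
To see why the tie-breaking matters, here is a rough sketch, again assuming SQLite through Python's `sqlite3` module as a stand-in database. Row 5 (a second purchase of 5 by Joe) is hypothetical and is not in the question's data; it is added only to create a tie. The variable names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (id INTEGER PRIMARY KEY, customer TEXT, total INTEGER)"
)
conn.executemany(
    "INSERT INTO purchases (id, customer, total) VALUES (?, ?, ?)",
    [
        (1, "Joe", 5), (2, "Sally", 3), (3, "Joe", 2), (4, "Sally", 1),
        (5, "Joe", 5),  # hypothetical extra row: ties Joe's largest total
    ],
)

# Derived table shared by both queries: each customer's largest total.
max_per_customer = """
    (SELECT p.customer, MAX(p.total) AS max_total
       FROM purchases p
      GROUP BY p.customer) y ON y.customer = x.customer
                            AND y.max_total = x.total
"""

# Without tie-breaking, every row matching the maximum comes back.
tied = conn.execute(
    "SELECT x.id, x.customer, x.total FROM purchases x JOIN "
    + max_per_customer
    + " ORDER BY x.id"
).fetchall()
print(tied)  # [(1, 'Joe', 5), (2, 'Sally', 3), (5, 'Joe', 5)]

# MIN(x.id) plus GROUP BY collapses the tied rows to one per customer.
one_per_customer = conn.execute(
    "SELECT MIN(x.id) AS id, x.customer, x.total FROM purchases x JOIN "
    + max_per_customer
    + " GROUP BY x.customer, x.total ORDER BY x.customer"
).fetchall()
print(one_per_customer)  # [(1, 'Joe', 5), (2, 'Sally', 3)]
```

Swapping MIN for MAX would keep id 5 for Joe instead of id 1.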
OMG Ponies
@vol7ron: Both versions reproduce the output the OP identified, how is that "retarded"? I gave an entirely correct answer, and I commented (read: NOT DOWNVOTED) about why your answer was incorrect.
OMG Ponies
I took my downvote away, but still retarded for not identifying a fully qualified solution from a helpful answer, which is based on the question. Everyone here answered the question asked, not the question that should have been asked as your `WITH` solution addresses.
vol7ron
@vol7ron: You didn't reverse your downvote; I have to edit the answer before you can. Anyone with enough rep can expand the vote counter to see there's still two downvotes. The fact is you blamed me for others downvoting your incomplete answer, and either Tim or griegs agreed with you.
OMG Ponies
Forgive my ignorance, but in what way is this not a "fully qualified solution"? And what is "the question that should have been asked"?
David Wolever
[@OMG Ponies:](http://stackoverflow.com/users/135152/omg-ponies) yes I did, though, I didn't upvote your answer. I can see the +/- and all that means is that 2 others disagreed with your answer. BTW, you can vote and un-vote in a certain period of time w/o an edit by the original question. I think if I wanted to up-vote it now too much time has passed and you are correct an edit would be needed now.
vol7ron
Thanks for the reply, @OMG. I was hoping it would be possible without subqueries… But I guess not. Anyway, could the second ("supported in any database") version be trivially modified to deal with duplicate totals (eg, so only one is returned)? Or would a similar sub query need to be used?
David Wolever
@David Wolever: How would you like to decide ties, should they be encountered? ROW_NUMBER ensures it's not a concern...
OMG Ponies
[@David Wolever:](http://stackoverflow.com/users/71522/david-wolever) The question was about limiting, for which `LIMIT` (PgSQL), `TOP` (SQL Server, etc.), or `FETCH FIRST 1 ROWS ONLY` (DB2) can be used correctly in a subquery. Using a `WITH` block is limited to later RDBMS systems, but more importantly `GROUP BY`, as noted, is not necessary.
vol7ron
@OMG: the first version has ROW_NUMBER, but the second doesn't (so, eg, if "Joe" made two orders with a total of `5`, they would both be returned). Any method for resolving it is OK (eg, `MIN(id)`) — I'm just curious at this point.
David Wolever
@David Wolever: Yes, using MIN/MAX and grouping by the other columns is the easiest way to break ties in the second option.
OMG Ponies
BTW, I would like to add that when I say "retarded" I use that term loosely. I don't get mad at SO or any members from it. As OMG Ponies should know, I always appreciate his questions/answers/contributions. I just re-read what I wrote and it sounded mean, so I wanted to clarify.
vol7ron
@OMG D'oh. Of course. Thanks again for the help.
David Wolever
@OMG don't delete anything, it's not worth it. BTW I didn't have a downvote for you, as I stated, I already took it away. If a downvote is removed, then it was one of the 2 other people that downvoted your answer. I did, however just upvote your answer. Update: it looks like someone did remove one of the downvotes after posting this; perhaps someone is following my comments - secret admirer: please identify yourself :)
vol7ron
@vol7ron: As you wish, thank you.
OMG Ponies