views:

105

answers:

4
+1  Q: 

group by problem

SQL group by problem 

I have a SQL group by problem. My table has the following form.

Cust_id.  Price_id     Price. 
----------------------------  
1.          556.        5000. 
-----------------------------  
2.          654.         600. 
2.          432.         487. 
2.          546.         500. 
---------------------------  
3.          455.         200. 
3.          877.         143. 
3.          123.         879. 

Now when I run this query:

Select  cust_id,  max(price) as max, min(price) as min. 
From table. 
Group by cust_id. 

I get.

Cust_id.      Max.        Min. 
1.           5000.       5000. 
2.            600.        487. 
3.            879.        143. 

But what I really want is not the max and min price but the price_id associated with the price.
So the results would be.

Cust_id.       Max.        Min.   
1.             556.        556.   
2.             654.        432.   
3.             123.        877.    

I am at a loss for how to do this. I think that the above query would be a sub query of some sort but that is as far as I got.  

+4  A: 

Use:

   SELECT x.cust_id,
          y.price_id AS max,
          z.price_id AS min
     FROM (SELECT t.cust_id,  
                  MAX(t.price) as max, 
                  MIN(t.price) as min
             FROM TABLE t
         GROUP BY t.cust_id) x
LEFT JOIN TABLE y ON y.cust_id = x.cust_id
                 AND y.price = x.max
LEFT JOIN TABLE z ON z.cust_id = x.cust_id
                 AND z.price = x.min

The problem is that if a cust_id has two records with the same high (or low) price, you'll see duplicates and will need to provide logic to deal with ties.

OMG Ponies
cuts_id vs cust_id? The question is confused...
Jonathan Leffler
@Jonathan Leffler: I used the query the OP provided, building off it.
OMG Ponies
@OMG Ponies - needs a tweak - OP wants the max/min price price_ids, not the prices.
martin clayton
@martin clayton: Corrected, thx.
OMG Ponies
A: 

That should do the trick for engines with ranking/analytic functions:

SELECT Pmin.Cust_id, Pmax.Price_id Price_max_id, Pmin.Price_id Price_min_id FROM
(SELECT t.*, ROW_NUMBER() OVER (PARTITION BY t.Cust_id ORDER BY t.Price DESC) ix FROM @table t) Pmin
JOIN (SELECT t.*, ROW_NUMBER() OVER (PARTITION BY t.Cust_id ORDER BY t.Price ASC) ix FROM @table t) Pmax
    ON Pmin.Cust_id = Pmax.Cust_id
WHERE (Pmin.ix = 1) AND (Pmax.ix = 1)
Lucero
Using RANK can return duplicates; ROW_NUMBER would ensure that duplicates aren't returned.
OMG Ponies
Yep, was fixing that already before I saw your comment. Thanks though.
Lucero
A: 

This is the classic problem that most people who use MySQL have with GROUP BY. MySQL permits queries that are not permitted in standard SQL, and in most other brands of database.

What you need is the full row, including columns other than the cust_id you're grouping by, such that the row has the highest (or lowest) price in that group. You can't get that from GROUP BY.

What I really want is... the price_id associate with the price.

But which price_id do you want, the one from the row with the max price, or the one from the row with the min price? These could be different rows.

Cust_id.  Price_id     Price
----------------------------
2.          654          600  <-- max price, price_id 654
2.          432          487  <-- min price, price_id 432
2.          546          500

What if multiple rows have the same price, but different price_id's? Should it return 654 or 546?

Cust_id.  Price_id     Price
----------------------------
2.          654          600  <-- max price, price_id 654
2.          432          487
2.          546          600  <-- max price, price_id 546

Instead, if you want the price_id of the max and min prices, what you want are two rows: the row such that no other row exists with the same cust_id and a higher price, and the row such that no other row exists with the same cust_id and a lower price.

SELECT tmax.cust_id, tmax.price_id, tmax.price, tmin.price_id, tmin.price
FROM table tmax
JOIN table tmin ON tmax.cust_id = tmin.cust_id
WHERE NOT EXISTS (SELECT * FROM table t1 WHERE t1.cust_id = tmax.cust_id AND t1.price > tmax.price) 
  AND NOT EXISTS (SELECT * FROM table t2 WHERE t2.cust_id = tmin.cust_id AND t2.price > tmin.price) 
Bill Karwin
A: 

Here's a SQL Server approach

with Data as 
(
    select 1 Cust_id, 556 Price_id,  5000  Price union ALL
    select 2,          654,          600 union ALL
    select 2,          432,          487 union ALL
    select 2,          546,          500 union ALL
    select 3,          455,           200 union ALL
    select 3,          877,           143 union ALL
    select 3,          123,           879
),
Prices as
(
    select Cust_id, MAX(Price) MaxP, MIN(Price) MinP
    from Data
    group by Cust_id
)
select Prices.Cust_id
        ,Data.Price MaxPrice
        , d2.Price MinPrice
from Prices
inner join Data  on Data.Cust_id = Prices.Cust_id and Data.Price = Prices.MaxP
inner join Data d2 on d2.Cust_id = d2.Cust_id and d2.Price = Prices.MinP
Scott Weinstein
This also has the duplicates problem.
Lucero