+6  Q: 

Max of Sum in SQL

I have a list of stores, departments within the stores, and sales for each department, like so (created using max(sales) in a subquery, though I don't think that's terribly important here):

toronto    baskets 500
vancouver  baskets 350
halifax    baskets 100
toronto    noodles 275
vancouver  noodles 390
halifax    noodles 120
halifax    fish    200

I would like to ask for the highest-selling department at each store. The results should look like this:

toronto    baskets 500
vancouver  noodles 390
halifax    fish    200

Whenever I use GROUP BY, it includes all the listings from my subquery. Is there a nice clean way to do this without a temporary table?

+2  A: 

This works in Oracle; other implementations may have different syntax for analytic functions (or lack them entirely):

select store
     , max(department) keep(dense_rank last order by sales)
     , max(sales)
  from (
        ...query that generates your results...
       )
 group by store
Noah Yetter
Yeah, I was going to propose that, but I think he's on SQL Server (as anyone using oracle would've specified...)
TheSoftwareJedi
+1  A: 

This will work in SQL Server, as of 2005:

with data as
(select store, department, sales
from <your query>),
 maxsales as
(select store,  sales = max(sales)
from data
group by store)
select m.store,
       (select top 1 department
        from data
        where store = m.store and sales = m.sales
        order by [your criteria for ties]) as department,
       m.sales
from maxsales m

I'm assuming you only want to display 1 department in the event of ties, hence the top 1 and [your criteria for ties] to distinguish among them.
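
As a rough illustration of the "top 1 plus tie criteria" idea, here is a minimal sketch that runs the same shape of query through Python's built-in sqlite3 module rather than SQL Server. SQLite writes LIMIT 1 where T-SQL writes TOP 1. The sample rows and the tie rule (alphabetically first department) are assumptions made up for the example, not part of the original answer.

```python
import sqlite3

# Made-up sample data with a tie in toronto (baskets and noodles both at 500).
SAMPLE = [
    ("toronto", "baskets", 500),
    ("toronto", "noodles", 500),
    ("toronto", "fish", 300),
    ("halifax", "fish", 300),
    ("halifax", "baskets", 200),
]

def one_department_per_store(rows):
    """Return one (store, department, sales) row per store, breaking ties."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE data (store TEXT, department TEXT, sales INTEGER)")
    conn.executemany("INSERT INTO data VALUES (?, ?, ?)", rows)
    result = conn.execute("""
        WITH maxsales AS (
            SELECT store, MAX(sales) AS sales
            FROM data
            GROUP BY store
        )
        SELECT m.store,
               -- Assumed tie rule: alphabetically first department wins.
               (SELECT d.department
                FROM data d
                WHERE d.store = m.store AND d.sales = m.sales
                ORDER BY d.department
                LIMIT 1) AS department,
               m.sales
        FROM maxsales m
        ORDER BY m.store
    """).fetchall()
    conn.close()
    return result

print(one_department_per_store(SAMPLE))
```

With this data the tie in toronto resolves to baskets, so each store comes back exactly once.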

Jeffrey Meyer
A: 

Maybe this could work. Haven't tried it though, there could be a better solution...

select yourTable.store, dept, sales
from yourTable
join (
  select store, max(sales) as maxSales from yourTable group by store
) tempTable on tempTable.store = yourTable.store 
           and tempTable.maxSales = yourTable.sales
Mr. Brownstone
Oops, I posted a similar solution a few mins late. This query won't run as posted: the ) before group by has to go, the max(sales) in the tempTable has no name, and the columns in the select need to specify their source. Don't want to be anal, but if someone comes along later, I wanted it to be clear.
Pete
ok, I fixed it, thanks for noticing
Mr. Brownstone
+3  A: 

This works in SQL Server (2000 and above for sure):

SELECT a.Store, a.Department, a.Sales
FROM temp a
INNER JOIN 
(SELECT store, max(sales) as sales
FROM temp
GROUP BY Store) b
ON a.Store = b.Store AND a.Sales = b.Sales;
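
To check this join against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. The table name dept_sales is hypothetical and takes the place of temp above; the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# Rows copied from the question.
ROWS = [
    ("toronto", "baskets", 500),
    ("vancouver", "baskets", 350),
    ("halifax", "baskets", 100),
    ("toronto", "noodles", 275),
    ("vancouver", "noodles", 390),
    ("halifax", "noodles", 120),
    ("halifax", "fish", 200),
]

def top_departments(rows):
    """Join each row to its store's maximum sales and keep the matches."""
    conn = sqlite3.connect(":memory:")
    # dept_sales is a hypothetical name standing in for the original query.
    conn.execute(
        "CREATE TABLE dept_sales (store TEXT, department TEXT, sales INTEGER)"
    )
    conn.executemany("INSERT INTO dept_sales VALUES (?, ?, ?)", rows)
    result = conn.execute("""
        SELECT a.store, a.department, a.sales
        FROM dept_sales a
        INNER JOIN (SELECT store, MAX(sales) AS sales
                    FROM dept_sales
                    GROUP BY store) b
          ON a.store = b.store AND a.sales = b.sales
        ORDER BY a.store
    """).fetchall()
    conn.close()
    return result

for row in top_departments(ROWS):
    print(row)
```

On the question's data this yields halifax/fish/200, toronto/baskets/500 and vancouver/noodles/390. If two departments in a store shared the top figure, both rows would come back.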
Pete
A: 

This will work in SQL Server without temp tables:

SELECT Store, Department, Sales FROM
(SELECT Store, Department, Sales,
DENSE_RANK()  OVER (PARTITION BY Store
ORDER BY Sales DESC) AS Dense_Rank
FROM Sales) A WHERE Dense_Rank = 1

Here the "Sales" table in the FROM clause stands in for your original query.

Turnkey
This solution is SQL 2005 and above only.
Sean Carpenter
+1  A: 

This will work

Select Store, Department, Sales
From yourTable A
Where Sales = (Select Max(Sales)
               From YourTable
               Where Store = A.Store)
Charles Bretana
+1  A: 

My 2 solutions for SQL 2005 are below. The other ones I can see so far may not return the correct data if two of the sales figures are the same. That depends on your needs though.

The first uses the Row_Number() function: all the rows are ranked from the lowest to the highest sales (with some tie-breaking rules), then the highest rank is chosen per store to get the result.

You can try adding a Partition By clause to the Row_Number function (see BOL) and/or investigate using an inner join instead of an "in" clause.

The second, borrowing on Turnkey's idea, again ranks them, but partitions by store, so we can choose the first ranked one. Dense_Rank will possibly give two identical rows the same rank, so if store and department were not unique, it could return two rows. With Row_number the number is unique in the partition.

One thing to be aware of is that this may be slow, but it would be faster for most data sets than the sub-query in one of the other solutions. In that solution, the sub-query would have to be run once per row (including sorting etc.), which could result in a lot of queries.

Other queries that select the max sales per store and return the data that way will return duplicate rows for a store if two departments happen to have the same sales. The last query shows this.

DECLARE @tbl as TABLE (store varchar(20), department varchar(20), sales int)

INSERT INTO @tbl VALUES ('Toronto', 'Baskets', 500)
INSERT INTO @tbl VALUES ('Toronto', 'Noodles', 500)
INSERT INTO @tbl VALUES ('Toronto', 'Fish', 300)
INSERT INTO @tbl VALUES ('Halifax', 'Fish', 300)
INSERT INTO @tbl VALUES ('Halifax', 'Baskets', 200)

-- Expect Toronto/Noodles/500 and Halifax/Fish/300

;WITH ranked AS -- Rank the rows by sales from 1 to x
(
    SELECT 
     ROW_NUMBER() OVER (ORDER BY sales, store, department) as 'rank', 
     store, department, sales
    FROM @tbl
)

SELECT store, department, sales
FROM ranked
WHERE rank in (
    SELECT max(rank) -- choose the highest ranked per store
    FROM ranked
    GROUP BY store
)

-- Another way
SELECT store, department, sales
FROM (
    SELECT 
     DENSE_RANK() OVER (PARTITION BY store
                        ORDER BY sales desc, store desc, department desc) as 'rank',
     store, department, sales
    FROM @tbl
) tbl
WHERE rank = 1


-- This will bring back 2 rows for Toronto
select tbl.store, department, sales
from @tbl tbl
    join (
     select store, max(sales) as maxSales from @tbl group by store
    ) tempTable on tempTable.store = tbl.store 
           and tempTable.maxSales = tbl.sales
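
The Partition By suggestion for Row_Number mentioned earlier can be sketched as follows. This is only an illustration run through Python's built-in sqlite3 module (window functions need SQLite 3.25 or later), not SQL Server itself. The alias rn and the tie rule (department descending, to match the expected Toronto/Noodles result) are assumptions for the example.

```python
import sqlite3

# Same rows as the @tbl table variable above.
ROWS = [
    ("Toronto", "Baskets", 500),
    ("Toronto", "Noodles", 500),
    ("Toronto", "Fish", 300),
    ("Halifax", "Fish", 300),
    ("Halifax", "Baskets", 200),
]

def best_per_store(rows):
    """Number rows within each store by sales and keep only row 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl (store TEXT, department TEXT, sales INTEGER)")
    conn.executemany("INSERT INTO tbl VALUES (?, ?, ?)", rows)
    result = conn.execute("""
        SELECT store, department, sales
        FROM (
            SELECT store, department, sales,
                   -- Numbers are unique inside a partition, so ties on
                   -- sales still yield exactly one row per store.
                   ROW_NUMBER() OVER (
                       PARTITION BY store
                       ORDER BY sales DESC, department DESC
                   ) AS rn
            FROM tbl
        ) ranked
        WHERE rn = 1
        ORDER BY store
    """).fetchall()
    conn.close()
    return result

print(best_per_store(ROWS))
```

Because the numbering restarts for every store, no second pass with max(rank) and "in" is needed, and the Toronto tie produces a single row (Noodles) rather than two.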
Robert Wagner