views:

66

answers:

5

HI,

This should be easy but I don't understand enough about how grouping works.

Basically I have 2 tables "Categories" and "Items"

Categories
ID
CategoryName

Items
ID
CategoryID ItemName
Photo
Score

All I want to do is get 1 row for each category which contains the Category ID, the Category Name and the photo that belongs to the highest scoring item.

So I have tried joining the categories to the items and grouping by the CategoryID. Trouble is that I want to order the items so that the highest scoring items are at the top before it does the groupings to make sure that the photo is from the current highest scoring item in that category. If I select MAX(I.score) I can get the highest score but I'm not sure how to get accompanying photo as MAX(photo) will obviously give me the photo with the highest file name alphabetically.

I hope I've explained that well.

+1  A: 

You need to aggregate first and join back like this. (If you change grouping, you need to change JOIN)

SELECT
 ...
FROM
    (
    select
        max(Score) AS MaxScore,
        CategoryID
    FROM
        Items
    GROUP BY
        CategoryID
    ) M
    JOIN 
    Items I ON M.CategoryID = I.CategoryID AND M.MaxScore = I.Score
    JOIN
    Categories C ON I.CategoryID = C.CategoryID
gbn
Bear in mind that this solution will produce more than one result for identically-scored items.
Adam Robinson
Agreed with @Adam - I like the ROW_NUMBER() solution better because you can further dictate in the OVER clause how you want to break ties.
Aaron Bertrand
Correct. However, it does scale better. I've tested both ways (and answered both ways) on all balance the larger the data set the more I'd use an aggregate. Saying that, "joint top" may be the behaviour you want
gbn
+2  A: 

You could try something like (Full example)

DECLARE @Categories TABLE(
    ID INT,
    CategoryName VARCHAR(50)
)

DECLARE @Items TABLE(
    ID INT,
    CategoryID INT,
    ItemName VARCHAR(50),
    Photo VARCHAR(50),
    Score FLOAT
)

INSERT INTO @Categories (ID,CategoryName) SELECT 1, 'Cat1'
INSERT INTO @Categories (ID,CategoryName) SELECT 2, 'Cat2'

INSERT INTO @Items (ID,CategoryID,ItemName,Photo,Score) SELECT 1, 1, 'Item1', 'PItem1', 1
INSERT INTO @Items (ID,CategoryID,ItemName,Photo,Score) SELECT 2, 1, 'Item2', 'PItem2', 2
INSERT INTO @Items (ID,CategoryID,ItemName,Photo,Score) SELECT 3, 1, 'Item3', 'PItem3', 3

INSERT INTO @Items (ID,CategoryID,ItemName,Photo,Score) SELECT 4, 2, 'Item4', 'PItem4', 5
INSERT INTO @Items (ID,CategoryID,ItemName,Photo,Score) SELECT 5, 2, 'Item5', 'PItem5', 2

SELECT  *
FROM    (
            SELECT  c.ID,
                    c.CategoryName,
                    i.Photo,
                    i.Score,
                    ROW_NUMBER() OVER(PARTITION BY i.CategoryID ORDER BY i.Score DESC) RowID
            FROM    @Categories c INNER JOIN
                    @Items i ON c.ID = i.CategoryID
        ) CatItems
WHERE   RowID = 1

Using the ROW_NUMBER you can selet the items you require.

astander
Unfortunately this produces an error - "The OVER SQL construct or statement is not supported."
jonhobbs
Managed to get it to work anyway, it was more of a "warning" than an error.
jonhobbs
+1  A: 

This is a pretty common problem, and one that SQL Server doesn't solve particularly well. Something like this should do the trick, though:

select
    c.ID,
    c.CategoryName,
    item.*

from Categories c

join (
    select 
        ID, 
        CategoryID, 
        ItemName, 
        Photo, 
        Score, 
        (row_number() over order by CategoryID, Score desc) - 
            (rank() over order by CategoryID) as rownum 

        from Items) item on item.CategoryID = c.CategoryID and item.rownum = 0

While there is no explicit group by clause, this (for practical purposes) groups the Categories records and gives you a joined statement that allows you to view any property of the highest scoring item.

Adam Robinson
Right, row_number() starts at 1 though :)
Andomar
@Andomar: Yes, but the first row in a particular rank will have the same value as the `rank()` function, so subtracting them gives you 0 for the first row.
Adam Robinson
A: 

You can use row numbers to rank items per category:

select *
from (
    select 
        row_number() over (partition by c.id order by i.score desc) rn
    ,    *
    from Categories c
    join Items i on c.ID = i.CategoryID
) sub
where rn = 1

In SQL 2005, you can't reference a row_number() directly in a where, so it's wrapped in a subquery.

Andomar
A: 

Exactly as you worded it: "the Category ID, the Category Name and the photo that belongs to the highest scoring item." -- Now here I surmise you really meant "...highest scoring item in that category", no?)

  Select CategoryID, c.Categoryname, Photo
  From items i Join Categoiries c
      On c.ID = i.CategoryId
  Where Score = (Select Max(Score) From Items
                 Where CategoryID = i.CategoryId)

If you really meant the highest scoring item on the whole items table, then just omit the predicate in the subquery

  Select CategoryID, c.Categoryname, Photo
  From items i Join Categoiries c
      On c.ID = i.CategoryId
  Where Score = (Select Max(Score) From Items)

Both these queries will return multiple rows per group if there are more than one item in the defined group which tie for highest score..

Charles Bretana