views:

297

answers:

6

Hey everyone,

Trying to work on a query that will return the top 3 selling products with the three having a distinct artist. Im getting stuck on getting the unique artist.

Simplified Table schema

Product
  ProductID
  Product Name
  Artist Name

OrderItem
 ProductID
 Qty


So results would look like this...

PID    artist                 qty
34432, 'Jimi Hendrix',        6543
54833, 'stevie ray vaughan'   2344
12344, 'carrie underwood',    1
A: 

Try this

Select top 3 artist, count(artist) from tablename group by artist order by artist count(artist) desc

+1  A: 

If I have understood your schema correctly, you should be able to do it like this:

select top 3 * from(
   select p.ProductId, p.ArtistName, sum(o.qty) as qty from Product p, OrderItem o
   where p.ProductId = o.ProductId
   group by p.productId, p.ArtistName
   order by sum(o.qty)
)
klausbyskov
tried running it but get this error. Msg 1033, Level 15, State 1, Line 7The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP or FOR XML is also specified.
chopps
A: 

Analyzing your request, it sounds like the results should be the highest product quantity for the top three artists. So, if Jimi Hendrix has the top 10 product quantities and Stevie Ray Vaughan is 11th, you want Jimi with his highest product then Stevie with his highest product.

With ProductRanksForArtists As
    (
    Select P.ProductId, P.ArtistName, Sum(O.Qty) As Total
        , ROW_NUMBER OVER( PARTITION BY P.ArtistName ORDER BY Sum(O.Qty) DESC ) As ProductRank
    From Product As P
        Join OrderItem As O
            On O.ProductId = P.ProductId
    Group By P.ProductId, P.ArtistName
    )
    , HighestProductForArtists As 
    (
    Select ProductId, ArtistName, Total
        , ROW_NUMBER OVER( ORDER BY Total DESC ) As TotalRank
    From ProductRanksForArtists
    Where ProductRank = 1
    )
Select ProductId, ArtistName, Total
From HighestProductForArtists
Where TotalRank <= 3
Thomas
still get back two of the same artists in the three records.
chopps
@chopps - Revised my query.
Thomas
+1  A: 

I don't know what you want to do if an Artist has two top-ranked products with identical sales--this will return two in case of a tie.

If you want to add another criteria, such as "most recent", you have to add that to both subqueries.

select top 3 sales_by_item.ProductID, 
             sales_by_item.Artist, 
             sales_by_item.Qty 
from 
(
    select * from product x
    inner join OrderItem y 
    on x.productid = y.productid
    group by productid, Artist
) sales_by_item
inner join
(
    select artist, max(qty) as maxqty
    from product x
    inner join OrderItem y 
    on x.productid = y.productid
    group by artist
) max_by_artist
on sales_by_item.artist = max_by_artist.artist
    and sales_by_item.qty = max_by_artist.maxqty
order by sales_by_item.qty

Edited to make subquery names more descriptive

egrunin
+2  A: 

Use this:

with summed_sales_of_each_product as 
(
    select p.artist_name, p.product_id, sum(i.qty) as total
    from product p join order_item i 
    on i.product_id = p.product_id
    group by p.artist_name, p.product_id
),
each_artist_top_selling_product as
(
    select x_in.artist_name, x_in.product_id, x_in.total 
    from summed_sales_of_each_product x_in where total = 
        (select max(x_out.total) 
            from summed_sales_of_each_product x_out 
            where x_out.artist_name = x_in.artist_name)
)
select top 3
artist_name, product_id, total
from each_artist_top_selling_product
order by total desc

But you cannot stop at that query, how about if there are two products on one artist that are ties on highest selling? This is how the data like this...

beatles  yesterday       1000
beatles  something       1000
elvis    jailbreak rock  800
nirvana  lithium         600
tomjones sexbomb         400

...will result to following using the above query:

beatles  yesterday       1000
beatles  something       1000
elvis    jailbreak rock  800

Which one to choose? yesterday or something? Since you cannot arbitrarily chose one over the other, you must list both. Also, what if the top 10 highest selling belongs to beatles and are ties, each with a quantity of 1000? Since that is the very best thing you are avoiding(i.e. reporting same artist on top 3), you have to amend the query so the top 3 report will look like this:

beatles  yesterday       1000
beatles  something       1000
elvis    jailbreak rock  800
nirvana  lithium         600

To Amend:

with summed_sales_of_each_product as 
(
    select p.artist_name, p.product_id, sum(i.qty) as total
    from product p join order_item i 
    on i.product_id = p.product_id
    group by p.artist_name, p.product_id
),
each_artist_top_selling_product as
(
    select x_in.artist_name, x_in.product_id, x_in.total 
    from summed_sales_of_each_product x_in 
    where x_in.total = 
        (select max(x_out.total) 
            from summed_sales_of_each_product x_out 
            where x_out.artist_name = x_in.artist_name)
),
top_3_total as
(    
    select distinct top 3 total 
    from each_artist_top_selling_product
    order by total desc
)
select artist_name, product_id, total 
from each_artist_top_selling_product
where total in (select total from top_3_total)
order by total desc

How about if the beatles has another product which has 900 qty? Will the above query still work? Yes, it will still work. Since the top_3 CTE only concerns itself from the already filtered top qty on each artist. So this source data...

beatles  yesterday       1000
beatles  something       1000
beatles  and i love her  900
elvis    jailbreak rock  800
nirvana  lithium         600
tomjones sexbomb         400

...will still result to following:

beatles  yesterday       1000
beatles  something       1000
elvis    jailbreak rock  800
nirvana  lithium         600
Michael Buen
Very nice Michael! Thanks everyone for the help on this.
chopps
A: 

Second attempt. I’m not in a position to test this code, and I’m not sure if I’ve got that “partition by” clause configured correctly. The idea is:

  • The inner query gets the sum of Qty for all product/artists, and uses the row_number() function to number them starting with the largest, and resets the ordering for each artist. (This can be done, but my syntax may be off.)
  • The outer query picks out the first (largest) item for each artist, and returns only the first three (ordere by Qty)
  • If an artists top two products tie for total Qty, I arbitrarily break the tie in favor of the “earliest” album.

(I try to avoid using "Top n", but it's late and I don't want to tackle another row_number() function.)

SELECT top 3 
  ProductId
  ,ArtistName
  ,Qty
 from (--  Products + Artists by total qty
       select
         pr.ProductId
        ,pr.ArtistName
        ,sum(oi.Qty) Qty
        ,row_number() over (partition by pr.ArtistName order by pr.ArtistName, sum(oi.Qty) desc, pr.ProductId) Ranking
       from Product pr
        inner join OrderItem oi
         on oi.ProductID = pr.ProductID
       group by pr.ProductId, pr.ArtistName) BestSellers
 where Ranking = 1
 group by ProductId, ArtistName) BestArtists
 order by Qty desc
Philip Kelley
No, hold on, this (v1) isn't going to work...
Philip Kelley
Ok, v2 looks better
Philip Kelley