Hi everyone, I'm having an odd problem.

I have a table with the columns product_id, sales and day.

Not all products have sales every day. I'd like to get the average number of sales that each product had in the last 10 days where it had sales.

Usually I'd get the average like this

SELECT product_id, AVG(sales) 
FROM table 
GROUP BY product_id

Is there a way to limit the number of rows taken into consideration for each product?

I'm afraid it's not possible, but I wanted to check if someone has an idea.

Update to clarify:

A product may be sold on days 1, 3, 5, 10, 15, 17 and 20. I don't want the average over all days, only the average over the days on which the product actually sold, so doing something like

SELECT product_id, AVG(sales) 
FROM table 
WHERE day > '01/01/2009' 
GROUP BY product_id

won't work

A: 

Give this a whirl. The sub-query selects the last ten days of a product where there was a sale, the outer query does the aggregation.

SELECT t1.product_id, SUM(t1.sales) / COUNT(*) 
FROM table t1
   INNER JOIN (
               SELECT TOP 10 day, Product_ID
               FROM table t2
               WHERE (t2.product_ID=t1.Product_ID)
               ORDER BY DAY DESC 
               ) top_days   -- the derived table needs its own alias
   ON (top_days.day=t1.day) 
GROUP BY t1.product_id

BTW: This approach uses a correlated subquery, which may not be very performant, but it should work in theory.

JohnFx
Not quite... It has to be the last 10 days in each group in which sales occurred.
Robert Harvey
That would of course work if the products were sold continuously, but a product might be sold only on days 1, 3, 4, 5, 10, 13, 20. If I do something like where day > '9/7/2009', I'm not gonna get the last 10 days where it did have sales, but just the last 10 days in general. I hope that clarifies it a bit.
MarcS
Oh, okay. I see now. Let me see if I can rework the query. It is definitely doable. I'll update my answer in a few minutes. One more question first: is it possible to have two rows for a product on the same day in your table?
JohnFx
Thanks for your help, JohnFx. No, that is not possible; for any given day, there will only be one row per product.
MarcS
Modified the code to match the clarified requirement.
JohnFx
I'm not sure many servers support inner joining dependent subqueries (i.e. ones that reference the outer query to filter). Because the subquery is dependent, you have to use `APPLY`.
Remus Rusanu
Thank you, that should work. The performance is gonna suck, of course, but I doubt that there is another way.
MarcS
@Remus: He didn't specify the DB he was using. I'll adjust as necessary if he qualifies.
JohnFx
I'm stuck with MySQL, actually. I guess I can use your approach and modify it so it'll work.
MarcS
@Marc: I didn't know MySQL supports that actually.
Remus Rusanu
A: 

I'm not sure I understood correctly, but if you'd like to get the average sales over the last 10 days for your products, you can do as follows:

SELECT Product_Id, Sum(Sales)/Count(*)
FROM (SELECT Product_Id, Sales FROM Table WHERE SaleDate >= @Date) t
GROUP BY Product_Id HAVING Count(*) > 0

Or you can use the AVG aggregate function, which is easier:

SELECT Product_Id, AVG(Sales)
FROM (SELECT Product_Id, Sales FROM Table WHERE SaleDate >= @Date) t
GROUP BY Product_Id

Updated

Now I get what you meant. As far as I know, it is not possible to do this in one query. It could be possible if we could do something like this (Northwind database):

select a.CustomerId,count(a.OrderId) 
from Orders a INNER JOIN(SELECT CustomerId,OrderDate FROM Orders Order By OrderDate) AS b ON a.CustomerId=b.CustomerId GROUP BY a.CustomerId Having count(a.OrderId)<10

but you can't use ORDER BY in subqueries unless you use TOP, which is not suitable for this case. But maybe you can do it as follows:

SELECT ProductId,Sales INTO #temp FROM table Order By Day

    select a.ProductId,Sum(a.Sales) /Count(a.Sales)
    from table a INNER JOIN #temp AS b ON a.ProductId=b.ProductId GROUP BY a.ProductId Having count(a.Sales)<=10
Beatles1692
I didn't know you were using MySQL. #temp is the notation for temporary tables in SQL Server. MySQL has temp tables too, but I don't know the syntax.
Beatles1692
+1  A: 

If you want the last 10 calendar days up to each product's most recent sale:

SELECT t.product_id, AVG(t.sales)
FROM table t
JOIN (
   SELECT product_id, MAX(sales_date) as max_sales_date
   FROM table
   GROUP BY product_id
) t_max ON t.product_id = t_max.product_id 
  AND  DATEDIFF(day, t.sales_date, t_max.max_sales_date) < 10
GROUP BY t.product_id;

The DATEDIFF call is SQL Server specific; you'd have to replace it with your server's syntax for date difference functions.
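
As one example of swapping the date-difference function, here is a rough sketch of the same calendar-day query on SQLite, run through Python's built-in sqlite3 module. The table name sales_log, the sample rows and the ISO-formatted text dates are assumptions made up for illustration; julianday() is SQLite's way of turning a date into a number of days.

```python
import sqlite3

# Hypothetical table and data, only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_log (product_id INTEGER, sales INTEGER, sales_date TEXT)")
# Product 1 sells every second day (Jan 2, 4, ..., 24) with sales 1..12;
# product 2 sells on only three days.
rows = [(1, n, f"2009-01-{2 * n:02d}") for n in range(1, 13)]
rows += [(2, 10, "2009-01-05"), (2, 20, "2009-01-15"), (2, 30, "2009-01-20")]
conn.executemany("INSERT INTO sales_log VALUES (?, ?, ?)", rows)

query = """
SELECT t.product_id, AVG(t.sales) AS avg_sales
FROM sales_log AS t
JOIN (
    SELECT product_id, MAX(sales_date) AS max_sales_date
    FROM sales_log
    GROUP BY product_id
) AS t_max
  ON t.product_id = t_max.product_id
 -- julianday() difference stands in for SQL Server's DATEDIFF(day, ...)
 AND julianday(t_max.max_sales_date) - julianday(t.sales_date) < 10
GROUP BY t.product_id
ORDER BY t.product_id
"""
calendar_result = conn.execute(query).fetchall()
print(calendar_result)
```

With this data, product 1 keeps only the sale days from Jan 16 to Jan 24 and product 2 keeps Jan 15 and Jan 20, so this variant averages over a calendar window, not over a fixed number of sale days.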

To get the last 10 days when the product had any sale:

SELECT product_id, AVG(sales)
FROM (
    SELECT product_id, sales, DENSE_RANK() OVER 
           (PARTITION BY product_id ORDER BY sales_date DESC) AS rn
    FROM Table
) As t_rn
WHERE rn <= 10
GROUP BY product_id;

This assumes sales_date is a date, not a datetime. You'd have to extract the date part if the field is a datetime.

And finally, a version without window functions:
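
To try the ranking query without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. The table name sales_log, the sample rows and the ISO-formatted text dates are made up for illustration, and it assumes the bundled SQLite is version 3.25 or newer, since older versions have no window functions.

```python
import sqlite3

# Hypothetical table and data, only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_log (product_id INTEGER, sales INTEGER, day TEXT)")
# Product 1 has 12 sale days (sales 1..12); product 2 has only 3.
rows = [(1, n, f"2009-01-{2 * n:02d}") for n in range(1, 13)]
rows += [(2, 10, "2009-01-05"), (2, 20, "2009-01-15"), (2, 30, "2009-01-20")]
conn.executemany("INSERT INTO sales_log VALUES (?, ?, ?)", rows)

query = """
SELECT product_id, AVG(sales) AS avg_sales
FROM (
    SELECT product_id, sales,
           -- rank each product's sale days, newest first
           DENSE_RANK() OVER (PARTITION BY product_id ORDER BY day DESC) AS rn
    FROM sales_log
) AS ranked
WHERE rn <= 10
GROUP BY product_id
ORDER BY product_id
"""
ranked_result = conn.execute(query).fetchall()
print(ranked_result)
```

Product 1 drops its two oldest sale days and averages the remaining ten; product 2 has fewer than ten sale days, so all of its rows are used.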

SELECT product_id, AVG(sales)
FROM Table t
WHERE sales_date IN (
 SELECT TOP(10) sales_date 
 FROM Table s
 WHERE t.product_id = s.product_id
 ORDER BY sales_date DESC)
GROUP BY product_id;

Again, sales_date is assumed to be a date, not a datetime. Use other limiting syntax if TOP is not supported by your server.
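
As an illustration of "other limiting syntax", this sketch rewrites the TOP query with LIMIT and runs it on SQLite through Python's sqlite3 module. The table name sales_log and the sample rows are hypothetical. It relies on the engine accepting LIMIT inside a correlated IN subquery, which SQLite does; as far as I know MySQL has historically rejected that form, so it may not carry over there unchanged.

```python
import sqlite3

# Hypothetical table and data, only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_log (product_id INTEGER, sales INTEGER, day TEXT)")
# Product 1 has 12 sale days (sales 1..12); product 2 has only 3.
rows = [(1, n, f"2009-01-{2 * n:02d}") for n in range(1, 13)]
rows += [(2, 10, "2009-01-05"), (2, 20, "2009-01-15"), (2, 30, "2009-01-20")]
conn.executemany("INSERT INTO sales_log VALUES (?, ?, ?)", rows)

query = """
SELECT t.product_id, AVG(t.sales) AS avg_sales
FROM sales_log AS t
WHERE t.day IN (
    -- the ten most recent sale days of the current product
    SELECT s.day
    FROM sales_log AS s
    WHERE s.product_id = t.product_id
    ORDER BY s.day DESC
    LIMIT 10
)
GROUP BY t.product_id
ORDER BY t.product_id
"""
limit_result = conn.execute(query).fetchall()
print(limit_result)
```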

Remus Rusanu
A: 

If this is a table of sales transactions, then there should not be any rows in there for days on which there were no Sales. I.e., If ProductId 21 had no sales on 1 June, then this table should not have any rows with productId = 21 and day = '1 June'... Therefore you should not have to filter anything out - there should not be anything to filter out

Select ProductId, Avg(Sales) AvgSales
From Table 
Group By ProductId

should work fine. So if it's not, then you have not explained the problem completely or accurately.

Also, in your question, you show Avg(Sales) in the example SQL query, but then in the text you mention "average number of sales that each product ... ". Do you want the average sales amount, or the average count of sales transactions? And do you want this average by product alone (i.e., one output value reported for each product), or do you want the average per product per day?

If you want the average per product alone, do you want it for just those sales in the ten days prior to now, or in the ten days prior to the date of the last sale for each product? If the latter, then

Select ProductId, Avg(Sales) AvgSales
From Table T
Where day > (Select Max(Day) - 10
             From Table
             Where ProductId = T.ProductID)
Group By ProductId

If you want the average per product alone, for just the sales on the ten most recent days on which each product had sales, then

Select ProductId, Avg(Sales) AvgSales
From Table T
Where (Select Count(Distinct day) From Table
       Where ProductId = T.ProductID
          And Day > T.Day) < 10
Group By ProductId
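
The counting approach needs neither TOP, LIMIT nor window functions, so it is probably the easiest one to port. Here is a small sketch of it on SQLite via Python's sqlite3 module; the table name sales_log and the sample rows are made up for illustration. The comparison used here is `< 10`: a row is kept when fewer than ten distinct sale days come after it, which leaves exactly the ten most recent sale days per product.

```python
import sqlite3

# Hypothetical table and data, only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_log (product_id INTEGER, sales INTEGER, day TEXT)")
# Product 1 has 12 sale days (sales 1..12); product 2 has only 3.
rows = [(1, n, f"2009-01-{2 * n:02d}") for n in range(1, 13)]
rows += [(2, 10, "2009-01-05"), (2, 20, "2009-01-15"), (2, 30, "2009-01-20")]
conn.executemany("INSERT INTO sales_log VALUES (?, ?, ?)", rows)

query = """
SELECT t.product_id, AVG(t.sales) AS avg_sales
FROM sales_log AS t
WHERE (
    -- how many later sale days does this product have?
    SELECT COUNT(DISTINCT s.day)
    FROM sales_log AS s
    WHERE s.product_id = t.product_id
      AND s.day > t.day
) < 10
GROUP BY t.product_id
ORDER BY t.product_id
"""
count_result = conn.execute(query).fetchall()
print(count_result)
```

The subquery runs once per row, so on a large table an index on (product_id, day) would likely matter a lot.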
Charles Bretana