views:

97

answers:

3

I have data that spans multiple months and I want to be able to take the average per day and separate it to the appropriate months. For example, say that one data point is 2/9/2010 - 3/8/2010 and the amount is 1500. Then, the query should return 1071.4 for February 2010 and 428.6 for March. I am hoping there is a MySQL statement that will be able to do the computations instead of my PHP logic. Thanks.

EDIT (added table definition): start (datetime), end (datetime), use

EDIT 2: Here is some dummy data



DROP TABLE IF EXISTS `dummy_data`;
CREATE TABLE `dummy_data` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `start_date` date NOT NULL,
  `end_date` date NOT NULL,
  `data` double(15,4) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=8 DEFAULT CHARSET=latin1;

-- ----------------------------
--  Records of `dummy_data`
-- ----------------------------
BEGIN;
INSERT INTO `dummy_data` VALUES ('1', '2010-01-01', '2010-02-02', '200.0000'), ('2', '2010-02-03', '2010-02-25', '250.0000'), ('3', '2010-02-26', '2010-03-08', '300.0000'), ('4', '2010-03-09', '2010-04-12', '210.0000'), ('5', '2010-04-13', '2010-05-10', '260.0000'), ('6', '2010-05-11', '2010-06-15', '310.0000'), ('7', '2010-06-16', '2010-07-20', '320.0000');
COMMIT;
+1  A: 

You should select SUM and group by the MONTH(date) function as follows:

SELECT SUM(value), MONTH(date)
FROM TABLE
GROUP BY MONTH(date)

EDIT: Ooops, I misread the question, revising my answer now!

You'll need to use some more complex TSQL to get the days in the months, find the average and set into each field then present it in a monthly format

update Create a function to give you days in your range such as this one, which I modified from a function created by Michael Baria

CREATE FUNCTION [dbo].[GetDays](@StartDate DATETIME, @EndDate DATETIME)
RETURNS @MonthList TABLE(DayValue tinyint NOT NULL, MonthValue tinyint NOT NULL, YearValue int NOT NULL)
AS
BEGIN
--Variable used to hold each new date value
DECLARE @DateValue DATETIME

--Start with the starting date in the range
SET @DateValue=@StartDate

--Load output table with the month part of each new date
WHILE @DateValue <= @EndDate
BEGIN
    INSERT INTO @MonthList(DayValue, MonthValue,YearValue)
    SELECT DAY(@DateValue), MONTH(@DateValue), YEAR(@DateValue)

    --Move to the next day           
    SET @DateValue=DATEADD(dd,1,@DateValue)
END

--Return results
RETURN
END

GO

Join your table to this function then sum based on the days

SELECT SUM(data/DATEDIFF(dd,startDate,endDate)), M.MonthValue
FROM TABLE
JOIN (SELECT * FROM [dbo].[GetDays] (startDate,endDate)) M 

GROUP BY M.MonthValue

I can clean this up a bit if I get some sample data

Matthew PK
Matthew, I have added some sample data
Kramer
Thank you. My answer should be a valid solution. Let me know if it requires further clarification
Matthew PK
The question concerns MySQL, and this answer appears to be specific to MS-SQL.
pilcrow
Matthew, are you able to convert your solution to MySQL?
Kramer
A: 

This does separate the amount over the months:

declare @start datetime set @start = '20100209'
declare @end datetime set @end = '20100308'
declare @avg float set @avg = 1500

select
datediff(day, @start, dateadd(day, 1-day(@end), @end)) * @avg / (datediff(day, @start, @end) + 1),
datediff(day, dateadd(day, -day(@end), @end), @end) * @avg / (datediff(day, @start, @end) + 1)

Result:

1071,42857142857    428,571428571429

It gets a bit more complicated though, as you first have to check if the dates are actually in different months, and if the date spans more than two months you need a different approach.

Guffa
+1  A: 

This solution handles [start_date, end_date] spans of as small as one (1) day and as large as twelve (12) months, but is incorrect on a span of thirteen (13) or more months:

CREATE TABLE integers (i INT NOT NULL);

INSERT INTO integers VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9);

CREATE VIEW hundreds AS
   SELECT iii.i * 100 + ii.i * 10 + i AS i
     FROM integers i JOIN integers ii JOIN integers iii;

-- We do not have CTEs, so we create a view
CREATE VIEW spans AS
   SELECT id, start_date, DATEDIFF(end_date, start_date) + 1 AS ndays, data
     FROM dummy_data;

   SELECT spans.id,
          month_name,
          spans.data * COUNT(month_name)/spans.ndays AS month_amount
     FROM spans
LEFT JOIN (SELECT id,
                  MONTH(start_date + INTERVAL i DAY) AS month_num,
                  MONTHNAME(start_date + INTERVAL i DAY) AS month_name
             FROM spans
             JOIN hundreds WHERE i < ndays) daybyday
       ON spans.id = daybyday.id
 GROUP BY id, month_name
 ORDER BY id, month_num;

Output looks like this:

+----+------------+---------------+
| id | month_name | month_amount  |
+----+------------+---------------+
|  1 | January    |  187.87878788 | 
|  1 | February   |   12.12121212 | 
|  2 | February   |  250.00000000 | 
|  3 | February   |   81.81818182 | 
...

We use DATEDIFF to determine the number of days represented by a source record. Then, building off an integers table, we can enumerate the month of each day in a particular span. From there it's a matter of SQL aggregation by record id and month_name.

pilcrow
For some reason, my output is 1 Janurary 60.6,2 Feburary 108.69I cannot get it to replicate your output.Also, I need to be able to span over 13 or more months. Is there a fix to your solution that would allow this?
Kramer
@Kramer, your output problem is because your "integers" table has only values 0 - 9, and I am relying on having 0 - 999, which, I now realize, is not suggested by the link I provided. I'll clarify.
pilcrow
@Kramer, to handle spans of over a year you'd add a `YEAR(start_date + INTERVAL ...)` to the innermost subselect, and GROUP BY and ORDER BY that as well...
pilcrow