Related to a question I asked earlier here, I've run into a problem whose solution is eluding me (obviously).
The original question was how to select a min and max date from a daily table based on a monthly table where some daily table dates could be missing. Basically what I needed was columns containing the month date (always the first), the earliest date for that month in the daily table and the latest date for that month in the daily table.
So, if the last week of January and first week of February were missing from the daily table (and we otherwise had all the dates for January and February but no more), I needed:
MonthStart DayFirst DayLast
---------- ---------- ----------
2009-01-01 2009-01-01 2009-01-24
2009-02-01 2009-02-08 2009-02-28
The answer was:
select
m.date as m1,
min(d.date) as m2,
max(d.date) as m3
from monthly m
join daily d
on month(d.date) = month(m.date)
and year(d.date) = year(m.date)
group by m.date
order by m.date
which worked for the specs I gave.
Unfortunately, reality bites, and there are multiple records in the monthly table (and daily table) with the same date. Specifically:
- the dates are 2007-10-16 thru 2007-10-30 (15 days), 2007-11-01 thru 2007-11-30 (30 days) and 2007-12-01 thru 2007-12-15 (15 days).
- each date has six rows in both tables (because they each have a row for each combination of three system names and two periods).
The problem is that I sum() a field in the monthly table, and the new query is getting values that are much too large (compared to the previous query, which did not have the join).
The aggregation changes the query to be:
select
m.date as m1,
sum(m.other_field), -- added this
min(d.date) as m2,
max(d.date) as m3
from monthly m
join daily d
on month(d.date) = month(m.date)
and year(d.date) = year(m.date)
group by m.date
order by m.date
I think the values are too high because the join repeats each monthly row once for every daily row in the same month: the figures for each month are out by a constant factor that depends on the number of rows in the daily table for that month.
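One way to check that suspicion is to count how many daily rows fall in each calendar month, since the join pairs every monthly row with all of them. This is only a sketch: it reuses the `daily` table and `date` column from the queries above, and the aliases `yr`, `mon` and `daily_rows` are made up for illustration.

```sql
-- Count the daily rows in each calendar month.
-- With the join above, each monthly row is matched to this many daily rows,
-- so sum(m.other_field) would be inflated by the same factor.
select
    year(d.date)  as yr,
    month(d.date) as mon,
    count(*)      as daily_rows
from daily d
group by year(d.date), month(d.date)
order by yr, mon
```

If the join is the cause, the factor by which each month's sum is out should equal `daily_rows` for that month. With six rows per date, that would be 90 for October and December 2007 (15 days each) and 180 for November 2007 (30 days).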
My question is this: how do I aggregate the field in the monthly table without that factor coming into play and still get the min/max dates from the daily table for that month?