I have a seemingly simple requirement, but I can't figure out how to write it as a query that needs only one round trip to the server.
Basically, I have a simple table:
CREATE TABLE Item
(
    id int not null identity(1,1),
    [create] datetime not null, -- bracketed: CREATE is a reserved word in T-SQL
    [close] datetime            -- bracketed: CLOSE is reserved too; null means not closed yet
);
What I want to do is, over a range of time (say 1/1/2010 to 6/1/2010), get for each month the number of items that were active in that month. An item is active in a month if it was created during or before that month and is either not closed (i.e. close is null) or was closed after that month. I translated that into a LINQ expression using a helper method:
// Returns the first day of every month from min's month up to max's month.
// max's month is included unless max falls on the 1st of its month.
private IEnumerable<DateTime> EnumerateMonths(DateTime min, DateTime max)
{
    var curr = new DateTime(min.Year, min.Month, 1);
    var stop = new DateTime(max.Year, max.Month, 1).AddMonths(max.Day == 1 ? 0 : 1);
    while (curr < stop)
    {
        yield return curr;
        curr = curr.AddMonths(1);
    }
}
public List<DataPoint> GetBacklogByMonth(DateTime min, DateTime max)
{
    return EnumerateMonths(min, max)
        .Select(m => new DataPoint
        {
            Date = m,
            // Active in month m: created before the next month starts, and
            // either still open or closed on/after the start of the next month.
            Count = DB.Items
                .Where(s => s.Create < m.AddMonths(1)
                    && (!s.Close.HasValue || s.Close.Value >= m.AddMonths(1)))
                .Count()
        })
        .ToList();
}
This works perfectly, except that each Count is a separate query, so it's super slow (one round trip per month). So my question is: how can I restructure this query so that it runs in one round trip to the server?
Initially I thought about doing some sort of GROUP BY to aggregate by month, but because each item can be 'active' in many different months, I don't think that would work.
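To illustrate why a plain grouping doesn't seem to fit, here is a minimal in-memory sketch of the "active" rule described above. The `BacklogSketch` class, the `IsActiveIn` helper and the two sample items are made up for illustration and are not part of my real code; it runs against a plain list, not the database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BacklogSketch
{
    // Hypothetical in-memory stand-in for a row of the Item table.
    public sealed class Item
    {
        public int Id { get; set; }
        public DateTime Create { get; set; }
        public DateTime? Close { get; set; }
    }

    // The rule from the prose: created before the month ends, and either
    // still open or closed on/after the start of the following month.
    public static bool IsActiveIn(Item item, DateTime monthStart)
    {
        var nextMonth = monthStart.AddMonths(1);
        return item.Create < nextMonth
            && (!item.Close.HasValue || item.Close.Value >= nextMonth);
    }

    public static void Main()
    {
        // Made-up sample data, for illustration only.
        var items = new List<Item>
        {
            new Item { Id = 1, Create = new DateTime(2010, 1, 15), Close = new DateTime(2010, 4, 10) },
            new Item { Id = 2, Create = new DateTime(2010, 3, 2), Close = null },
        };

        // First day of January through May 2010.
        var months = Enumerable.Range(0, 5)
            .Select(i => new DateTime(2010, 1, 1).AddMonths(i));

        foreach (var m in months)
        {
            var activeIds = items.Where(i => IsActiveIn(i, m)).Select(i => i.Id);
            Console.WriteLine($"{m:yyyy-MM}: {string.Join(", ", activeIds)}");
        }
    }
}
```

With this sample data, item 1 is listed under January, February and March, so grouping items by a single month column (for example the creation month) would count it only once instead of three times.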
Any suggestions?