tags:

views:

52

answers:

4

I have a seemingly simple requirement, but i can't figure out how to write it as a query that only has one round trip to the server.

Basically i have a simple table

CREATE TABLE Item
(
    id int not null identity(1,1),
    create datetime not null,
    close datetime --null means not closed yet
);

and what i want to do is over a range of time (say 1/1/2010 to 6/1/2010), for each month i need the number of items that were active in that month. an item is active if it was created either during or before that month and is either not closed (i.e. closed is null) or was closed after that month. So i translated that into a linq expression using a helper method:

//just returns the first day of every month inbetween min and max (inclusive)
private IEnumerable<DateTime> EnumerateMonths(DateTime Min, DateTime Max)
{
    var curr = new DateTime(Min.Year, Min.Month, 1);
    var Stop = new DateTime(Max.Year, Max.Month, 1).AddMonths(Max.Day == 1 ? 0 : 1);
    while(curr < Stop)
    {
        yield return curr;
        curr = curr.AddMonths(1);
    }
}

public List<DataPoint> GetBacklogByMonth(DateTime min, DateTime max)
{
    return EnumerateMonths(min, max)
        .Select(m => new DataPoint
                        {
                            Date = m,
                            Count = DB.Items.Where(s => s.Create <= m.AddMonths(1) && (!s.Close.HasValue || s.Close.Value >= m.AddMonths(1)))
                                    .Count()
                            }
            ).ToList();
}

which works perfectly, except each Count is a separate query so its super slow (a round trip for each month), so my question is how could i restructure this query to do this in one round trip to the server.

Initially i thought about doing some sort of group by so aggregate by month, but because each item could be 'active' in many different months i don't think that would work.

Any suggestions?

A: 

Just pull your items first and then roll over your months using the in-memory collection. I'm not sure I've got your criteria right for the db query, but it would basically be:

var items = Db.Items.Where(s => s.Create <= min 
    && (!s.Close.HasValue || s.Close.Value >= max)).ToList();

return EnumerateMonths(min, max).Select(m => new DataPoint
    {
        Date = m,
        Count = items.Where(s => s.Create <= m.AddMonths(1) && (!s.Close.HasValue || s.Close.Value >= m.AddMonths(1))).Count()
    }).ToList();
Jay
this should work but its possible that it could pull down several thousand rows, this may be too much network traffic. ill try it out and see if it is acceptable.
luke
If the rows are large, you can `Select()` just the `Create` and `Close` columns. Another alternative, if you are using MS SQL Sever, is to create a CLR function that will do the work on the server side. You can then add this function as a method in your LINQ-to-SQL mapping. http://msdn.microsoft.com/en-us/library/ms189876.aspx
Jay
A: 

One of the 101 LINQ samples is a nested doing year and then month

http://msdn.microsoft.com/en-us/vcsharp/aa336754.aspx#nested

James Manning
A: 

I would go with what Jay says. I had a similar situation. Doing your sorting/query in-memory would work faster than hitting DB multiple times.

If you know ahead of time that you are only going to do a read, set your objectContext.Table to MergeOption.NoTracking and iterate using a foreach loop.

If you still need tracking,detach the object from the dataContext after you use it

var results = from t in DBContext.table select t where t.criteria=your select criteria
foreach (var result in results)
{
  DoSomething(result);
  DbContext.Detach(result);
}

Alternatively, if you are using no tracking, you do not need to detach your objects

ram
A: 

I hate to answer my own question but here is what i did.

What i really needed to do all along was a left join with a table of months then do a group and a count on the number of items for each month. a normal grouping on month wouldn't work because then items would only get counted in a single month not all the ones they were active for. So I added a table Months containing just dates of the first of the month and did a left join on it. This operation needs to be done often enough that i figured it was worth adding a table for it.

heres the final query:

        var joins = from m in DB.Months
                    from s in DB.Items
                    let nm = m.month.AddMonths(1)
                    where s.Create < nm && (!s.Close.HasValue || s.Close.Value >= nm) && m.month >= min && m.month <= max
                    select new { d = m.month, id = s.ID };
        var counts = from j in joins
                     group j by j.d into g
                     select new DataPoint { Date = g.Key, Count = g.Key > DateTime.Now ? 0 : g.Count() };

I also added some code to make sure that months has the correct rows in it for my query.

luke