tags:

views:

1522

answers:

3

I'm trying to chart the number of registrations per day in our registration system. I have an Attendee table in sql server that has a smalldatetime field A_DT, which is the date and time the person registered.

I started with this:

var dailyCountList =
    (from a in showDC.Attendee
    let justDate = new DateTime(a.A_DT.Year, a.A_DT.Month, a.A_DT.Day)
    group a by justDate into DateGroup
    orderby DateGroup.Key
    select new RegistrationCount
    {
     EventDateTime = DateGroup.Key,
     Count = DateGroup.Count()
    }).ToList();

That works great, but it won't include the dates where there were no registrations, because there are no attendee records for those dates. I want every date to be included, and when there is no data for a given date, the count should just be zero.

So this is my current working solution, but I KNOW THAT IT IS TERRIBLE. I added the following to the code above:

// Create a new list of data ranging from the beginning to the end of the first list, specifying 0 counts for missing data points (days with no registrations)
var allDates = new List<RegistrationCount>();
for (DateTime date = (from dcl in dailyCountList select dcl).First().EventDateTime; date <= (from dcl in dailyCountList select dcl).Last().EventDateTime; date = date.AddDays(1))
{
    DateTime thisDate = date; // lexical closure issue - see: http://www.managed-world.com/2008/06/13/LambdasKnowYourClosures.aspx
    allDates.Add(new RegistrationCount
    {
     EventDateTime = date,
     Count = (from dclInner in dailyCountList
     where dclInner.EventDateTime == thisDate
     select dclInner).DefaultIfEmpty(new RegistrationCount
     {
      EventDateTime = date,
      Count = 0
     }).Single().Count
    });
}

So I created ANOTHER list, and loop through a sequence of dates I generate based on the first and last registrations in the query, and for each item in the sequence of dates, I QUERY the results of my first QUERY for the information regarding the given date, and supply a default if nothing comes back. So I end up doing a subquery here and I want to avoid this.

Can anyone thing of an elegant solution? Or at least one that is less embarrassing?

+1  A: 

O(n) with 2 enumerations. It's very good to pull the items into memory before trying this. Database has enough to do without thinking about this stuff.

  if (!dailyCountList.Any())
      return;

  //make a dictionary to provide O(1) lookups for later

  Dictionary<DateTime, RegistrationCount> lookup = dailyCountList.ToDictionary(r => r.EventDateTime);

  DateTime minDate = dailyCountList[0].EventDateTime;
  DateTime maxDate = dailyCountList[dailyCountList.Count - 1].EventDateTime;

  int DayCount = 1 + (int) (maxDate - minDate).TotalDays;

  // I have the days now.
  IEnumerable<DateTime> allDates = Enumerable
    .Range(0, DayCount)
    .Select(x => minDate.AddDays(x));

  //project the days into RegistrationCounts, making up the missing ones.
  List<RegistrationCount> result = allDates
      .Select(d => lookup.ContainsKey(d) ? lookup[d] :
          new RegistrationCount(){EventDateTime = d, Count = 0})
      .ToList();
David B
This works and it seems like a very good approach. For some reason, though, it takes longer to execute in LINQPad (.2 sec as opposed to .15 sec with my version). I'm going to assume that this dictionary version is overall better, and there's just a little overhead creating the dictionary and range.
Chris
I didn't notice the first time I read the question, but dailyCountList is ordered by EventDate. This makes finding the min/max dates trivial (enumeration is not required).
David B
Also, allDates did not need to be Listified before being used. Editted to leave it as a query.
David B
A: 

Does this syntax for left outer joins no longer work as well after SP1, then?

Usually, you should able to do the following, but you'd need a calendar table of sorts in your SQL database joined to your date key in the registrations table (w/a foreign key on the date id field), and then try:

var query =
    from cal in dataContext.Calendar
    from reg in cal.Registrations.DefaultIfEmpty()
    select new
    {
        cal.DateID,
        reg.Something
    };
Codewerks
A: 

The problem is that you have no range of dates without executing your query. So, you can either pick a date range, run a SELECT MAX and SELECT MIN against your DB, or execute your query and then add the missing dates.

var allDailyCountList =
   from d in Range(dc[0].EventDateTime, dc[dc.Count - 1].EventDateTime) 
   // since you already ordered by DateTime, we don't have to search the entire List
   join dc in dailyCountList on
      d equals dc.EventDateTime
   into rcGroup
   from rc in rcGroup.DefaultIfEmpty(
      new RegistrationCount()
      {
         EventDateTime = d,
         Count = 0
      }
   ) // gives us a left join
   select rc;

public static IEnumerable<DateTime> Range(DateTime start, DateTime end) {
   for (DateTime date = start, date <= end; date = date.AddDays(1)) {
      yield return date;
   }
}
Mark Brackett