Given a DataTable of single-day dates (dd/mm/yyyy), shown below on the left, what's the most elegant way to transform it into a collection of ranges, grouping consecutive days for each type (shown on the right)?

We can assume initial data is sorted by TypeID, then Date.

TypeID  | Date            ->   TypeID  | Start      | End
1       | 01/02/2010           1       | 01/02/2010 | 03/02/2010
1       | 02/02/2010           2       | 03/02/2010 | 04/02/2010
1       | 03/02/2010           2       | 06/02/2010 | 06/02/2010
2       | 03/02/2010
2       | 04/02/2010
2       | 06/02/2010

I'm not particularly up on my LINQ, but I was thinking that might be the way to go?

Any assistance would be greatly appreciated!

+2  A: 

I'm sure somebody will come up with a neat LINQ solution, but here's my old-style solution. It has its problems and won't cope with every combination of input (it assumes a non-empty list sorted by TypeId, then Date). It was cobbled together and solves the problem, but it isn't elegant in any way :-(

using System;
using System.Collections.Generic;
using System.Linq;

internal class SourceData
{
    public int TypeId { get; set; }
    public DateTime Date { get; set; }
}

internal class Result
{
    public int TypeId { get; set; }
    public DateTime StartDate { get; set; }
    public DateTime EndDate { get; set; }
}

class Program
{

    static void Main()
    {

        var a = new List<SourceData> {
            new SourceData {TypeId = 1, Date = new DateTime(2010, 02, 01)},
            new SourceData {TypeId = 1, Date = new DateTime(2010, 02, 02)},
            new SourceData {TypeId = 1, Date = new DateTime(2010, 02, 03)}, 
            new SourceData {TypeId = 2, Date = new DateTime(2010, 02, 03)}, 
            new SourceData {TypeId = 2, Date = new DateTime(2010, 02, 04)}, 
            new SourceData {TypeId = 2, Date = new DateTime(2010, 02, 06)} 
        };

        var results = new List<Result>();

        // Start the first range at the first row (assumes the list is non-empty
        // and sorted by TypeId, then Date).
        int currentTypeId = a[0].TypeId;
        DateTime rangeStartDate = a[0].Date;
        DateTime rangeEndDate = a[0].Date;

        for (int i = 1; i < a.Count; i++)
        {
            bool typeChanged = a[i].TypeId != currentTypeId;
            bool gapInDates = (a[i].Date - rangeEndDate).Days > 1;

            // Close the current range when the type changes or a day is skipped.
            // Handling both cases in one branch avoids adding a bogus range when
            // a new type also starts more than a day after the previous row.
            if (typeChanged || gapInDates)
            {
                results.Add(new Result { TypeId = currentTypeId, StartDate = rangeStartDate, EndDate = rangeEndDate });
                currentTypeId = a[i].TypeId;
                rangeStartDate = a[i].Date;
            }

            rangeEndDate = a[i].Date;
        }

        // Close the final range.
        results.Add(new Result { TypeId = currentTypeId, StartDate = rangeStartDate, EndDate = rangeEndDate });

        Console.WriteLine("Output\n");
        foreach (var r in results)
            Console.WriteLine( string.Format( "{0} - {1} - {2}",r.TypeId,r.StartDate.ToShortDateString(),r.EndDate.ToShortDateString()));

        Console.ReadLine();

    }
}

This gives the following output:

Output

1 - 01/02/2010 - 03/02/2010
2 - 03/02/2010 - 04/02/2010
2 - 06/02/2010 - 06/02/2010

Andy Robinson
This is almost the same approach as I'm currently using, it does the job, and it's readable, so don't put yourself down :)
Zeus
+2  A: 

NOTE: Previous answer removed.

EDIT: Try this revised one:

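using System;
using System.Collections.Generic;

// Extension methods must be declared in a static class; the class name here is arbitrary.
public static class EnumerableExtensions
{
    // Walks the sequence in adjacent pairs and merges each run of items for which
    // criteria(previous, next) holds into a single result:
    //   selector creates a result from the first pair of a run,
    //   modifier extends the current result with the next pair.
    public static IEnumerable<TAnonymous> Flatten<T, TAnonymous>(
        this IEnumerable<T> enumerable,
        Func<T, T, bool> criteria,
        Func<T, T, TAnonymous> selector,
        Func<TAnonymous, T, T, TAnonymous> modifier)
    {
        var list = new List<TAnonymous>();

        T last = default(T);
        bool first = true;
        bool created = false;   // true while a run is in progress

        TAnonymous current = default(TAnonymous);

        Action<T, T> action = (a, b) =>
        {
            if (criteria(a, b)) {
                if (created) {
                    // b continues the run in progress: extend it.
                    current = modifier(current, a, b);
                } else {
                    // a and b start a new run.
                    current = selector(a, b);
                    created = true;
                }
            } else {
                if (created) {
                    // b does not follow a: close the run in progress.
                    list.Add(current);
                    current = default(TAnonymous);
                    created = false;
                } else {
                    // a stands alone: emit it as a single-item result.
                    list.Add(selector(a, a));
                }
            }
        };

        foreach (T item in enumerable) {
            if (first) {
                first = false;
                last = item;
                continue;
            }

            action(last, item);
            last = item;
        }

        // Nothing to flush for an empty sequence; last would still be default(T).
        if (first)
            return list;

        // Pair the final item with itself so that a trailing single item gets its
        // own result (criteria is expected to hold for an item paired with itself).
        action(last, last);

        if (created)
            list.Add(current);

        return list;
    }
}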

Called as:

var filtered = list.Flatten(
    (r1, r2) => ((r2.Date - r1.Date).Days <= 1 && r1.TypeID == r2.TypeID),
    (r1, r2) => new { r1.TypeID, Start = r1.Date, End = r2.Date },
    (anon, r1, r2) => new { anon.TypeID, anon.Start, End = r2.Date });

This should hopefully work. This time the operation is broken into stages: first test each pair of adjacent items against a criterion, then either create a new item (selector) or update the previously created one (modifier).
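To make the three stages easier to see, here is a minimal sketch of the same idea specialised to the date problem, written as a plain iterator rather than a generic extension method. It is not the Flatten method above: the Row, DateRange and RangeSketch names are hypothetical, and it assumes the rows are already sorted by TypeID, then Date.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical row type standing in for one row of the DataTable.
public class Row
{
    public int TypeID { get; set; }
    public DateTime Date { get; set; }
}

// Hypothetical result type for one range of consecutive days.
public class DateRange
{
    public int TypeID { get; set; }
    public DateTime Start { get; set; }
    public DateTime End { get; set; }
}

public static class RangeSketch
{
    // Assumes rows are sorted by TypeID, then Date.
    public static IEnumerable<DateRange> ToRanges(IEnumerable<Row> rows)
    {
        DateRange current = null;

        foreach (var row in rows)
        {
            // Stage 1 (criterion): does this row continue the current range?
            bool continues = current != null
                && row.TypeID == current.TypeID
                && (row.Date - current.End).Days <= 1;

            if (continues)
            {
                // Stage 3 (modifier): extend the range in progress.
                current.End = row.Date;
            }
            else
            {
                // Emit the finished range, if any, then
                // stage 2 (selector): start a new range at this row.
                if (current != null)
                    yield return current;

                current = new DateRange { TypeID = row.TypeID, Start = row.Date, End = row.Date };
            }
        }

        // Emit the last range; an empty input yields nothing.
        if (current != null)
            yield return current;
    }
}
```

Calling RangeSketch.ToRanges on the six sample rows from the question yields three ranges: type 1 from 01/02/2010 to 03/02/2010, type 2 from 03/02/2010 to 04/02/2010, and type 2 for 06/02/2010 on its own. Comparing each row with the end of the current range, rather than with the previous row, avoids the need for the extra "pair the last item with itself" call.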

Matthew Abbott
Thanks for taking the time to write that out; it's rather interesting, especially given that I don't know much LINQ! Also, I think this is a slightly more complicated problem than it first appears, as you've said. Quite fun :)
Zeus
There's enough code in there to make your head spin! Seems like a procedural approach might be easier to maintain and understand.
Phil
I agree. For what he needs, I don't think LINQ is really suited.
Matthew Abbott
See updated answer
Matthew Abbott