I'm trying to convert some SQL queries into Linq to avoid multiple trips to the database.

The old SQL I'm trying to convert does:

SELECT
    AVG(CAST(DATEDIFF(ms, A.CreatedDate, B.CompletedDate) AS decimal(15,4))),
    AVG(CAST(DATEDIFF(ms, B.CreatedDate, B.CompletedDate) AS decimal(15,4)))
FROM
    dbo.A
INNER JOIN
    dbo.B ON B.ParentId = A.Id

So I've created two C# classes:

class B
{
    public Guid Id { get; set; }
    public DateTime CreatedDate { get; set; }
    public DateTime CompletedDate { get; set; }
}

class A
{
    public Guid Id { get; set; }
    public DateTime CreatedDate { get; set; }
    public List<B> BList { get; set; }
}

And I've got a List<A> object that I want to query. It's populated from the database, so each A in the list has a sub-list with a load of Bs. I want to use Linq-to-objects to query this list.

So I need to use Linq to get the average time between an A's start and the completion of its child Bs, and the average time between each B's start and completion. I didn't write the original SQL so I'm not entirely sure it does what it's supposed to!

I've got several of these averages to calculate, so I'd like to do them all inside one magical Linq query. Is this possible?

+1  A: 

Let's start by getting the average time between a B's start and finish:

1) You first need the time difference between the start and finish of each B object, so you would do this:

// objectA is the A instance whose child Bs are being measured
List<TimeSpan> timeSpans = new List<TimeSpan>();
foreach (B myB in objectA.BList)
{
  TimeSpan myTimeSpan = myB.CompletedDate.Subtract(myB.CreatedDate);
  timeSpans.Add(myTimeSpan);
}

2) You now need to get the average of this. You can do this by using lambda expressions:

// This will give you the average time it took to complete a task, in minutes
double averageTimeOfB = timeSpans.Average(timeSpan => timeSpan.TotalMinutes);

You can now do the same thing to get the average time between A's start and B's finish as follows:

1) Get the time difference between A's start date and each B's completion date:

List<TimeSpan> timeSpans_2 = new List<TimeSpan>();
foreach (B myB in objectA.BList)
{
  TimeSpan myTimeSpan = myB.CompletedDate.Subtract(objectA.CreatedDate);
  timeSpans_2.Add(myTimeSpan);
}

2) Get the average of these time differences exactly like the one above:

// This will give you the average time, in minutes, from A's creation to each B's completion
double averageTimeOfAandB = timeSpans_2.Average(timeSpan => timeSpan.TotalMinutes);

And you're done. I hope that this helps. Please note that this code has not been tested.

EDIT: Remember that LINQ uses lambda expressions but is mostly syntactic sugar (I don't know much about the implementation, so I hope you don't hold it against me). You could also wrap the implementations I provide in methods so that you can call them multiple times for a List of A objects.
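As a rough sketch of that "wrap it in methods" idea, the two loops above could be collapsed into helper methods that call Average directly on the child list. The class name AverageHelpers and both method names are made up for illustration. The A and B classes are copied from the question so the snippet stands alone. Like the code above, this is a sketch rather than a tested implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class B
{
    public Guid Id { get; set; }
    public DateTime CreatedDate { get; set; }
    public DateTime CompletedDate { get; set; }
}

class A
{
    public Guid Id { get; set; }
    public DateTime CreatedDate { get; set; }
    public List<B> BList { get; set; }
}

// Hypothetical helper class; the names are illustrative only.
static class AverageHelpers
{
    // Average minutes between each B's creation and its completion.
    // Enumerable.Average throws InvalidOperationException if BList is empty.
    public static double AverageMinutesPerB(A objectA)
    {
        return objectA.BList.Average(
            b => (b.CompletedDate - b.CreatedDate).TotalMinutes);
    }

    // Average minutes between A's creation and each child B's completion.
    public static double AverageMinutesFromA(A objectA)
    {
        return objectA.BList.Average(
            b => (b.CompletedDate - objectA.CreatedDate).TotalMinutes);
    }
}
```

Each method can then be called once per A in the list, for example inside a foreach over the List of A objects.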

Draco
Thanks, this will indeed do the job. However, as I've got 9 average/min/max-type things to calculate from this data, I was wondering if it could all be done inside one Linq query, similar to the original SQL.
Graham Clark
Oops, just saw your edit. It looks like this will be the easiest way. I saw the Average methods in Linq and thought they might be the way forward. But maybe not.
Graham Clark
Sorry, I'm getting a bit confused. You mean that, given an object of type A, you want to calculate not only the average but also the min, max, etc.?
Draco
Well, the original SQL contains 9 different operations on the data, calculating various averages, minimums, maximums etc. I just picked the two hardest ones for my example and thought I could work the rest out from there. Sorry for the confusion!
Graham Clark
Hmmmm... You could swap .Average for .Min or .Max and work from there?
Draco
+2  A: 

SQL and LINQ differ on certain points. SQL has implicit grouping: when an aggregate is used without a GROUP BY, all of the rows are treated as one group. In LINQ, one may fake implicit grouping by introducing a constant to group on.

var query = 
  from i in Enumerable.Repeat(0, 1)
  from a in AList
  from b in a.BList
  select new {i, a, b} into x
  group x by x.i into g
  select new
  {
    Avg1 = g.Average(x => (x.b.CompletedDate - x.a.CreatedDate)
       .TotalMilliseconds ),
    Avg2 = g.Average(x => (x.b.CompletedDate - x.b.CreatedDate)
       .TotalMilliseconds )
  };
David B
This is great too, and I might end up using this implementation in the end. If only I could have two answers!
Graham Clark
+3  A: 

The more interesting question would be how to do such a thing on the server, for example to make the following query translate to LINQ to SQL.

        var q = from single in Enumerable.Range(1, 1)
                let xs = sourceSequence
                select new
                {
                    Aggregate1 = xs.Sum(),
                    Aggregate2 = xs.Average(),
                    // etc
                };

But there's no point trying to cram it into one query if you're using LINQ to Objects. Just write the aggregates separately:

var aggregate1 = xs.Sum();
var aggregate2 = xs.Average();
// etc

In other words:

// Flatten to (A, B) pairs, using the property names from the question's classes
var xs = from a in listOfAs
         from b in a.BList
         select new { A=a, B=b };

var average1 = xs.Average(x => (x.B.CompletedDate - x.A.CreatedDate).TotalSeconds);
var average2 = xs.Average(x => (x.B.CompletedDate - x.B.CreatedDate).TotalSeconds);

Did I understand right?

Edit: David B answered similarly. I forgot to convert the TimeSpan to a simple numeric type supported by the Average method. Use TotalMilliseconds/Seconds/Minutes or whatever scale is appropriate.
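For anyone wanting to see the separate-aggregates approach end to end, here is a minimal sketch that also covers the min/max case raised in the comments. It assumes the A and B classes from the question. The Stats class, the Summarize method and the tuple element names are invented for the example, and it uses milliseconds to mirror the DATEDIFF(ms, ...) in the original SQL. The tuple return type needs C# 7 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class B
{
    public Guid Id { get; set; }
    public DateTime CreatedDate { get; set; }
    public DateTime CompletedDate { get; set; }
}

class A
{
    public Guid Id { get; set; }
    public DateTime CreatedDate { get; set; }
    public List<B> BList { get; set; }
}

// Hypothetical helper; names are illustrative only.
static class Stats
{
    public static (double AvgFromA, double AvgPerB, double MinPerB, double MaxPerB)
        Summarize(List<A> listOfAs)
    {
        // Flatten every (A, B) pair once and materialise the durations,
        // so each aggregate below walks a small in-memory list.
        var durations = (from a in listOfAs
                         from b in a.BList
                         select new
                         {
                             FromA = (b.CompletedDate - a.CreatedDate).TotalMilliseconds,
                             PerB = (b.CompletedDate - b.CreatedDate).TotalMilliseconds
                         }).ToList();

        // Average, Min and Max throw InvalidOperationException on an
        // empty list, so callers should check for data first.
        return (durations.Average(d => d.FromA),
                durations.Average(d => d.PerB),
                durations.Min(d => d.PerB),
                durations.Max(d => d.PerB));
    }
}
```

Further aggregates follow the same pattern: add another projected field if a new duration is needed, then one more call on the materialised list.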

Pete Montgomery
Thanks, this does the job perfectly! I've gone with the separate aggregate option - this makes things a bit clearer.
Graham Clark