I have a bunch of rows grouped on an attribute called MyID. Now I want the one row from each group where the StatusDate attribute is the highest in that one group.

This is what I've come up with.

rows.Select(x => x.Where(y => y.StatusDate == x.Max(z => z.StatusDate)).First())

With a bit more explanation:

rows.Select(x => // x is a group
  x.Where(y => // get all rows in that group where...
               // the status date is equal to the largest
               // status date in the group
    y.StatusDate == x.Max(z => z.StatusDate)
  ).First()) // and then get the first one of those rows

Is there any faster or more idiomatic way to do this?

+6  A: 

One alternative would be to use:

rows.Select(x => x.OrderByDescending(y => y.StatusDate).First());

... and check that the query optimiser knows that it doesn't really need to sort everything. (This would be disastrous in LINQ to Objects, but you could use MaxBy from MoreLINQ in that case :)
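For the LINQ to Objects case, here is a minimal sketch of the single-pass idea behind a MaxBy-style operator, written with the standard `Aggregate` operator rather than MoreLINQ itself. The `Row` record, its `Note` field, the `LatestPerGroup` class and the sample data are hypothetical, made up for illustration; only `MyID` and `StatusDate` come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type: only MyID and StatusDate appear in the question.
public record Row(int MyID, DateTime StatusDate, string Note);

public static class LatestPerGroup
{
    // Walks the group once, keeping the row with the greatest StatusDate.
    // On a tie, the row that appears first in the group is kept.
    // Aggregate without a seed throws on an empty sequence, but groups
    // produced by GroupBy always contain at least one element.
    public static Row Latest(IEnumerable<Row> group) =>
        group.Aggregate((best, next) =>
            next.StatusDate > best.StatusDate ? next : best);

    public static void Main()
    {
        // Made-up sample data for illustration.
        var data = new List<Row>
        {
            new Row(1, new DateTime(2009, 1, 5), "a"),
            new Row(1, new DateTime(2009, 3, 10), "b"),
            new Row(2, new DateTime(2009, 2, 1), "c"),
            new Row(2, new DateTime(2009, 1, 20), "d"),
        };

        // "rows" here plays the role of the already-grouped sequence
        // from the question.
        var rows = data.GroupBy(r => r.MyID);

        foreach (var row in rows.Select(g => Latest(g)))
        {
            Console.WriteLine($"{row.MyID}: {row.StatusDate:yyyy-MM-dd} ({row.Note})");
        }
    }
}
```

This only applies to in-memory sequences. Against a database provider, the lambda passed to `Aggregate` is unlikely to translate to SQL, so the `OrderByDescending(...).First()` form is the one to try there.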

(Apologies for previous version - I hadn't fully comprehended the grouping bit.)

Jon Skeet
A little off topic but... Can you post an example or link a place where we can see MoreLINQ in action? That `MaxBy` really got me thrilled.
Alex Bagnolini
@Alex: Do you mean just sample code? In this case it would be `rows.Select(x => x.MaxBy(y => y.StatusDate))` - but we don't have a "showcase" of these things, which perhaps we should...
Jon Skeet
This is a good option, but it is hard for me to say which results in more efficient runtime TSQL. Your own solution is pretty good already. I'd run both of these through the debugger to find out what the generated TSQL looks like, then run each through SQL Management Studio directly and take a look at the execution plans, and maybe even run it through performance analyzer and/or profiler too.
Stephen M. Redd
@Jon: There is a really skeletal wiki page on the MoreLINQ project page, and I would probably be just one of those users who would LOVE to know more about how it works and see some examples too!
Alex Bagnolini
@Stephen: I completely agree that blahblah needs to check this in the query optimiser. @Alex: It's all a matter of finding time :) The "how" is easy enough - none of the source is particularly tricky to understand. I just need to find time to provide some examples at some point...
Jon Skeet
Well, you could just use one of them and wait until you have an actual performance problem to do the analysis. I've never been a big fan of pre-optimization for queries... it takes a LOT of time and until you know there is a real problem you can't justify it in terms of ROI.
Stephen M. Redd
A: 

Don't know if this is LINQ to SQL, but if it is, you could alternatively accomplish this via a rank() function in SQL (rank the rows in each group by date, then select the first-ranked row from each), then call this as a stored proc from LINQ. I think that's an approach that is becoming more idiomatic as people hit the boundaries of LINQ2SQL...

Codewerks