views:

624

answers:

5

I want to find the fastest way to get the min and max of a column in a table with a single Linq to SQL roundtrip. So I know this would work in two roundtrips:

int min = MyTable.Min(row => row.FavoriteNumber);
int max = MyTable.Max(row => row.FavoriteNumber);

I know I can use group but I don't have a group by clause, I want to aggregate over the whole table! And I can't use the .Min without grouping first. I did try this:

from row in MyTable 
group row by true into r 
select new { 
    min = r.Min(z => z.FavoriteNumber), 
    max = r.Max(z => z.FavoriteNumber) 
}

But that crazy group clause seems silly, and the SQL it makes is more complex than it needs to be.

So, is there any way to just get the correct SQL out?

EDIT: These guys failed too: http://stackoverflow.com/questions/1597181/linq-to-sql-how-to-aggregate-without-a-group-by ... lame oversight by LINQ designers if there's really no answer.

EDIT 2: I looked at my own solution (with the nonsensical constant group by clause) in the SQL Server Management Studio execution plan analysis, and it looks to me like it is identical to the plan generated by:

SELECT MIN(FavoriteNumber), MAX(FavoriteNumber)
FROM MyTable

so unless someone can come up with a simpler-or-equally-as-good answer, I think I have to mark it as answered-by-myself. Thoughts?

A: 

You could select the whole table, and do your min and max operations in memory:

var cache = // select *

var min = cache.Min(...);
var max = cache.Max(...);

Depending on how large your dataset is, this might be the way to go about not hitting your database more than once.

roosteronacid
At least comment on the down-votes...
roosteronacid
I voted you up one, cause I didn't think you deserved a -1 either, as you met the stated goal. However, my reason for using a single round trip was a) tidiness and b) efficiency, so pulling the whole table would hurt that: not only does it have to transfer the whole table but also it would have to do the min and max by searching through the list twice rather than using the database's indexes.
Scott Stafford
I agree. But if it's a small table, it won't hurt as much as the SQL generated by your code... Performance-wise. And thanks :)
roosteronacid
+1  A: 

A LINQ to SQL query is a single expression. Thus, if you can't express your query in a single expression (or don't like it once you do) then you have to look at other options.

Stored procedures, since they can have statements, enable you to accomplish this in a single round-trip. You will either have two output parameters or select a result set with two rows. Either way, you will need custom code to read the stored procedure's result.

(I don't personally see the need to avoid two round-trips here. It seems like a premature optimization, especially since you will probably have to jump through hoops to get it working. Not to mention the time you will spend justifying this decision and explaining the solution to other developers.)

Put another way: you've already answered your own question. "I can't use the .Min without grouping first", followed by "that crazy group clause seems silly, and the SQL it makes is more complex than it needs to be", are clues that the simple and easily-understood two-round-trip solution is the best expression of your intent (unless you write custom SQL).

Bryan Watts
True enough. If I take as given that the solution is complex enough to use a stored procedure or causes me to use SQL directly, than it's not worth it. I am hoping that there is a simple syntax for using the aggregate functions without a grouping that I just overlooked. But if there isn't, I agree that an extra roundtrip (and maybe an extra database-side pass) is not worth much trouble.
Scott Stafford
+2  A: 

I'm not sure how to translate it into C# yet (I'm working on it)

This is the Haskell version

minAndMax :: Ord a => [a] -> (a,a)
minAndMax [x]    = (x,x)
minAndMax (x:xs) = (min a x, max b x)
                   where (a,b) = minAndMax xs

The C# version should involve Aggregate some how (I think).

Matt Ellen
I'm gonna try this for fun :)
leppie
Ace! I'm stuck just trying to express my code in foldr!
Matt Ellen
+3  A: 

I could find only this one which produces somewhat clean sql still not really effective comparing to select min(val), max(val) from table:

var r =
  (from min in items.OrderBy(i => i.Value)
   from max in items.OrderByDescending(i => i.Value)
   select new {min, max}).First();

the sql is

SELECT TOP (1)
    [t0].[Value],
    [t1].[Value] AS [Value2]
FROM
    [TestTable] AS [t0],
    [TestTable] AS [t1]
ORDER BY
    [t0].[Value],
    [t1].[Value] DESC

still there is another option to use single connection for both min and max queries (see Multiple Active Result Sets (MARS))

or stored procedure..

dh
Top marks for creativity! I tried that in the SQL analyzer, though, and ... not pretty. The custom ordering job it has to do is apparently very painful.
Scott Stafford
+2  A: 

As stated in the question, this method seems to actually generate optimal SQL code, so while it looks a bit squirrely in LINQ, it should be optimal performance-wise.

from row in MyTable  
group row by true into r  
select new {  
    min = r.Min(z => z.FavoriteNumber),  
    max = r.Max(z => z.FavoriteNumber)  
} 
Scott Stafford