views:

3575

answers:

3

If I have a table with a title column and 3 bit columns (f1, f2, f3) that contain either 1 or NULL, how would I write the LINQ to return the title with the count of each bit column that contains 1? I'm looking for the equivalent of this SQL query:

SELECT title, COUNT(f1), COUNT(f2), COUNT(f3) FROM myTable GROUP BY title

I'm looking for the "best" way to do it. The version I came up with dips into the table 4 times when you look at the underlying SQL, so it's too slow.

A: 

I think this is where LINQ falls down. If you want efficient use the SQL, if you want nice code, use LINQ.

You could always execute the query directly, since you know the SQL already.

class TitleCount {
    public string Title;
    public int Count1;
    public int Count2;
    public int Count3;
}

DataContext dc = new DataContext("Connection string to db");

IEnumerable<TitleCount> query = dc.ExecuteQuery<TitleCount>(
    @"SELECT title, 
             COUNT(f1) as Count1, 
             COUNT(f2) as Count2, 
             COUNT(f3) as Count3 
       FROM myTable GROUP BY title");
Cameron MacFarland
A: 

If you want to stick to a LINQ query and use an anonymous type, the query could look like:

 var query = 
      from r in ctx.myTable
      group r by r.title into rgroup
      select new
      {
          Title = rgroup.Key,
          F1Count = rgroup.Count(rg => rg.f1 == true),
          F2Count = rgroup.Count(rg => rg.f2 == true),
          F3Count = rgroup.Count(rg => rg.f3 == true)
      };

The trick is to recognize that you want to count the number of true fields (it gets mapped as a nullable bool), which you can do with the Count operator and a predicate. More info on the LINQ group operator here: The Standard LINQ Operators

OdeToCode
That's the solution I came up with originally, but the resulting query uses sub-selects to produce each individual count. On a table with a half-million rows (no indexes), this version takes 7 seconds, while the proper SQL with 3 COUNT's is instantaneous.
gfrizzle
+1  A: 

Here's the solution I came up with. Note that it's close to the solution proposed by @OdeToCode (but in VB syntax), with one major difference:

Dim temp = _
    (From t In context.MyTable _
     Group t.f1, t.f2, t.f3 By t.title Into g = Group _
     Select title, g).ToList

Dim results = _
    From t In temp _
    Select t.title, _
        f1_count = t.g.Count(Function(x) If(x.f1, False)), _
        f2_count = t.g.Count(Function(x) If(x.f2, False)), _
        f3_count = t.g.Count(Function(x) If(x.f3, False))

The first query does the grouping, but the ToList gets the grouped data as-is from the server. Eliminating the counting here keeps the resulting SQL statement from producing sub-SELECTs for each count. I do the counting in the second query locally.

This works since I know the first query will return a manageable number of rows. If it were returning millions of rows, I'd probably have to go in another direction.

gfrizzle