I've got some objects which implement this interface:

public interface IRow
{
  void Fill(DataRow dr);
}

Usually when I select something out of db, I go:

public IEnumerable<IRow> SelectSomeRows()
{
  DataTable table = GetTableFromDatabase();
  foreach (DataRow dr in table.Rows)
  {
    IRow row = new MySQLRow(); // Disregard the MySQLRow type, it's not important
    row.Fill(dr);
    yield return row;
  }
}

Now with .Net 4, I'd like to use AsParallel, and thus LINQ.

I've done some testing on it, and it speeds things up a lot (IRow.Fill uses Reflection, so it's hard on the CPU).

Anyway, my problem is: how do I go about creating a LINQ query which calls Fill as part of the query, so it's properly parallelized?

For testing performance I created a constructor which took the DataRow as argument, however I'd really love to avoid this if somehow possible.

With the constructor in place, it's obviously simple enough:

public IEnumerable<IRow> SelectSomeRowsParallel()
{
  DataTable table = GetTableFromDatabase();
  return from DataRow dr in table.Rows.AsParallel()
         select new MySQLRow(dr);
}

However like I said, I'd really love to be able to just stuff my Fill method into the LINQ query, and thus not need the constructor overload.

+3  A: 

You need to make a multi-statement lambda expression, like this:

table.AsEnumerable().AsParallel().Select(dr => {
    IRow row = new MySQLRow(); 
    row.Fill(dr);
    return row;
});
SLaks
You were first, so you get the "correct" flag :-) Thanks for the many answers, always nice to learn something.
Steffen
+1  A: 

The answer is luckily very simple. Just do it :) There's nothing keeping you from simply calling a method in the select part of the query:

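public IEnumerable<IRow> SelectSomeRowsParallel()
{
  DataTable table = GetTableFromDatabase();
  // A lambda has no type on its own here, so cast it to a delegate and invoke it.
  return from DataRow dr in table.Rows.AsParallel()
         select ((Func<DataRow, IRow>)(row =>
         {
           var mysqlRow = new MySQLRow();
           mysqlRow.Fill(row);
           return mysqlRow;
         }))(dr);
}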

I'm not sure that you can put the lambda in there directly (it's been a few years since I had the chance to write LINQ). If you cannot, assign it to a Func:

Func<DataRow, IRow> getRow = row =>
{
    var mysqlRow = new MySQLRow();
    mysqlRow.Fill(row);
    return mysqlRow;
};

and then call that in your select clause
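
As a minimal sketch of that last step: the snippet below stubs the question's IRow and MySQLRow so it is self-contained. The MySQLRow body, the Id column, and the RowSelector class are hypothetical stand-ins, and the table is passed in rather than fetched with GetTableFromDatabase.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public interface IRow
{
    void Fill(DataRow dr);
}

// Hypothetical stand-in for the question's MySQLRow type.
public class MySQLRow : IRow
{
    public int Id { get; private set; }

    public void Fill(DataRow dr)
    {
        Id = (int)dr["Id"]; // assumes an int column named "Id"
    }
}

public static class RowSelector
{
    // The delegate does the imperative work (create, Fill, return).
    private static readonly Func<DataRow, IRow> getRow = row =>
    {
        var mysqlRow = new MySQLRow();
        mysqlRow.Fill(row);
        return mysqlRow;
    };

    public static IEnumerable<IRow> SelectSomeRowsParallel(DataTable table)
    {
        // The query itself only invokes the delegate in its select clause.
        return from DataRow dr in table.Rows.AsParallel()
               select getRow(dr);
    }
}
```

Because the select clause is just an expression, any call that returns a value can go there, which is why wrapping Fill in a delegate works even though Fill itself returns void.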

Rune FS
Did you try compiling this, or at least typing it into C# editor which detects syntax errors?
Tomas Petricek
@Thomas nope that's rather hard on my phone but thx for the heads up. Tried compiling in my head and found a few. Nothing affecting the point of the answer but the code is more compiler friendly at least :)
Rune FS
+1  A: 

I don't think there is any way to place an imperative operation (such as a call to the Fill method, which returns void) into the LINQ query syntax, but you can do the same thing using an explicit call to the Select method, which allows you to use arbitrary code:

DataTable table = GetTableFromDatabase(); 
return table.Rows.Cast<DataRow>().AsParallel().Select(dr => {
  IRow row = new MySQLRow(); 
  row.Fill(dr); 
  return row; });

You need to add the call to Cast (because DataRowCollection, the type of table.Rows, implements only the non-generic IEnumerable), and the rest of the code is pretty straightforward. Your original query would translate to essentially these calls.

If you wanted to do some tricks, you could modify the interface, such that the Fill method returns something (e.g. int). Then you could use the let clause and ignore the returned value.

return from DataRow dr in table.Rows.AsParallel()
       let row = (IRow)new MySQLRow()
       let _ = row.Fill(dr) // ignoring return value; '_' is just variable name
       select row;

It is possible to use this trick to call methods that return something, but not methods that return void.
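
For illustration, here is a self-contained sketch of the let trick with a modified interface. IFillableRow, DemoRow, LetTrick, and the Name column are hypothetical names introduced only for this example; they are not part of the question's code.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical variant of the interface: Fill returns a value,
// so the call is an expression that a let clause can bind.
public interface IFillableRow
{
    int Fill(DataRow dr);
}

public class DemoRow : IFillableRow
{
    public string Name { get; private set; }

    public int Fill(DataRow dr)
    {
        Name = (string)dr["Name"]; // assumes a string column named "Name"
        return 0;                  // dummy value, ignored by the query
    }
}

public static class LetTrick
{
    public static IEnumerable<IFillableRow> SelectRows(DataTable table)
    {
        return from DataRow dr in table.Rows.AsParallel()
               let row = (IFillableRow)new DemoRow()
               let ignored = row.Fill(dr) // runs Fill; the result is discarded
               select row;
    }
}
```

The explicit Select call is probably the cleaner choice, since it leaves the interface untouched; this variant only shows that query syntax can carry the call once Fill returns a value.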

Tomas Petricek