I've written some code to try and describe my concern:

static void Main(string[] args)
{
    IEnumerable<decimal> marks = GetClassMarks();
    IEnumerable<Person> students = GetStudents();

    students.AsParallel().ForAll(p => GenerateClassReport(p, marks));

    Console.ReadKey();
}

GetClassMarks uses yield return internally to produce values from my weird data source. Assume that GenerateClassReport basically does marks.Sum() / marks.Count() to get the class average.

From what I understand, students.AsParallel().ForAll is a parallel foreach.

My worry is what is going to happen inside the GetClassMarks method.

  • Is it going to be enumerated once or many times?
  • What order is the enumeration going to happen in?
  • Do I need to do a .ToList() on marks to make sure it is only hit once?
A: 

Is it going to be enumerated once or many times?

Just once.

What order is the enumeration going to happen in?

The iterator (function using yield) determines the order.

Do I need to do a .ToList() on marks to make sure it is only hit once?

No.

AsParallel iterates through its input (here, students) only once, partitioning that input into blocks which are dispatched to worker threads.

Richard
+1  A: 
  1. If GetClassMarks is an iterator (that is, if it uses yield internally), then it is effectively a query that will be re-executed whenever you call marks.Sum(), marks.Count(), etc.

  2. It's almost impossible to predict the order of execution in a parallel query.

  3. Yes. The following will ensure that GetClassMarks is only executed once. Subsequent calls to marks.Sum(), marks.Count(), etc. will use the concrete list rather than re-executing the GetClassMarks query.

    List<decimal> marks = GetClassMarks().ToList();
    

Note that points 1 and 3 apply whether or not you're using AsParallel. The GetClassMarks query will be executed exactly the same number of times in either case (assuming that the rest of the code, except for the parallel aspects, is the same).
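
To make points 1 and 3 concrete, here is a minimal sketch that counts how often the iterator body runs. The Runs counter, the hard-coded marks and the Average helper are assumptions made for illustration; the real GetClassMarks reads from the asker's data source.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Incremented every time the iterator body starts executing.
    public static int Runs;

    // Hypothetical stand-in for the asker's GetClassMarks().
    public static IEnumerable<decimal> GetClassMarks()
    {
        Runs++;
        yield return 50m;
        yield return 70m;
        yield return 90m;
    }

    // Mirrors the marks.Sum() / marks.Count() calculation from the question.
    public static decimal Average(IEnumerable<decimal> marks)
    {
        return marks.Sum() / marks.Count();
    }

    public static void Main()
    {
        Runs = 0;
        IEnumerable<decimal> lazy = GetClassMarks();
        Console.WriteLine(Runs);          // 0: calling the method runs nothing yet
        Console.WriteLine(Average(lazy)); // 70
        Console.WriteLine(Runs);          // 2: Sum() and Count() each ran the iterator

        Runs = 0;
        List<decimal> cached = GetClassMarks().ToList();
        Console.WriteLine(Average(cached)); // 70
        Console.WriteLine(Runs);            // 1: only ToList() ran the iterator
    }
}
```

None of this involves AsParallel, which matches the note above: the re-execution comes from deferred execution itself, not from parallelism.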

LukeH
+3  A: 

Is it going to be enumerated once or many times?

Assuming that GenerateClassReport() enumerates marks once, then marks will be enumerated once for each element in students.

What order is the enumeration going to happen in?

Each thread will enumerate the collection in its default order, but several threads will do so concurrently. The concurrent enumeration order is generally unpredictable. Also, you should note that the number of threads is limited and variable, so most likely not all of the enumerations will occur concurrently.

Do I need to do a .ToList() on marks to make sure it is only hit once?

If GetClassMarks() is an iterator (i.e. it uses the yield construct), then its execution is deferred and its body runs once each time marks is enumerated (i.e. once for each element in students, given the assumption above). If you use IEnumerable<decimal> marks = GetClassMarks().ToList(), or if GetClassMarks() internally returns a concrete list or array, then GetClassMarks() is executed immediately and the stored results are enumerated in each of the parallel threads without calling GetClassMarks() again.
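
The behaviour described above can be checked with a small sketch along these lines. The student names, the hard-coded marks, the Runs counter and the RunReports helper are all assumptions for illustration, and GenerateClassReport here deliberately enumerates marks exactly once per call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public static class ParallelDemo
{
    // Incremented (thread-safely) every time the iterator body starts executing.
    public static int Runs;

    // Hypothetical stand-in for the asker's GetClassMarks().
    public static IEnumerable<decimal> GetClassMarks()
    {
        Interlocked.Increment(ref Runs);
        yield return 50m;
        yield return 70m;
        yield return 90m;
    }

    // Enumerates marks exactly once to compute the class average.
    public static void GenerateClassReport(string student, IEnumerable<decimal> marks)
    {
        decimal sum = 0m;
        int count = 0;
        foreach (decimal mark in marks)
        {
            sum += mark;
            count++;
        }
        Console.WriteLine($"{student}: class average {sum / count}");
    }

    // Same shape as the question: one report per student, run in parallel.
    public static void RunReports(IEnumerable<decimal> marks)
    {
        string[] students = { "Ann", "Bob", "Cid", "Dee" };
        students.AsParallel().ForAll(s => GenerateClassReport(s, marks));
    }

    public static void Main()
    {
        Runs = 0;
        IEnumerable<decimal> lazy = GetClassMarks();
        RunReports(lazy);
        Console.WriteLine(Runs); // 4: one iterator run per student

        Runs = 0;
        IEnumerable<decimal> cached = GetClassMarks().ToList();
        RunReports(cached);
        Console.WriteLine(Runs); // 1: only ToList() ran the iterator
    }
}
```

The order of the per-student report lines can differ from run to run, but the counts do not depend on how the work is scheduled.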

Jack Leitch