ParallelEnumerable has a static member AsParallel. If I have an IEnumerable<T> and want to use Parallel.ForEach, does that mean I should always call AsParallel on it first?

For example, are both of these correct (everything else being equal)?

Without AsParallel:

List<string> list = new List<string>();
Parallel.ForEach<string>(GetFileList().Where(file => reader.Match(file)), f => list.Add(f));

Or with AsParallel:

List<string> list = new List<string>();
Parallel.ForEach<string>(GetFileList().Where(file => reader.Match(file)).AsParallel(), f => list.Add(f));
+5  A: 

It depends on what's being called; they are separate issues.

.AsParallel() parallelizes the enumeration, not the delegation of tasks.

Parallel.ForEach parallelizes the loop, assigning tasks to worker threads for each element.

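So unless your source enumeration gains from becoming parallel (e.g. reader.Match(file) is expensive), they are equivalent. For the Where clause itself to run in parallel, though, .AsParallel() has to come before it in the chain; placed after it, the filter still runs sequentially. To your last question: yes, both are correct, with the caveat that List<T>.Add is not thread-safe, so the loop body needs a lock or a concurrent collection such as ConcurrentBag<string>.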
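To make the distinction concrete, here is a minimal sketch of the two shapes. `GetFileList()` and `Match()` below are hypothetical stand-ins for the question's `GetFileList()` and `reader.Match(file)`, and `ConcurrentBag<string>` is swapped in for `List<string>` as an assumption so that concurrent adds are safe.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class AsParallelPlacementSketch
{
    // Hypothetical stand-in for the question's GetFileList().
    public static IEnumerable<string> GetFileList() =>
        new[] { "a.txt", "b.log", "c.txt", "d.bin", "e.txt" };

    // Hypothetical stand-in for reader.Match(file); imagine it is expensive.
    public static bool Match(string file) =>
        file.EndsWith(".txt", StringComparison.Ordinal);

    // Sequential Where, parallel loop body.
    public static List<string> FilterThenParallelLoop()
    {
        var results = new ConcurrentBag<string>(); // safe for concurrent Add
        Parallel.ForEach(GetFileList().Where(f => Match(f)), f => results.Add(f));
        var sorted = results.ToList();
        sorted.Sort(StringComparer.Ordinal);       // bag order is unspecified
        return sorted;
    }

    // Parallel Where: AsParallel() comes before the filter.
    public static List<string> ParallelFilter()
    {
        var sorted = GetFileList().AsParallel().Where(f => Match(f)).ToList();
        sorted.Sort(StringComparer.Ordinal);       // PLINQ does not preserve order by default
        return sorted;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", FilterThenParallelLoop()));
        Console.WriteLine(string.Join(", ", ParallelFilter()));
    }
}
```

Both calls print `a.txt, c.txt, e.txt` for this sample data; the difference is only in which stage (the filter or the loop body) gets spread across threads.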

Also, there's one other construct you may want to look at that shortens it a bit while still getting the full benefit of PLINQ:

// ForAll is defined on ParallelQuery<T>, so AsParallel() must come first.
GetFileList().AsParallel().Where(file => reader.Match(file)).ForAll(f => list.Add(f));
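A fuller sketch of the ForAll form might look like the following. The `GetFileList()` and `Match()` helpers are again hypothetical stand-ins, and the use of `ConcurrentBag<string>` is an assumption to keep the adds thread-safe. When the only goal is to collect the matches, letting PLINQ build the list with `ToList()` avoids shared mutable state altogether.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

public static class ForAllSketch
{
    // Hypothetical stand-in for the question's GetFileList().
    public static IEnumerable<string> GetFileList() =>
        new[] { "a.txt", "b.log", "c.txt", "d.bin", "e.txt" };

    // Hypothetical stand-in for reader.Match(file).
    public static bool Match(string file) =>
        file.EndsWith(".txt", StringComparison.Ordinal);

    // ForAll runs the action on the PLINQ worker threads, so the
    // target collection has to tolerate concurrent writers.
    public static int CollectWithForAll()
    {
        var bag = new ConcurrentBag<string>();
        GetFileList().AsParallel().Where(f => Match(f)).ForAll(f => bag.Add(f));
        return bag.Count;
    }

    // Alternative: let PLINQ merge the results into a list itself.
    public static List<string> CollectWithToList() =>
        GetFileList().AsParallel().Where(f => Match(f)).ToList();

    public static void Main()
    {
        Console.WriteLine(CollectWithForAll());
        Console.WriteLine(CollectWithToList().Count);
    }
}
```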
Nick Craver
Hmmm... what exactly is parallelizing the enumeration? Or at least, how can that parallelization be separated from the task delegation?
dkackman
@dkackman `.AsParallel()` readies the enumeration for parallel execution, specifically the parallel version of `.Where()` in this case. Think of an enumeration that has a hefty `Where` clause but no required order: we can evaluate that clause simultaneously across as many cores as are available, handing the next element to the next free thread, which can make it almost `n` times faster on `n` cores. What we do with the results afterwards can be handled either way, synchronously on one thread or spread across cores as available; that's the `Parallel.ForEach` or `.ForAll` part. Make sense?
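As a rough illustration of the "synchronously in one thread" option, the sketch below filters in parallel but consumes the results with an ordinary `foreach`. The helper names are hypothetical stand-ins for the question's code. Because the loop body runs only on the calling thread, a plain `List<string>` is fine here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public static class TwoStagesSketch
{
    // Hypothetical stand-in for the question's GetFileList().
    public static IEnumerable<string> GetFileList() =>
        new[] { "a.txt", "b.log", "c.txt", "d.bin", "e.txt" };

    // Hypothetical stand-in for reader.Match(file).
    public static bool Match(string file) =>
        file.EndsWith(".txt", StringComparison.Ordinal);

    // Fills 'matched' and returns how many distinct threads ran the loop body.
    public static int ConsumeSequentially(List<string> matched)
    {
        var consumerThreads = new HashSet<int>();

        // Stage 1: the predicate may be evaluated on several worker threads.
        ParallelQuery<string> query = GetFileList().AsParallel().Where(f => Match(f));

        // Stage 2: foreach pulls the merged results on the calling thread.
        foreach (string f in query)
        {
            consumerThreads.Add(Thread.CurrentThread.ManagedThreadId);
            matched.Add(f); // only this thread touches the list
        }
        return consumerThreads.Count;
    }

    public static void Main()
    {
        var matched = new List<string>();
        Console.WriteLine(ConsumeSequentially(matched)); // one consuming thread
        Console.WriteLine(matched.Count);
    }
}
```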
Nick Craver
That does make sense. Thanks Nick.
dkackman