views:

186

answers:

3

I need to parallelize a method that does an exhaustive pairwise comparison on elements in a list. The serial implementation is straight-forward:

foreach (var element1 in list)
    foreach (var element2 in list)
        foo(element1, element2);

In this case, foo won't alter the state of element1 or element2. I know it's not safe to simply do nested Parallel.ForEach statements:

Parallel.ForEach(list, delegate(A element1)
{
    Parallel.ForEach(list, delegate(A element2)
    {
        foo(element1, element2);
    });
});

What would be the ideal way to implement this using the parallel tasks library?

+5  A: 

Couldn't you just have one Parallel and one normal loop? So either

Parallel.ForEach(list, delegate(A element1)
{
  foreach(A element2 in list)
    foo(element1, element2)
});

or

foreach(A element1 in list)
{
  Parallel.ForEach(list, delegate(A element2)
  {
    foo(element1, element2);
  });
}

Should speed it up as well. There was never going to be a thread per cycle anyway, so this would probably be just as fast or slightly slower than nested parallel loops.

Jouke van der Maas
+2  A: 

At least if you are executing the code on a machine where the number of cores is at least twice the number of items in the list, I'm not sure it is a good idea to do embedded Parallel.ForEachs.

In other words, if you target a quad-core, and the list has one thousand items, just parallelize the parent loop. Parallelizing both loops would not make the code faster, but rather much, much slower, since parallel tasks have performance cost.

alt text

At each iteration, a few milliseconds will be lost by Parallel.ForEach to determine which thread must execute the next iteration. Let's say you have a set of 7 items. If you parallelize the parent loop, those milliseconds will be lost 7 times. If you parallelize both loops, they will be lost 7×7=49 times instead. Larger is the set, bigger is the overheat.

MainMa
Don't assume that PFX will create as many threads as there are parallel tasks - it's smarter than that.
Jon Skeet
Of course not. By default, it creates as many threads as cores. But the problem is that after each iteration, it will spend time trying to find which thread must execute the next iteration.
MainMa
I don't think he's saying that there's going to be that many threads, just that enqueueing a task for every function call is going to have much more overhead than just invoking the PFX engine for each outer loop.
Gabe
Thanks MainMa. It's probably worth noting that foo is likely to take a long time compared to the cost of the Parallel.ForEach call. I do appreciate the general argument for only parallelizing one loop.
Wesley Tansey
A: 

The two nested loops essentially mean that you want to foo the cartessian product of the list with itself. You can parallelize the entire operation by first creating all pairs in a temporary list, then iterating over that list with Parallel.ForEach.

EDIT: Instead of creating a list of all combinations, you can use an iterator to return a 2-element tuple with the combination. Parallel.ForEach will still parallelize the processing of the tuples.

The following sample prints out the current iteration step to show that results come back out-of-order, as would be expected during parallel processing:

 const int SIZE = 10;
    static void Main(string[] args)
    {
        List<int> list = new List<int>(SIZE);
        for(int i=0;i<SIZE;i++)
        {
            list.Add(i);
        }


        Parallel.ForEach(GetCombinations(list),(t,state,l)=>
            Console.WriteLine("{0},{1},{2}",l,t.Item1,t.Item2));

    }

    static IEnumerable<Tuple<int,int>> GetCombinations(List<int> list)
    {
        for(int i=0;i<list.Count;i++)
            for(int j=0;j<list.Count;j++)
                yield return Tuple.Create(list[i],list[j]);
    }
Panagiotis Kanavos