views:

143

answers:

1

I am new to C#, Parallel.ForEach, and .NET in general. I want to parallelize a search that involves thousands of locations. For each location, I compute great circle distance. That is a calculation I want to spread to different cores. My question is how do I do this if I only have one thread-local variable, as in this MSDN TPL example? For the result, I looked at Interlocked, and saw its options Add, CompareExchange, Decrement, Exchange, Increment and Read, but I'm not just adding, incrementing, decrementing, or testing for equality. I want to return the object, over several threads running in parallel, that has the shortest overall distance. My gut says this should be easy, that I should be able to create some little object that wraps a Location and a distance, but how do I capture the best answer from each thread and then choose the shortest distance among them? Here is the non-parallel version:

Location findClosestLocation(Location myLocation, List<Location> allLocations)
{
  double closest = double.MaxValue;
  Location closestLoc = null;
  foreach (Location aLoc in allLocations)
  {
    if (aLoc != myLocation)
    {
      double d = greatCircle(myLocation, aLoc);
      if (d < closest)
      {
        closest = d;
        closestLoc = aLoc;
      }
    }
  }
  return closestLoc;
}

I did see a DDJ Blog Post that seemed to offer good advice, but I wondered if it was the best advice. I see the author looping over arrays, and wonder if there isn't a more functional way of doing this. In the functional world I would use map, lambda and min.

+5  A: 

The easiest option here would be to switch to PLINQ:

Location findClosestLocation(Location myLocation, List<Location> allLocations)
{
     return allLocations
               .AsParallel()
               .Min(location => greatCircle(myLocation, location));
}

That being said, this is basically just aggregation with parallel constructs. You have a couple of options if you want to stick to the Parallel class. One option would be to synchronize this yourself within the block, using locking. I wouldn't recommend this, as it will hurt your overall performance.

The better option is to use the Parallel.ForEach methods which provide for local state. They would allow you to rewrite this as:

Location findClosestLocation(Location myLocation, List<Location> allLocations)
{
  double closest = double.MaxValue;
  Location closestLoc = null;
  object sync = new object();

  Parallel.ForEach<Location, Tuple<double,Location>(
      allLocations,
      () => new Tuple(double.MaxValue, null),
      (location, loopState, localState) =>
      {
          double d = greatCircle(myLocation, aLoc);
          if (d < localState.Item1)
              return new Tuple(d, aLoc);
          else
              return localState;
      },
      localState =>
      {
          lock(sync)
          {
              if (localState.Item1 < closest)
              {
                  closest = localState.Item1;
                  closestLoc = localState.Item2;
              }
          }
      }
  );
  return closestLoc;
}

I cover using local state for aggregations in detail on my blog. This basically changes the operation to one lock operation per thread instead of one lock per processing element, so you get much higher throughput than a naive locking solution.

Reed Copsey
Wow, that's about as short and sweet as it gets. It makes my functional heart sing. Thanks!
gknauth
@gknauth: Yeah - I added the (correct/performant) Parallel.ForEach option too. I recommend learning **how** the other option works, and trying to understand it. It's valuable, though, for most of these operations, I just use PLINQ since it's awesome.
Reed Copsey
Thanks for clueing me into your blog too, looks great. Write a book, I'll buy it! Or if that's too much work, I'll buy you dinner.
gknauth
@gknauth: Heh, thanks - I've considered it, but there are some good ones coming out shortly. The Patterns and Practices TPL book will be out electronically in August (print in Oct.). I helped review it - it's a good one. In the meantime, I've got a ton of stuff on this topic on my blog (with more on the way ;) )
Reed Copsey