I'm working on a small simulation running on my 8-core workstation. The simulation models the interactions between a large number of independent nodes. During one phase I need to perform a simple atomic operation on each node in parallel, and I have been using Parallel.ForEach from System.Threading.Tasks to apply the operation concurrently to each node in the list of all nodes.

This worked well for the 100-500 nodes I used for testing: the load was balanced well, with all cores constantly utilized. Unfortunately, when I attempt to run the simulation with the main dataset (5000+ nodes), everything goes wrong. All 8 cores stay idle most of the time, spiking to 100% every few seconds and then returning to 1% utilization. After a few minutes of this an OutOfMemoryException is thrown and the program crashes.

I am not completely sure what is wrong, but I suspect that my current code is spawning many more threads than would be optimal for the task. I think the ideal method would be for the program to detect the number of available cores N, partition the list of nodes into N segments, then spawn N threads, giving each thread a separate partition of the list.

Is this a good solution to the problem? Do better ones exist, and how should this be implemented in C#? Any advice or comments are welcome.

EDIT: Code sample by request

Parallel.ForEach(listOfNodes, tempNode =>
{
    tempNode.foo();
});

<snip>

void foo()
{
    foreach (myType bar in listOfmyType)
    {
        if (bar.isActive)
            setNodeActive();
    }
}
+2  A: 

> I think the ideal method would be for the model to detect the number of available cores N, partition the list of nodes into N segments, then spawn N threads, giving each thread a separate partition of the list.

Which is exactly what Parallel.ForEach does, so there must be another problem.

It's going to be very hard to come up with a better thread-management system yourself. But you can plug custom schedulers and partitioners into the Task Parallel Library.
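For instance, range partitioning can be requested explicitly with `Partitioner.Create`, which hands each worker a contiguous chunk of index space instead of one item at a time. A minimal sketch, where the `List<int>` and the `Math.Sqrt` call are stand-ins for the question's node list and `tempNode.foo()`:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

class PartitionDemo
{
    static void Main()
    {
        var listOfNodes = new List<int>();          // stand-in for the real node list
        for (int i = 0; i < 5000; i++) listOfNodes.Add(i);

        // Partitioner.Create(0, Count) yields contiguous index ranges,
        // so each worker processes a chunk rather than a single node per task.
        Parallel.ForEach(Partitioner.Create(0, listOfNodes.Count), range =>
        {
            for (int i = range.Item1; i < range.Item2; i++)
            {
                // tempNode.foo() would go here; Math.Sqrt stands in for the work.
                double _ = Math.Sqrt(listOfNodes[i]);
            }
        });

        Console.WriteLine("done");
    }
}
```

This keeps the number of in-flight work items close to the number of workers, rather than queuing one task per node.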

Henk Holterman
Hm, I was afraid of that. Thanks for the information.
Mandelbrot
+3  A: 

See this thread, which discusses limiting the number of threads that Parallel.For uses to avoid memory starvation:

http://connect.microsoft.com/VisualStudio/feedback/details/534571/parallel-foreach-may-create-an-inordinate-number-of-threads

I would try setting ParallelOptions.MaxDegreeOfParallelism to about 500, and see what happens.
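Capping the degree of parallelism is a one-line change via `ParallelOptions`. A sketch against the question's loop, where the cap of 8 and the `Math.Sqrt` stand-in for `tempNode.foo()` are assumptions to tune:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

class MaxDopDemo
{
    static void Main()
    {
        var listOfNodes = new List<int>();          // stand-in for the real node list
        for (int i = 0; i < 5000; i++) listOfNodes.Add(i);

        var options = new ParallelOptions
        {
            // Never run more than 8 node operations concurrently.
            MaxDegreeOfParallelism = 8
        };

        Parallel.ForEach(listOfNodes, options, tempNode =>
        {
            // tempNode.foo() would go here.
            double _ = Math.Sqrt(tempNode);
        });
    }
}
```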

Robert Harvey
500 still seems awfully high; can you explain? My estimate would be 10-20 (8 cores, little or no I/O).
Henk Holterman
Yes, I would have thought 8 or 10 would be a good limit. Why would 500 be a good starting estimate?
Mandelbrot
Because you said it worked fine with 500 nodes. It doesn't have to be 500. I have a Windows Service that executes long-running tasks in the background behind a web site. It uses a `ThreadPool` object. I ran some tests on it, and found that the ideal number of simultaneously executing threads is the number of cores in the machine, so YMMV.
Robert Harvey
Ah, that makes sense. Thanks!
Mandelbrot
Yes, the ideal for a purely CPU-bound system is to have N threads (N = #cores) running. But you need some room to compensate for threads that are waiting (on I/O or locks). So 8+ is your starting point. Don't forget: 1 thread = 1 MB of stack.
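That rule of thumb (core count plus a little headroom for blocked threads) can be computed at run time with `Environment.ProcessorCount`; the +2 headroom here is an illustrative assumption, not a measured value:

```csharp
using System;
using System.Threading.Tasks;

class DopFromCores
{
    static void Main()
    {
        // Start from the number of cores, plus a small allowance for
        // threads that are blocked on I/O or locks rather than computing.
        int workers = Environment.ProcessorCount + 2;

        var options = new ParallelOptions { MaxDegreeOfParallelism = workers };
        Console.WriteLine($"Using up to {workers} worker threads");
    }
}
```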
Henk Holterman