ansaurus

Question

How can I most effectively take advantage of multiple cores for short computations in .NET?

Answer 1

+14 A:

If you want to do a lot with parallel operations, you're going to want to start with .Net 4.0. Here's the Parallel Programming for .Net documentation. You'll want to start here though. .Net 4.0 adds a LOT in terms of multi-core utilization. Here's a quick example:

Current 3.5 Serial method:

for(int i = 0; i < 30000; i++)
{
  doSomething(i);
}

New .Net 4.0 Parallel method:

Parallel.For(0, 30000, (i) => doSomething(i));

The Parallel.For method automatically scales across the number of cores available, you can see how fast you could start taking advantage of this. There are dozens of new libraries in the framework, supporting full thread/task management like your example as well (Including all the piping for syncing, cancellation, etc).

There are libraries for Parallel LINQ (PLINQ), Task Factories, Task Schedulers and a few others. In short, for the specific task you laid out .Net 4.0 has huge benefits for you, and I'd go ahead and grab the free beta 2 (RC coming soon) and get started. (No, I don't work for Microsoft...but rarely do I see an upcoming release fulfill a need so perfectly, so I highly recommend .Net 4.0 for you)

Nick Craver 2010-02-07 06:31:43

If you can't use .NET 4.0 for whatever reason, have a look at ThreadPool

Johannes Rudolph 2010-02-07 07:37:48

That's pretty great stuff. I downloaded VS 2010, and I have installed .NET 4.0. I am however wondering how to use .NET 4.0 from VS 2008 (which I have a commercial version of).

cdiggins 2010-02-07 14:10:00

@cdigging - In short...you can't, It's like when VS2003 was .Net 1.1, VS2005 was .Net 2.0, .Net CLR releases are always (at least to this point) tied to a new visual studio. The RTM version of VS 2010 is due out April 12th, RC version should be out sometime this month.

Nick Craver 2010-02-07 14:15:20

Answer 2

+5 A:

Because I didn't want to develop using VS 2010, and I found that ThreadPool didn't have optimal performance for distributing work across cores (I think because it started/stopped too many threads) I ended up rolling my own. Hope that others find this useful:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading;

namespace HeronEngine
{
    /// <summary>
    /// Represents a work item.
    /// </summary>
    public delegate void Task();

    /// <summary>
    /// This class is intended to efficiently distribute work 
    /// across the number of cores. 
    /// </summary>
    public static class Parallelizer 
    {
        /// <summary>
        /// List of tasks that haven't been yet acquired by a thread 
        /// </summary>
        static List<Task> allTasks = new List<Task>();

        /// <summary>
        /// List of threads. Should be one per core. 
        /// </summary>
        static List<Thread> threads = new List<Thread>();

        /// <summary>
        /// When set signals that there is more work to be done
        /// </summary>
        static ManualResetEvent signal = new ManualResetEvent(false);

        /// <summary>
        /// Used to tell threads to stop working.
        /// </summary>
        static bool shuttingDown = false;

        /// <summary>
        /// Creates a number of high-priority threads for performing 
        /// work. The hope is that the OS will assign each thread to 
        /// a separate core.
        /// </summary>
        /// <param name="cores"></param>
        public static void Initialize(int cores)
        {
            for (int i = 0; i < cores; ++i)
            {
                Thread t = new Thread(ThreadMain);
                // This system is not designed to play well with others
                t.Priority = ThreadPriority.Highest;
                threads.Add(t);
                t.Start();
            }
        }

        /// <summary>
        /// Indicates to all threads that there is work
        /// to be done.
        /// </summary>
        public static void ReleaseThreads()
        {
            signal.Set();
        }

        /// <summary>
        /// Used to indicate that there is no more work 
        /// to be done, by unsetting the signal. Note: 
        /// will not work if shutting down.
        /// </summary>
        public static void BlockThreads()
        {
            if (!shuttingDown)
                signal.Reset();
        }

        /// <summary>
        /// Returns any tasks queued up to perform, 
        /// or NULL if there is no work. It will reset
        /// the global signal effectively blocking all threads
        /// if there is no more work to be done.
        /// </summary>
        /// <returns></returns>
        public static Task GetTask()
        {
            lock (allTasks)
            {
                if (allTasks.Count == 0)
                {
                    BlockThreads();
                    return null;
                }
                Task t = allTasks.Peek();
                allTasks.Pop();
                return t;
            }
        }

        /// <summary>
        /// Primary function for each thread
        /// </summary>
        public static void ThreadMain()
        {
            while (!shuttingDown)
            {
                // Wait until work is available
                signal.WaitOne();

                // Get an available task
                Task task = GetTask();

                // Note a task might still be null becaue
                // another thread might have gotten to it first
                while (task != null)
                {
                    // Do the work
                    task();

                    // Get the next task
                    task = GetTask();
                }
            }
        }

        /// <summary>
        /// Distributes work across a number of threads equivalent to the number 
        /// of cores. All tasks will be run on the available cores. 
        /// </summary>
        /// <param name="localTasks"></param>
        public static void DistributeWork(List<Task> localTasks)
        {
            // Create a list of handles indicating what the main thread should wait for
            WaitHandle[] handles = new WaitHandle[localTasks.Count];

            lock (allTasks)
            {
                // Iterate over the list of localTasks, creating a new task that 
                // will signal when it is done.
                for (int i = 0; i < localTasks.Count; ++i)
                {
                    Task t = localTasks[i];

                    // Create an event used to signal that the task is complete
                    ManualResetEvent e = new ManualResetEvent(false);

                    // Create a new signaling task and add it to the list
                    Task signalingTask = () => { t(); e.Set(); };
                    allTasks.Add(signalingTask);

                    // Set the corresponding wait handler 
                    handles[i] = e;
                }
            }

            // Signal to waiting threads that there is work
            ReleaseThreads();

            // Wait until all of the designated work items are completed.
            Semaphore.WaitAll(handles);
        }

        /// <summary>
        /// Indicate to the system that the threads should terminate
        /// and unblock them.
        /// </summary>
        public static void CleanUp()
        {
            shuttingDown = true;
            ReleaseThreads();
        }
    }    
}

cdiggins 2010-02-07 21:02:19

Have you tried benchmarking your implementation against the build-in ThreadPool?

Jan 2010-02-08 12:38:04

Yes, but only in the context of my work. It appears to be a bit faster, but don't trust my benchmarks. I'm too biased.

cdiggins 2010-02-08 14:17:38

Make sure to call .Close() on your ManualResetEvents - http://stackoverflow.com/questions/2234128/do-i-need-to-call-close-on-a-manualresetevent

Kevin Hakanson 2010-02-11 18:44:15

What are these Peek() and Pop() methods you're attempting to call on your List<Task> instance? It would seem to me that Peek() and Pop() are Stack<Task> methods, but then you are also calling Add() on it as well.

Jesse C. Slicer 2010-02-12 00:54:07

Christopher, check your email :)

Jesse C. Slicer 2010-02-12 01:47:32

Answer 3

+2 A:

I would go with thread pool even though it has its problems, MS is investing in improving it and it seems like .NET 4 will have an improved one. At this point, I think that the best thing would be to use the thread pool wrapped in your own object and wait with deciding about your own implementation

Natza Mitzi 2010-02-15 08:20:46

ansaurus

tags:

views:

answers:

How can I most effectively take advantage of multiple cores for short computations in .NET?

related questions