I have an application that is CPU-intensive. When the data is processed on a single thread, CPU usage goes to 100% for many minutes, so the application's performance appears to be CPU-bound. I have multithreaded the application's logic, which resulted in an increase in overall performance. However, CPU usage hardly goes above 30%-50%. I would expect the CPU (and its many cores) to go to 100%, since I process many sets of data at the same time.

Below is a simplified example of the logic I use to start the threads. When I run this example, the CPU goes to 100% (on an 8/16-core machine). However, my application, which uses the same pattern, doesn't.

using System;
using System.Collections.Generic;
using System.Threading;

public class DataExecutionContext
{
    public int Counter { get; set; }

    // Arrays of data
}

static void Main(string[] args)
{
    // Load data from the database into the context
    var contexts = new List<DataExecutionContext>(100);
    for (int i = 0; i < 100; i++)
    {
        contexts.Add(new DataExecutionContext());
    }

    // Data loaded. Start to process.
    var latch = new CountdownEvent(contexts.Count);
    var processData = new Action<DataExecutionContext>(c =>
    {
        // The thread doesn't access data from a DB, file, 
        // network, etc. It reads and write data in RAM only 
        // (in its context).
        for (int i = 0; i < 100000000; i++)
            c.Counter++;
    });

    foreach (var context in contexts)
    {
        processData.BeginInvoke(context, new AsyncCallback(ar =>
        {
            latch.Signal();
        }), null);
    }

    latch.Wait();
}

I have reduced the number of locks to the strict minimum (only the latch locks). The best approach I found was to give each thread a context it can read from and write to in memory. Contexts are not shared between threads, and the threads don't access the database, files or the network. In short, I profiled my application and didn't find any bottleneck.

Why doesn't the CPU usage of my application go above 50%? Is it the pattern I use? Should I create my own threads instead of using the .NET thread pool? Are there any gotchas? Is there any tool you could recommend to help me track down the issue?

Thanks!

+2  A: 

This is speculation without seeing your application, but if it does any processing involving files, databases, lots of object creation (memory allocation), network access, or hardware devices of any sort, those factors might keep it from reaching 100% CPU usage. Thread context switching might also be a factor.

You say your application uses the same pattern as the example you gave, yet the example reaches 100% utilization and your application does not, so there is some difference; try to describe in more detail what your application is doing. That said, 50% utilization is not bad: many applications run at 50% on hyper-threaded Intel CPUs and still perform fine. If the application is not reaching 100% CPU utilization and you are still getting good performance, that is actually a good thing, because it means you have head room now that the application is no longer CPU-bound. When other processes take up CPU time, your application won't be affected as much. At 100% utilization, you would see its performance waver whenever other processes actively use the CPU.

AaronLS
+4  A: 

There are many things that could, potentially, cause this behavior.

First, what type of CPU do you have? If you have an i7 or a similar processor, the OS will see it as having 8 cores when, in reality, it has 4 cores with 2 hyperthreads each. For most workloads, a hyperthread does not provide the same scalability as a real second core, even though the OS counts it as one. I've seen this make overall CPU usage appear lower to the OS...
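
As a quick, hedged aside (not from the original answer): Environment.ProcessorCount reports logical processors, so the OS-visible count includes hyperthreads.

// On a hyperthreaded quad-core i7 this prints 8, not 4.
Console.WriteLine("Logical processors: " + Environment.ProcessorCount);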

Second, it's possible you have some form of true sharing occurring. You mention that you have locking: even if it's kept to a minimum, the locks may be preventing the work from being scheduled effectively.

Also, right now you're scheduling all 100 work items up front, so the OS is going to have to swap those 100 threads in and out. You may want to restrict how many are allowed to run at a given time. This is much easier with the new Task Parallel Library (just use Parallel.ForEach with a ParallelOptions instance that caps the maximum number of threads), but it can be done on your own.
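
A rough sketch of that approach, assuming .NET 4's Parallel.ForEach and reusing the contexts list and per-item work from your example:

using System;
using System.Threading.Tasks;

// Caps concurrency instead of queuing all 100 items at once.
// Parallel.ForEach blocks until every item has been processed,
// so the CountdownEvent latch is no longer needed.
var options = new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount };
Parallel.ForEach(contexts, options, context =>
{
    for (int i = 0; i < 100000000; i++)
        context.Counter++;
});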

Given that you're scheduling all 100 items to run simultaneously, all that switching may be hampering your ability to get maximum throughput.

Also, if you're doing other, more realistic work, you may be running into false sharing, especially if you're working with arrays or collections that are shared (even if the elements you're processing are not).
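
A hypothetical illustration of false sharing (not code from the question): two threads incrementing adjacent array slots contend for the same cache line, while slots padded far enough apart do not.

using System;
using System.Diagnostics;
using System.Threading.Tasks;

class FalseSharingDemo
{
    static void Run(int first, int second, string label)
    {
        var counters = new long[32];
        var sw = Stopwatch.StartNew();
        Parallel.Invoke(
            () => { for (int i = 0; i < 100000000; i++) counters[first]++; },
            () => { for (int i = 0; i < 100000000; i++) counters[second]++; });
        Console.WriteLine("{0}: {1} ms", label, sw.ElapsedMilliseconds);
    }

    static void Main()
    {
        // Slots 0 and 1 are 8 bytes apart: same 64-byte cache line,
        // so every write invalidates the other core's cached copy.
        Run(0, 1, "adjacent (false sharing)");

        // Slots 0 and 16 are 128 bytes apart: separate cache lines.
        Run(0, 16, "padded");
    }
}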

I'd recommend running this under the concurrency profiler in VS 2010 - it will give you a much clearer picture of what is happening.

Reed Copsey
I tried on many machines: Core Duo (usage ~80-90%), i7 (usage ~50%), Dual Xeon L5520 (usage ~40-50%).
Martin
Since you're at 80-90% on the Core Duo, and lower on the others, it sounds like you have either false or true sharing issues. False sharing may be occurring if multiple threads are using data that's packed too closely together in memory; true sharing is due to locking...
Reed Copsey
I found an article on MSDN about false sharing (http://msdn.microsoft.com/en-us/magazine/cc872851.aspx) and it looks like it could be my problem... I'm not sure I entirely understand false sharing yet. How did you figure this out?
Martin
I'd watch Igor Ostrovsky's PDC session: http://igoro.com/archive/video-of-my-plinq-session-at-pdc-2009/ He describes false sharing in detail, with a great example, and talks about how to work around it.
Reed Copsey
A: 

If you are making lots of small memory allocations, the managed heap can become a shared resource: garbage collection blocks the threads, slowing the process down and lowering CPU usage.
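
As a hedged sketch (hypothetical iteration counts and buffer size, not from the answer): an allocation-heavy parallel loop spends time in shared GC work that a loop reusing a per-worker buffer avoids.

using System;
using System.Diagnostics;
using System.Threading.Tasks;

class AllocationDemo
{
    static void Time(string label, Action action)
    {
        var sw = Stopwatch.StartNew();
        action();
        Console.WriteLine("{0}: {1} ms", label, sw.ElapsedMilliseconds);
    }

    static void Main()
    {
        // Every iteration allocates, so all workers funnel garbage
        // into the shared managed heap and trigger collections.
        Time("allocating", () => Parallel.For(0, 8, w =>
        {
            long sum = 0;
            for (int i = 0; i < 5000000; i++)
                sum += new byte[256].Length;
        }));

        // One buffer per worker: the loop stays on the CPU.
        Time("reusing", () => Parallel.For(0, 8, w =>
        {
            var buffer = new byte[256];
            long sum = 0;
            for (int i = 0; i < 5000000; i++)
                sum += buffer.Length;
        }));
    }
}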

Alexander Bartosh