I'm working on a program that processes many requests, none of which pushes CPU usage above 50% (I'm currently on a dual-core machine). I created a thread for each request, and the whole process got faster: processing 9 requests takes 02min08s on a single thread, while with 3 threads working simultaneously the time drops to 01min37s. But it still doesn't use 100% of the CPU, only around 50%.

How can I let my program use the processors' full capacity?

EDIT: The application isn't IO- or memory-bound; both stay at reasonable levels the whole time.

I think it has something to do with the 'dual core' thing.

There is a locked method invocation that every request uses, but it is really fast; I don't think this is the problem.

The most CPU-costly part of my code is a call into a DLL via COM (the same external method is called from all threads). This DLL isn't memory- or IO-bound either; it is an AI recognition component. I'm doing OCR on paychecks, one paycheck per request.

EDIT2

It is very probable that the STA COM method is my problem. I contacted the component owners in order to solve it. Thank you all ;)

+12  A: 

It is probably no longer the processor that is the bottleneck for completing your process. The bottleneck has likely moved to disk access, network access or memory access. You could also have a situation where your threads are competing for locks.

Only you know exactly what your threads are doing, so you need to look at them with the above in mind.

Rob Prouse
+1 for lock contention.
Zan Lynx
it could be a reason, I'll check it, thanks ;)
Victor Rodrigues
+3  A: 

It depends on what your program does. The work carried out by your concurrent requests could be IO-bound (limited by the speed of, e.g., your hard disk) rather than CPU-bound; only in the CPU-bound case would you see your CPU hit 100%.

After the edit, it does sound like COM STA objects might be the culprit.

Do all threads call the same instance of the COM object? Would it be possible to make your worker threads STA threads and create a separate instance of the COM object on each thread? That way it might be possible to avoid the STA bottleneck.

To tell if a COM coclass is STA:

class Test
{
  static void Main() //This will be an MTA thread by default
  {
    var o = new COMObjectClass();
    // Did a new thread pop into existence when that line was executed?
    // If so, .NET created an STA thread for it to live in.
  }
}
mackenir
Yes confuzation, they're all calling the same instance, I'll try creating an instance per thread, thanks.
Victor Rodrigues
I tried loading an instance for each thread; it resulted in an IO-bound situation.
Victor Rodrigues
Before this change, it took around 2min to run, after it, more than 3min.
Victor Rodrigues
Depends on what that COM object does, I suppose.
mackenir
A: 

It sounds like your application's performance may not be 'bound' by the amount of cpu resources available. If you're processing requests over the network, the cpu(s) may be waiting for the data to arrive, or for the network device to transfer the data. Alternatively, if you need to look up data to fulfill the request, the cpu may be waiting for the disk.

Dana the Sane
A: 

Are you sure that your tasks require intensive processor activity? Is there any IO processing? This can be the reason for your 50% load.

Test: Try using only 2 threads and set the affinity of each thread to a different core. Then open Task Manager and watch the load on both cores.

bruno conde
How can I see each thread is linked to each core?
Victor Rodrigues
There's very little IO processing, just a few KB.
Victor Rodrigues
I guess I was wrong :( There is no managed code to do this, and the unmanaged code I found seems to have problems. Sorry.
bruno conde
+3  A: 

What are these requests actually doing? Have you used a profiler to see where they're spending their time?

Jon Skeet
A: 

This isn't an answer really, but have you checked perfmon to see what resources it is using and have you run profilers on the code to see where it is spending time?

How have you determined that IO or other non CPU resources are not the bottleneck?

Can you give a brief description of what the threads are doing?

Tim
+21  A: 

Do you have significant locking within your application? If the threads are waiting for each other a lot, that could easily explain it.

Other than that (and the other answers given), it's very hard to guess, really. A profiler is your friend...

EDIT: Okay, given the comments below, I think we're onto something:

The more cpu-costly part of my code is the call of a dll via COM (the same external method is called from all threads).

Is the COM method running in an STA by any chance? If so, it'll only use one thread, serializing calls. I strongly suspect that's the key to it. It's similar to having a lock around that method call (not quite the same, admittedly).
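To make the "similar to having a lock" comparison concrete, here is a minimal sketch that has nothing to do with COM itself. It runs the same CPU-burning work on two threads, once freely and once behind a shared lock. `BusyWork` and `Gate` are hypothetical stand-ins for the expensive call and for the serialization an STA would impose.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class SerializedCallDemo
{
    // Hypothetical stand-in for the serialization an STA imposes on callers.
    static readonly object Gate = new object();

    // Hypothetical stand-in for the expensive call: spins for at least `ms` milliseconds.
    static void BusyWork(int ms)
    {
        var sw = Stopwatch.StartNew();
        while (sw.ElapsedMilliseconds < ms) { }
    }

    // Runs `threadCount` threads that each do `ms` of work, optionally
    // behind one shared lock, and returns the total wall-clock time in ms.
    public static long Run(int threadCount, int ms, bool serialize)
    {
        var threads = new Thread[threadCount];
        var total = Stopwatch.StartNew();
        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(() =>
            {
                if (serialize) { lock (Gate) { BusyWork(ms); } }
                else { BusyWork(ms); }
            });
            threads[i].Start();
        }
        foreach (var t in threads) t.Join();
        return total.ElapsedMilliseconds;
    }

    static void Main()
    {
        Console.WriteLine("parallel:   {0} ms", Run(2, 500, false));
        Console.WriteLine("serialized: {0} ms", Run(2, 500, true));
    }
}
```

On a dual core, the serialized run should take roughly twice as long as the parallel one while keeping only one core busy at a time, which would look like the ~50% CPU figure described in the question.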

Jon Skeet
There is a locked method invocation that every request uses, but it is really fast; I don't think this is the problem.
Victor Rodrigues
The more cpu-costly part of my code is the call of a dll via COM (the same external method is called from all threads). This dll is also no Memory or IO-bounded.
Victor Rodrigues
I thought all COM was STA?
Rob Prouse
No, you can have COM objects that can be called from multiple threads (MTA).
mackenir
I second the STA diagnosis. It sounds very likely.
Stu Mackellar
Unfortunately I don't know, Jon. In fact, this is the first time I've had to access a non-.NET DLL from a .NET project via COM. How could I check it or change it?
Victor Rodrigues
I don't rightly know how you'd check it, to be honest - try the properties in explorer to start with. As for changing it - you can't; if it's been designed as STA, it may be unsafe to change it. You'd have to ask the original authors.
Jon Skeet
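One way to check, sketched here under some assumptions: in-process COM servers declare their threading model in the registry, as the `ThreadingModel` value under `HKEY_CLASSES_ROOT\CLSID\{clsid}\InprocServer32`. The documented values are `Apartment`, `Free`, `Both` and `Neutral`; a missing value means the legacy single-threaded model. The CLSID below is a placeholder, not the real component's, and the sketch assumes a Windows machine. On 64-bit Windows a 32-bit component is registered in the 32-bit registry view, so the check would probably need to run as a 32-bit process.

```csharp
using System;
using Microsoft.Win32;

static class ThreadingModelCheck
{
    // Maps the raw registry value to a readable description.
    public static string Describe(string threadingModel)
    {
        if (string.IsNullOrEmpty(threadingModel))
            return "none (legacy single-threaded: all calls go to the main STA)";
        switch (threadingModel.ToLowerInvariant())
        {
            case "apartment": return "Apartment (STA)";
            case "free": return "Free (MTA)";
            case "both": return "Both (STA or MTA)";
            case "neutral": return "Neutral (NTA)";
            default: return "unrecognized: " + threadingModel;
        }
    }

    static void Main()
    {
        // Placeholder CLSID (assumption): substitute the component's own.
        const string clsid = "{00000000-0000-0000-0000-000000000000}";
        using (RegistryKey key = Registry.ClassesRoot.OpenSubKey(@"CLSID\" + clsid + @"\InprocServer32"))
        {
            if (key == null)
            {
                Console.WriteLine("CLSID is not registered as an in-process server.");
                return;
            }
            Console.WriteLine(Describe(key.GetValue("ThreadingModel") as string));
        }
    }
}
```

The same value can be read by hand in regedit. As noted above, changing it yourself is not a fix: the value only advertises what the component's authors designed it to tolerate.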
A: 

If your process is running on CPU 0 and spawning threads there, the maximum it will ever reach is 50%. See if you have threads running on both cores or on just one. I would venture to guess you're isolated to a single core, or that one of your dependent resources is locked to a single core. If it hits exactly 50%, then a single core is very likely to be your bottleneck.
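A quick way to rule out the process being pinned to one core is to read its affinity mask. This is a minimal sketch using `Process.ProcessorAffinity`; the helper `CountAllowedCores` is made up for the illustration. It only reports which cores the process is allowed to use, not where each thread actually ran.

```csharp
using System;
using System.Diagnostics;

static class AffinityCheck
{
    // Counts the set bits in an affinity mask, i.e. the cores the process may run on.
    public static int CountAllowedCores(long mask)
    {
        int count = 0;
        for (ulong m = (ulong)mask; m != 0; m >>= 1)
            if ((m & 1) != 0) count++;
        return count;
    }

    static void Main()
    {
        using (Process self = Process.GetCurrentProcess())
        {
            long mask = self.ProcessorAffinity.ToInt64();
            Console.WriteLine("Logical processors: {0}", Environment.ProcessorCount);
            Console.WriteLine("Affinity mask: 0x{0:X}, cores allowed: {1}", mask, CountAllowedCores(mask));
        }
    }
}
```

If the number of allowed cores matches the number of logical processors, affinity is not what is holding the process at 50%.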

Chris Ballance
I previously had some code on the main thread of this project that made it take ~100% of the CPU; it was a loop stuck in a while-true condition. Of course I fixed it, because it was wrong and wasting resources, but it showed me the project can run at 100%.
Victor Rodrigues
+12  A: 

The problem is the COM object.

Most COM objects run in the context of a 'single-threaded apartment'. (You may have seen a [STAThread] attribute on the Main method of a .NET application from time to time?)

Effectively this means that all dispatches to that object are handled by a single thread. Throwing more cores at the problem just gives you more resources that can sit around and wait or do other things in .NET.

You might want to take a look at this article from Joe Duffy (the head parallel .NET guy at Microsoft) on the topic.

http://www.bluebytesoftware.com/blog/PermaLink,guid,8c2fed10-75b2-416b-aabc-c18ce8fe2ed4.aspx

In practice if you have to do a bunch of things against a single COM object like this you are hosed, because .NET will just serialize access patterns internally behind your back. If you can create multiple COM objects and use them then you can resolve the issue because each can be created and accessed from a distinct STA thread. This will work until you hit about 100 STA threads, then things will go wonky. For details, see the article.

Edward Kmett
This is one of the joyous things I found with some of the older PDF libraries.
StingyJack
It's also the reason why you don't dare invoke the various Excel.Application or Office Web Components on the web server. All of a sudden, as you cross 100 threads, they start to flip out and share globals and destroy one another from the wrong thread, etc.
Edward Kmett
A: 

So you solved the problem of using a single COM object and now have an IO problem.

The increased run time for multiple threads is probably because of mixing random IO together, which will slow it all down.

If the data set will fit into RAM, try to see if you can prefetch it into cache. Perhaps just reading the data, or maybe memory mapping it together with a command to make it available.

This is why SQL databases will often choose sequential table scan over an index scan on queries you wouldn't expect: it can be much faster to read all of it in order than to read it in random chunks.

Zan Lynx
A: 

Maybe I'm misunderstanding something, but you said none of your requests (each in a separate thread) reaches 100% CPU.

What operating system are you using?

I seem to vaguely recall that in old versions of Windows (e.g., early XP and 2000), CPU utilization was computed against the total of both processors, so a single thread couldn't make it past 50% unless it was the idle process.

Uri
I am using Windows XP SP2. But I could actually reach 100% when I had a while-true situation on the 'main' thread. It is very likely the COM STA thing is my problem; I contacted the component owners ;)
Victor Rodrigues
A: 

One more note: have you tried launching your code outside Visual Studio (regardless of release/debug settings)?

Tomas Pajonk
I ran it in Debug mode in VS05, and also ran its binaries from Explorer.
Victor Rodrigues
A: 

Hey guys,

The problem is the COM object. It is STA, and I can't even have two instances running concurrently in the same process: when I create a new instance of the COM class, the existing one becomes unusable.

I've contacted the component developers; they're thinking about what they can do for me.

Thank you all ;)

Victor Rodrigues
A: 

I think I had a similar problem. I was creating multiple threads in C# that ran C++ code through a COM interface. My dual-core CPU never reached 100%.

After reading this post, I almost gave up. Then I tried calling SetApartmentState(ApartmentState.STA) on my Threads.

After only changing this, the CPU maxed out.
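For anyone who wants to try the same thing, a minimal sketch of that change might look like the following. It assumes Windows, a component that tolerates one instance per thread (the asker's did not), and a made-up ProgID, `Vendor.OcrEngine`. The important details are that `SetApartmentState` has to be called before `Start()`, and that each thread creates and uses its own instance.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading;

static class StaWorkers
{
    // Starts `work` on a new thread whose COM apartment is STA.
    // The apartment state must be set before the thread is started.
    public static Thread StartSta(ThreadStart work)
    {
        var t = new Thread(work);
        t.SetApartmentState(ApartmentState.STA);
        t.Start();
        return t;
    }

    static void Main()
    {
        // Hypothetical ProgID (assumption): replace with the real component's.
        Type comType = Type.GetTypeFromProgID("Vendor.OcrEngine");
        if (comType == null)
        {
            Console.WriteLine("Component is not registered on this machine.");
            return;
        }

        var workers = new Thread[2]; // one worker per core on a dual-core machine
        for (int i = 0; i < workers.Length; i++)
        {
            workers[i] = StartSta(() =>
            {
                // Each STA thread creates its own instance, so calls made here
                // are not marshaled to another thread's apartment.
                object engine = Activator.CreateInstance(comType);
                try
                {
                    // Call the component's methods here (via dynamic or an interop type).
                }
                finally
                {
                    Marshal.ReleaseComObject(engine);
                }
            });
        }
        foreach (var t in workers) t.Join();
    }
}
```

Whether this actually helps depends on the component: if it keeps shared global state, or only supports one instance per process as reported earlier in this thread, separate STA threads will not make it run in parallel.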