I am maintaining someone else's code that uses multithreading via two methods:

1: ThreadPool.QueueUserWorkItem(New WaitCallback(AddressOf ReadData), objUpdateItem)

2: Dim aThread As New Thread(AddressOf LoadCache)
   aThread.Start()

However, on a dual-core machine I am only getting 50% CPU utilization, and on a dual-core machine with hyper-threading enabled I am only getting 25% CPU utilization.

Obviously threading is extremely complicated, but this behaviour seems to indicate that I am missing some simple, fundamental fact?

UPDATE

The code is too horribly complex to post here unfortunately, but for reference purposes, here is roughly what happens. I have approximately 500 accounts whose data is loaded from the database into an in-memory cache. Each account is loaded individually: the process first calls a long-running stored procedure, and then manipulates and caches the returned data. The point of threading in this situation is that the bottleneck really is the database (a thread can sit idle for up to 30 seconds waiting for its query to return), so we thread to allow other accounts to begin processing the data they have already received from Oracle.

So, the main thread executes:

ThreadPool.QueueUserWorkItem(New WaitCallback(AddressOf ReadData), objUpdateItem)

ReadData() then proceeds to execute (exactly once):

Dim aThread As New Thread(AddressOf LoadCache)
aThread.Start()

This happens inside a recursive function, so QueueUserWorkItem can be executed multiple times, and each call in turn starts exactly one new thread via aThread.Start().
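
Very roughly, and with everything other than the two snippets above being purely illustrative, the structure looks something like this:

Imports System.Threading

Module CacheLoadSketch
    Sub ReadData(ByVal state As Object)
        ' Long-running stored procedure call; the thread can sit idle here for up to 30 seconds.

        ' Exactly one new thread is then started to manipulate and cache the returned data.
        Dim aThread As New Thread(AddressOf LoadCache)
        aThread.Start()

        ' The recursion: further down the call chain the next account is queued again, e.g.
        ' ThreadPool.QueueUserWorkItem(New WaitCallback(AddressOf ReadData), nextUpdateItem)
    End Sub

    Sub LoadCache()
        ' Manipulate the returned data and place it into the in-memory cache.
    End Sub
End Module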

Hopefully that gives a decent idea of how things are happening.

So, under this scenario, should this not theoretically pin both cores, rather than maxing out at 100% on one core, while the other core is essentially idle?

+2  A: 

How many threads are you spinning up? It may seem primitive (wait a few years, and you won't need to do this anymore), but your code has got to figure out an optimal number of threads to start, and spin up that many. Simply running a single thread won't make things any faster, and won't pin a physical processor, though it may be good for other reasons (a worker thread to keep your UI responsive, for instance).

In many cases, you'll want to be running a number of threads equal to the number of logical cores available to you (available from Environment.ProcessorCount, I believe), but it may have some other basis. I've spun up a few dozen threads, talking to different hosts, when I've been bound by remote process latency, for instance.
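
As a rough sketch (DoWork is just a placeholder for whatever processes one worker's share of the accounts), one-thread-per-logical-core looks something like this:

Imports System.Collections.Generic
Imports System.Threading

Module CoreBoundWorkers
    Sub Main()
        ' Start one worker thread per logical core, then wait for them all to finish.
        Dim workerCount As Integer = Environment.ProcessorCount
        Dim workers As New List(Of Thread)

        For i As Integer = 0 To workerCount - 1
            Dim t As New Thread(AddressOf DoWork)
            t.Start(i)              ' pass the worker index as the thread argument
            workers.Add(t)
        Next

        For Each t As Thread In workers
            t.Join()
        Next
    End Sub

    Sub DoWork(ByVal state As Object)
        Dim workerIndex As Integer = CInt(state)
        ' Process this worker's share of the work here.
    End Sub
End Module

(For work that spends most of its time waiting on the database, a thread count higher than ProcessorCount can still make sense, as discussed in the comments.)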

Michael Petrotta
According to Joe Duffy, your code should never figure out an optimal number of threads to start. Instead, use ThreadPool which will make the best choice for you on the machine you are working. Environment.ProcessorCount looks like the right tool, but it is read right out of the environment which can be written to as well as read from.
plinth
Do you have a reference, plinth?
Michael Petrotta
WRT Environment.ProcessorCount, in spite of what the documentation describes, that property uses the kernel32 function GetSystemInfo to discover the number of logical processors.
Michael Petrotta
Reference: PDC, 2008. You might also see his book Concurrent Programming on Windows.
plinth
Hmmm. I understand where he's coming from, and ThreadPool is the right thing to use, but (assuming http://www.bluebytesoftware.com/blog/PermaLink,guid,ca22a5a8-a3c9-4ee8-9b41-667dbd7d2108.aspx is a good summary of his arguments), I think he's oversimplifying things. In particular, he seems to assume that spinning up a number of threads significantly greater than the number of cores present is a bad idea. It's not always. The bottleneck is not always processing horsepower; in fact, it may have nothing at all to do with the local machine.
Michael Petrotta
And in my case, many of my threads are idled, waiting for a response from the database, hence my desire to have multiple threads running concurrently.
tbone
+4  A: 

That code starts one thread that will go and do something. To get more than one core working you need to start more than one thread and keep them both busy. Starting a thread to do some work, and then having your main thread wait for it, won't get the task done any quicker. It is common to start a long-running task on a background thread so that the UI remains responsive, which may be what this code was intended to do, but it won't make the task itself finish any sooner.

@Judah Himango - I had assumed that those two lines of code were samples of how multi-threading was being achieved in two different places in the program. Maybe the OP can clarify whether that is the case or whether these two lines really are in the one method. If they are part of one method then we will need to see what the two methods are actually doing.

Update:
That does sound like it should max out both cores. What do you mean by recursively calling ReadData()? If each new thread only calls ReadData at or near its end, to start the next thread, that could explain the behaviour you are seeing.
I am not sure that this is actually a good idea. If the stored proc takes 30 seconds to get the data, then presumably it is placing a fair load on the database server. Running it 500 times in parallel is just going to make things worse. Obviously I don't know your database or data, but I would look at improving the performance of the stored proc.
If multi-threading does look like the way forward, then I would have a loop on the main thread that calls ThreadPool.QueueUserWorkItem once for each account that needs loading. I would also remove the explicit thread creation and only use the thread pool. That way you are less likely to starve the local machine by creating too many threads.
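
A sketch of that shape might look like this (the account IDs, the stored-procedure call and the cache are placeholders; the counter and event are just one simple way to know when the whole batch is done):

Imports System.Collections.Generic
Imports System.Threading

Module CacheLoader
    Private pending As Integer
    Private allDone As New ManualResetEvent(False)

    Sub LoadAllAccounts(ByVal accountIds As List(Of Integer))
        If accountIds.Count = 0 Then Return

        pending = accountIds.Count
        allDone.Reset()

        ' One work item per account; the thread pool decides how many run at once.
        For Each id As Integer In accountIds
            ThreadPool.QueueUserWorkItem(New WaitCallback(AddressOf LoadAccount), id)
        Next

        allDone.WaitOne()   ' block until every account has been loaded and cached
    End Sub

    Private Sub LoadAccount(ByVal state As Object)
        Dim accountId As Integer = CInt(state)

        ' Call the stored procedure for this account, manipulate the results,
        ' and write them into the in-memory cache (guarding any shared state).

        If Interlocked.Decrement(pending) = 0 Then
            allDone.Set()
        End If
    End Sub
End Module

The final WaitOne is optional; the main point is that all the scheduling goes through the thread pool rather than through explicitly created Thread objects.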

pipTheGeek
The code starts 2 threads, right? One threadpool thread, another user-created thread.
Judah Himango
I added some additional information that *hopefully* illustrates what is happening.
tbone
A: 

The CPU behavior indicates that the application is only using one logical processor: 50% is one logical processor out of two (core + core), and 25% is one out of four (core + HT + core + HT).

Troggy
+1  A: 

Multi-threading and multi-core are two different things. Making something multi-threaded often won't give you an enormous increase in performance; sometimes quite the opposite. The operating system might do a few tricks to spread your CPU cycles over multiple cores, but that's where it ends.

What you are looking for is parallelism. The .NET 4.0 framework will add a lot of new features to support parallelism. Have a sneak peek here:
http://www.danielmoth.com/Blog/2009/01/parallelising-loops-in-net-4.html
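
For instance, once you're on .NET 4 the per-account loop can be written as a parallel loop. This is only a sketch; the account list and the loading work are placeholders:

Imports System.Collections.Generic
Imports System.Threading.Tasks

Module ParallelCacheLoader
    Sub LoadAllAccounts(ByVal accountIds As List(Of Integer))
        ' Parallel.ForEach partitions the accounts across worker threads
        ' and picks the degree of parallelism for you.
        Parallel.ForEach(accountIds,
                         Sub(id)
                             ' Call the stored procedure for this account and cache the result.
                         End Sub)
    End Sub
End Module

If the body spends most of its time waiting on the database rather than using the CPU, you may still want to set ParallelOptions.MaxDegreeOfParallelism explicitly.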

Zyphrax
A: 

How many threads do you have in total, and do you have any locks in LoadCache? A SyncLock can make a multi-threaded system act as a single thread (by design). Also, if you only spool up one thread you will only get one worker thread.
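
For illustration (a contrived sketch, not the OP's code): a lock held around the whole body serializes the workers, while a lock held only around the shared cache keeps them concurrent:

Module CacheLockSketch
    Private cacheLock As New Object()

    ' If the lock covers the whole method, only one thread makes progress at a time,
    ' so extra threads just queue up behind it instead of using the second core.
    Sub LoadCacheSerialized()
        SyncLock cacheLock
            ' Slow database call, data manipulation and cache write, all under the lock.
        End SyncLock
    End Sub

    ' Holding the lock only for the brief cache write lets the slow work overlap.
    Sub LoadCacheConcurrent()
        ' Slow database call and data manipulation happen outside the lock...
        SyncLock cacheLock
            ' ...and only the write into the shared cache is guarded.
        End SyncLock
    End Sub
End Module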

Matthew Whited
A: 

The CPU utilization suggests that you're only using one core; this may mean that you've added threading to a portion of the work where it is not beneficial (in this case, where CPU time is not the bottleneck).

If loading the cache or reading data happens very quickly, multi-threading won't provide a massive improvement in speed. Similarly, if you're hitting a different bottleneck (slow bandwidth to the server, etc.), it may not show up as CPU usage.

CoderTao