More threads, better performance?

views:

568

answers:

+5 Q:

More threads, better performance?

When I write a message driven app. much like a standard windows app only that it extensively uses messaging for internal operations, what would be the best approach regarding to threading?

As I see it, there are basically three approaches (if you have any other setup in mind, please share):

Having a single thread process all of the messages.
Having separate threads for separate message types (General, UI, Networking, etc...)
Having multiple threads that share and process a single message queue.

So, would there be any significant performance differences between the three? Here are some general thoughts: Obviously, the last two options benefit from a situation where there's more than one processor. Plus, if any thread is waiting for an external event, other threads can still process unrelated messages. But ignoring that, seems that multiple threads only add overhead (Thread switches, not to mention more complicated sync situations).

And another question: Would you recommend to implement such a system upon the standard Windows messaging system, or to implement a separate queue mechanism, and why?

+3 A:

We can't tell you much for sure without knowing the workload (ie, the statistical distribution of events over time) but in general

single queue with multiple servers is at least as fast, and usually faster, so 1,3 would be preferable to 2.
multiple threads in most languages add complexity because of the need to avoid contention and multiple-writer problems
long duration processes can block processing for other things that could get done quicker.

So horseback guess is that having a single event queue, with several server threads taking events off the queue, might be a little faster.

Make sure you use a thread-safe data structure for the queue.

Charlie Martin 2009-04-21 14:31:09

A good place to start is to ask yourself why you need multiple threads.

The well-thought-out answer to this question will lead you to the best answer to the subsequent question, "how should I use multiple threads in my application?"

And that must be a subsequent question; not a primary question. The fist question must be why, not how.

John Dibling 2009-04-21 14:32:52

The concern is already expressed - performance. If he wanted threads to get better X, he'd have asked which solution offered the best X.

MSalters 2009-04-21 14:46:31

Performance doing what?

John Dibling 2009-04-21 15:59:23

If you don't know what specifically you hope to improve, then you end up just throwing threads at your application blindly, hoping it will somehow make things faster. But it doesn't work that way, and you can easily make your application run slower and more prone to defects if you don't really have a goal or a problem you are trying to solve.

John Dibling 2009-04-21 16:01:45

I think it depends on how long each thread will be running. Does each message take the same amount of time to process? Or will certain messages take a few seconds for example. If I knew that Message A was going to take 10 seconds to complete I would definitely use a new thread because why would I want to hold up the queue for a long running thread...

My 2 cents.

Peter D 2009-04-21 14:33:29

+1 A:

In general, don't worry about the overhead of threads. It's not going to be an issue if you're talking about merely a handful of them. Race conditions, deadlocks, and contention are a bigger concern, and if you don't know what I'm talking about, you have a lot of reading to do before you tackle this.

I'd go with option 3, using whatever abstractions my language of choice offers.

Adam Jaskiewicz 2009-04-21 14:34:53

+3 A:

It all depends.

For example:

Events in a GUI queue are best done by a single thread as there is an implied order in the events thus they need to be done serially. Which is why most GUI apps have a single thread to handle events, though potentially multiple events to create them (and it does not preclude the event thread from creating a job and handling it off to a worker pool (see below)).
Events on a socket can potentially by done in parallel (assuming HTTP) as each request is stateless and can thus by done independently (OK I know that is over simplifying HTTP).
Work Jobs were each job is independent and placed on queue. This is the classic case of using a set of worker threads. Each thread does a potentially long operation independently of the other threads. On completion comes back to the queue for another job.

Martin York 2009-04-21 14:57:31

I agree, the complexity of a problem solved using the wrong model will probably have more impact on performances than the actual efficiency of the underlying mechanism.

total 2009-04-21 15:18:39

+7 A:

The specific choice of threading model should be driven by the nature of the problem you are trying to solve. There isn't necessarily a single "correct" approach to designing the threading model for such an application. However, if we adopt the following assumptions:

messages arrive frequently
messages are independent and don't rely too heavily on shared resources
it is desirable to respond to an arriving message as quickly as possible
you want the app to scale well across processing architectures (i.e. multicode/multi-cpu systems)
scalability is the key design requirement (e.g. more message at a faster rate)
resilience to thread failure / long operations is desirable

In my experience, the most effective threading architecture would be to employ a thread pool. All messages arrive on a single queue, multiple threads wait on the queue and process messages as they arrive. A thread pool implementation can model all three thread-distribution examples you have.

#1 Single thread processes all messages => thread pool with only one thread

#2 Thread per N message types => thread pool with N threads, each thread peeks at the queue to find appropriate message types

#3 Multiple threads for all messages => thread pool with multiple threads

The benefits of this design is that you can scale the number of threads in the thread in proportion to the processing environment or the message load. The number of threads can even scale at runtime to adapt to the realtime message load being experienced.

There are many good thread pooling libraries available for most platforms, including .NET, C++/STL, Java, etc.

As to your second question, whether to use standard windows message dispatch mechanism. This mechanism comes with significant overhead and is really only intended for pumping messages through an windows application's UI loop. Unless this is the problem you are trying to solve, I would advise against using it as a general message dispatching solution. Furthermore, windows messages carry very little data - it is not an object-based model. Each windows message has a code, and a 32-bit parameter. This may not be enough to base a clean messaging model on. Finally, the windows message queue is not design to handle cases like queue saturation, thread starvation, or message re-queuing; these are cases that often arise in implementing a decent message queing solution.

LBushkin 2009-04-21 15:04:31

+1: "The specific choice of threading model should be driven by the nature of the problem you are trying to solve." Thank you. People seem not to get this, and tend to think that if they add threads the app will go faster.

John Dibling 2009-04-21 16:04:26

Note that there are two different performance goals, and you haven't stated which you are targetting: throughput and responsiveness.

If you're writing a GUI app, the UI needs to be responsive. You don't care how many clicks per second you can process, but you do care about showing some response within a 10th of a second or so (ideally less). This is one of the reasons it's best to have a single thread devoted to handling the GUI (other reasons have been mentioned in other answers). The GUI thread needs to basically convert windows messages into work-items and let your worker queue handle the heavy work. Once the worker is done, it notifies the GUI thread, which then updates the display to reflect any changes. It does things like painting a window, but not rendering the data to be displayed. This gives the app a quick "snapiness" that is what most users want when they talk about performance. They don't care if it takes 15 seconds to do something hard, as long as when they click on a button or a menu, it reacts instantly.

The other performance characteristic is throughput. This is the number of jobs you can process in a specific amount of time. Usually this type of performance tuning is only needed on server type applications, or other heavy-duty processing. This measures how many webpages can be served up in an hour, or how long it takes to render a DVD. For these sort of jobs, you want to have 1 active thread per CPU. Fewer than that, and you're going to be wasting idle clock cycles. More than that, and the threads will be competing for CPU time and tripping over each other. Take a look at the second graph in this article DDJ articles for the trade-off you're dealing with. Note that the ideal thread count is higher than the number of available CPUs due to things like blocking and locking. The key is the number of active threads.

Eclipse 2009-04-21 15:23:11

I think option 2 is the best. Having each thread doing independant tasks would give you best results. 3rd approach can cause more delays if multiple threads are doing some I/O operation like disk reads, reading common sockets and so on.

Whether to use Windows messaging framework for processing requests depends on the work load each thread would have. I think windows restricts the no. of messages that can be queued at the most to 10000. For most of the cases this should not be an issue. But if you have lots of messages to be queued this might be some thing to take into consideration.

Seperate queue gives a better control in a sense that you may reorder it the way you want (may be depending on priority)

Canopus 2009-04-21 15:27:52

Yes, there will be performance differences between your choices.

(1) introduces a bottle-neck for message processing
(3) introduces locking contention because you'll need to synchronize access to your shared queue.

(2) is starting to go in the right direction... though a queue for each message type is a little extreme. I'd probably recommend starting with a queue for each model in your app and adding queues where it makes since to do so for improved performance.

If you like option #2, it sounds like you would be interested in implementing a SEDA architecture. It is going to take some reading to understand what is going on, but I think the architecture fits well with your line of thinking.

BTW, Yield is a good C++/Python hybrid implementation.

ceretullis 2009-04-21 18:07:49

I'd have a thread pool servicing the message queue, and make the number of threads in the pool easily configurable (perhaps even at runtime). Then test it out with expected load.

That way you can see what the actual correlation is - and if your initial assumptions change, you can easily change your approach.

A more sophisticated approach would be for the system to introspect its own performance traits and adapt it's use of resources, threads in particular, as it goes. Probably overkill for most custom application code, but I'm sure there are products that do that out there.

As for the windows events question - I think that's probably an application specific question that there is no right or wrong answer to in the general case. That said, I usually implement my own queue as I can tailor it to the specific characteristics of the task at hand. Sometimes that might involve routing events via the windows message queue.

Phil Nash 2009-06-21 22:20:29

ansaurus

tags:

views:

answers:

More threads, better performance?

related questions