I have a multithreaded application. Each module is executed in a separate thread. Modules are:

- network module - used to receive/send data from network
- parser module - encode/decode network data to internal presentation
- 2 application modules - perform some application logic on the above data, one after the other
- counter module - used to gather statistics from other modules
- timer module - used to schedule timers
- and much more ...

All threads use message queues for inter-thread communication (a std::deque synchronized by a condition variable and a mutex).
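For illustration, a queue of this kind might look like the minimal sketch below, assuming C++11. The name `MessageQueue` and its members are hypothetical and are not taken from the actual application.

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>
#include <utility>

// Hypothetical name: a blocking FIFO queue shared between two module threads.
template <typename T>
class MessageQueue {
public:
    // Called by the producing module.
    void push(T msg) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back(std::move(msg));
        }
        // Notify after releasing the lock so the woken thread can take it at once.
        ready_.notify_one();
    }

    // Called by the consuming module; blocks until a message is available.
    T pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        // The predicate guards against spurious wakeups.
        ready_.wait(lock, [this] { return !queue_.empty(); });
        T msg = std::move(queue_.front());
        queue_.pop_front();
        return msg;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<T> queue_;
};
```

With a queue like this, every hop between modules costs at least one lock acquisition and possibly one thread wakeup, which is where the context-switch concern comes from.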

Some modules are used by others (e.g. all modules use the timer and the counter), and this happens for every message received from the network, which must be handled at very high rates.

This is a pretty complex application, and the design looks "reasonable". On the other hand, I'm not sure that such a design, thread per module, is the "best" one. In particular, I'm afraid that it "encourages" a lot of context switches.

What do you think?

Are there any good guidelines or open source projects to learn from on how to do a "correct" design of a threaded application?

+1  A: 

A good guideline is to put operations that might block (such as I/O) in their own thread. Your network module is a definite candidate here. Have your network thread use select (I assume UNIX here) to block on input.
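A rough sketch of what blocking on input with select could look like, assuming a POSIX system. The helper name `wait_and_read`, the timeout parameter, and the buffer size are illustrative assumptions, and the descriptor must be smaller than FD_SETSIZE.

```cpp
#include <sys/select.h>
#include <unistd.h>
#include <string>

// Hypothetical helper: blocks until fd is readable or the timeout expires,
// then reads whatever is available.
// Returns an empty string on timeout, end of file, or error.
std::string wait_and_read(int fd, int timeout_ms) {
    fd_set read_fds;
    FD_ZERO(&read_fds);
    FD_SET(fd, &read_fds);

    timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    // select() puts the calling thread to sleep until there is input,
    // so the network thread uses no CPU while it waits.
    int ready = select(fd + 1, &read_fds, nullptr, nullptr, &tv);
    if (ready <= 0 || !FD_ISSET(fd, &read_fds)) {
        return std::string();
    }

    char buf[4096];
    ssize_t n = read(fd, buf, sizeof buf);
    return n > 0 ? std::string(buf, buf + n) : std::string();
}
```

A network thread would call something like this in a loop and hand each chunk to the next stage. A real server would watch several descriptors in one select call.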

Asynchronous events are good in separate threads as well. Your timer module looks like a good candidate here.

You might want to put your other modules in one thread to decrease complexity of your application. BUT, you might want to split them up if you have a multi-processor system.

Have a good strategy for locking resources and mutex handling to prevent deadlocks. A dependency graph (using a whiteboard!) might help here to get your design correct.
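One common building block for such a strategy, sketched here assuming C++11: when a piece of code needs two mutexes at once, acquire them together with `std::lock` instead of one after the other. The `Resource` struct and `move_value` function are made-up names for illustration only.

```cpp
#include <mutex>

// Hypothetical shared resource guarded by its own mutex.
struct Resource {
    std::mutex mutex;
    int value = 0;
};

// Moves `amount` from one resource to another while holding both locks.
void move_value(Resource& from, Resource& to, int amount) {
    if (&from == &to) return;  // locking the same mutex twice is undefined behavior

    // std::lock acquires both mutexes using a deadlock-avoidance algorithm,
    // so two threads calling this with swapped arguments cannot deadlock.
    std::lock(from.mutex, to.mutex);
    std::lock_guard<std::mutex> hold_from(from.mutex, std::adopt_lock);
    std::lock_guard<std::mutex> hold_to(to.mutex, std::adopt_lock);

    from.value -= amount;
    to.value += amount;
}
```

The alternative is to fix a global lock order (which is what the dependency graph gives you) and always acquire in that order.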

Good luck! Sounds like a complex system which will cause many hours of fun development!

Starkey
+13  A: 

Thread-per-function designs are just naive: they assume that separating tasks, by module, onto threads will achieve some kind of scalability.

This kind of design is inefficient, as very few task breakdowns yield exactly as many tasks as there are CPUs.

Far more rational designs are to break tasks down into 'jobs' - and then use thread pooling mechanisms to dispatch those jobs. Advantages over the thread-per-module approach:

  • Thread pools take advantage of all cores. With thread-per-module, if you have fewer modules than cores, you have cores sitting idle.

  • Thread pools minimize contention and resource use by maintaining parity between active threads and cores. With thread-per-module, if you have more modules than cores, you incur needless extra context switches, and (on some platforms) each thread consumes other limited per-process resources (like virtual memory).

  • Thread pools let a "module" do multiple jobs at a time. Thread-per-module means that the busiest module still only gets one core.
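To make the idea concrete, here is a minimal sketch of a job-dispatching pool, assuming C++11. `ThreadPool`, `submit`, and the other names are hypothetical, and a production pool would also need error handling and a way to return results.

```cpp
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal pool: a fixed set of workers draining one shared job queue.
class ThreadPool {
public:
    explicit ThreadPool(std::size_t workers) {
        if (workers == 0) workers = 1;  // hardware_concurrency() may return 0
        for (std::size_t i = 0; i < workers; ++i) {
            threads_.emplace_back([this] { run(); });
        }
    }

    // Lets the workers finish all queued jobs, then joins them.
    ~ThreadPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_all();
        for (std::thread& t : threads_) t.join();
    }

    ThreadPool(const ThreadPool&) = delete;
    ThreadPool& operator=(const ThreadPool&) = delete;

    // Any module can submit a job; any idle worker may pick it up.
    void submit(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push_back(std::move(job));
        }
        wake_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // only reached once stopping_ is set
                job = std::move(jobs_.front());
                jobs_.pop_front();
            }
            job();  // run outside the lock so workers do not serialize each other
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::deque<std::function<void()>> jobs_;
    bool stopping_ = false;
    std::vector<std::thread> threads_;
};
```

Sizing it with `ThreadPool pool(std::thread::hardware_concurrency());` keeps the worker count matched to the core count, and the parser or application "modules" become functions that submit jobs rather than threads of their own.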

Chris Becke
for thread pool vs. thread per module comparison
dimba
+1  A: 

I wouldn't call myself an expert on multi-threaded design. But I've at least worked with threads enough to have run into various issues trying to design them to work together (communication, locking resources, waiting for threads to end, etc).

At this point, my general rule of thumb is that I must justify the existence of each new thread. For example, if the network layer I'm using provides both a synchronous and an asynchronous API, can I really justify making the network code use synchronous calls in a new thread instead of just using the asynchronous calls in the main thread? In your case, how many modules actually need a thread of their own for a specific reason? Are there any that could instead just be called in turn from the main thread?

If some threads have no good reason for existing, then you might be able to save yourself some trouble and complexity by just putting that module in the main thread.

Now of course, there are good, justifiable reasons for putting things in threads, such as making synchronous calls that may block for a long time, keeping a GUI thread responsive while performing a long task, or being able to take advantage of parallel processing of a large task on a multi-core system.

I don't know of any particular "correct" way to do it. A lot of it really comes down to the details of what your application is actually supposed to do.

TheUndeadFish
A: 

For what platform?

For instance, for Win32 applications the best model for back-end servers (which yours seems to be) is the thread pool and I/O Completion Port. This is not just hearsay and opinion; there are strong facts behind this claim. Rick Vicik of the Windows Performance team has posted a series of articles describing in greater detail why high-end servers need to follow this model; see High Performance Windows Programs.

There are other factors that come into play, for instance the type of protocol your network module has to handle. Request-response protocols are often handled by a one-thread-per-request model and they do well enough, but high-throughput, high-scale protocols don't fare well in that model, specifically because of boxcarring requirements.

Ultimately, whether your design is sound or not is hard to tell just from this brief description. Personally I tend to favor an I/O-completion-driven threading model, as opposed to a logical-module-driven one, but that's just me.

Remus Rusanu
A: 

Just to add to the other answers, let's reason about every single thread in your design:

  • network module

Accepted.

  • parser module + 2 application modules

Are you sure that these 3 threads can't be merged into one main data processing thread? If they can, you could then benefit from a thread pool, as others suggested, and have this processing performed by N threads.

  • timer module

This one is probably reasonable on most platforms, as you will need a message processing loop to dispatch timer events. Also, if you ever need a GUI, that could be the place for it.

  • counter module

This is the one that annoys me most. I can't find a reason for having a separate thread for this. Depending on how often the counters are incremented, it will be a nice bottleneck for the application.

I'd suggest keeping separate counters in each thread and polling for them (via the message queue) when you need them.
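A sketch of the per-thread counter idea, assuming C++11. Note that it swaps in a different collection mechanism: instead of requesting the values over a message queue, the reader sums relaxed atomics directly. `ShardedCounter`, `CounterSlot`, and the 64-byte cache line size are assumptions made for illustration.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// One slot per thread, padded to 64 bytes so that two counters never
// share a cache line (64 is an assumption about the target CPU).
struct CounterSlot {
    std::atomic<std::uint64_t> value{0};
    char pad[64 - sizeof(std::atomic<std::uint64_t>)];
};

// Hypothetical name: each worker increments only its own slot.
class ShardedCounter {
public:
    explicit ShardedCounter(std::size_t threads) : slots_(threads) {}

    // Hot path: uncontended, because only thread `id` writes slot `id`.
    void add(std::size_t id, std::uint64_t n = 1) {
        slots_[id].value.fetch_add(n, std::memory_order_relaxed);
    }

    // Cold path: called only when statistics are actually requested.
    std::uint64_t total() const {
        std::uint64_t sum = 0;
        for (const CounterSlot& s : slots_) {
            sum += s.value.load(std::memory_order_relaxed);
        }
        return sum;
    }

private:
    std::vector<CounterSlot> slots_;
};
```

The total is only approximately consistent while workers are still running, which is usually fine for statistics.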

  • and much more ...

Hope not!

asr