I want to optimize the number of threads in my application. Almost all of them do I/O in addition to CPU work, in roughly equal amounts. What is the most efficient number of threads when no other applications are running on the system? I want the answer for Windows and the JVM.

+10  A: 

There is no per-OS answer. This will depend on the specific set of tasks that your code is performing. You should benchmark your application with different configurations to see which one is the most performant.

Some general tips about multithreading:

  • You can't speed up tasks of the same kind just by adding threads. The exception is that with multiple CPUs you can parallelize compute tasks with one thread per CPU, provided the logic can be split up so that it does not need to be executed serially. A good example is a divide-and-conquer problem like mergesort, where the two halves can be sorted in any order.

  • You can achieve some speedup by parallelizing tasks that do not make use of the same part of the machine. So, given that you say you have "equal value" of I/O and compute tasks, you will want to separate those into different threads - again, this assumes that ordering is not important.
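As a rough illustration of the mergesort case from the first tip, here is a minimal sketch using the JDK's fork/join framework. The class name ParallelMergeSort, the THRESHOLD value, and the choice of one worker per available processor are assumptions made for this example, not tuned recommendations.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

public class ParallelMergeSort {

    // Below this size, sorting sequentially is assumed to be cheaper than forking.
    static final int THRESHOLD = 1_000;

    static class SortTask extends RecursiveAction {
        private final int[] data;
        private final int from; // inclusive
        private final int to;   // exclusive

        SortTask(int[] data, int from, int to) {
            this.data = data;
            this.from = from;
            this.to = to;
        }

        @Override
        protected void compute() {
            if (to - from <= THRESHOLD) {
                Arrays.sort(data, from, to);
                return;
            }
            int mid = (from + to) >>> 1;
            // The two halves are independent, so they can be sorted in any order.
            invokeAll(new SortTask(data, from, mid), new SortTask(data, mid, to));
            merge(data, from, mid, to);
        }
    }

    // Merges the sorted ranges [from, mid) and [mid, to) in place,
    // using a temporary copy of the left range.
    static void merge(int[] data, int from, int mid, int to) {
        int[] left = Arrays.copyOfRange(data, from, mid);
        int i = 0, j = mid, k = from;
        while (i < left.length && j < to) {
            data[k++] = (left[i] <= data[j]) ? left[i++] : data[j++];
        }
        while (i < left.length) {
            data[k++] = left[i++];
        }
    }

    public static void sort(int[] data) {
        // One worker per CPU, as suggested for compute-bound work.
        ForkJoinPool pool = new ForkJoinPool(Runtime.getRuntime().availableProcessors());
        try {
            pool.invoke(new SortTask(data, 0, data.length));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] data = {5, 3, 9, 1, 7};
        sort(data);
        System.out.println(Arrays.toString(data));
    }
}
```

Whether this beats a plain Arrays.sort depends on the input size and the machine, so it is worth measuring.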

If, as in many applications, your threads perform some compute logic followed by some I/O (writing the data to disk or a database server, for example), it will be very difficult to come up with a formula for the exact number of threads you should have. That number depends heavily on the data you are processing, how you process it, and what you do with it when processing is done. This is why the best thing to do is to have a configurable thread pool whose size can be adjusted easily, then run some load tests with different sizes and see which one performs best.
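A minimal sketch of that kind of experiment, assuming a made-up workload: mixedTask is a hypothetical stand-in for your real task (a short CPU loop followed by a sleep that simulates I/O), and the pool sizes tried in main are arbitrary starting points.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PoolSizeBenchmark {

    // Hypothetical task: some CPU work, then simulated I/O via sleep.
    public static long mixedTask(int cpuIterations, long ioMillis) {
        long acc = 0;
        for (int i = 0; i < cpuIterations; i++) {
            acc += ((long) i * i) % 7;
        }
        try {
            Thread.sleep(ioMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return acc;
    }

    // Runs taskCount tasks on a fixed pool of the given size and
    // returns the elapsed wall-clock time in milliseconds.
    public static long timeRun(int poolSize, int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                tasks.add(() -> mixedTask(200_000, 5));
            }
            long start = System.nanoTime();
            for (Future<Long> f : pool.invokeAll(tasks)) {
                f.get(); // surface any task failure
            }
            return (System.nanoTime() - start) / 1_000_000;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        for (int size : new int[] {1, cpus, 2 * cpus, 4 * cpus}) {
            System.out.println("pool size " + size + ": " + timeRun(size, 200) + " ms");
        }
    }
}
```

A single timed run like this is noisy (JIT warm-up, GC, other processes), so for real decisions you would repeat each configuration several times and use your actual workload.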

danben
Any statistics?
Shayan
Yes, you have to provide those statistics. It'll be unique to your application.
Andrew Coleson
+6  A: 

The Java Concurrency in Practice book gives a rough formula for sizing a thread pool to keep your CPUs pegged at a certain utilization:

N = number of CPUs

U = target CPU utilization, 0 <= U <= 1

W / C = ratio of wait time to compute time

The optimal pool size (number of threads) for keeping processors at the desired utilization is:

PoolSize = N * U * (1 + (W/C))

This is just for CPU utilization though.

You can get the available processors with Runtime.getRuntime().availableProcessors()
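As a worked example of the formula, here is a small sketch. The class and method names (PoolSizing, poolSize) are made up for illustration, and rounding to the nearest whole thread with a minimum of 1 is my own choice, not part of the book's formula.

```java
public class PoolSizing {

    // PoolSize = N * U * (1 + W/C), rounded to a whole number of threads.
    public static int poolSize(int cpus, double targetUtilization, double waitToComputeRatio) {
        if (cpus < 1 || targetUtilization < 0 || targetUtilization > 1 || waitToComputeRatio < 0) {
            throw new IllegalArgumentException("arguments out of range");
        }
        long size = Math.round(cpus * targetUtilization * (1 + waitToComputeRatio));
        return (int) Math.max(1, size);
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        // Equal wait and compute time, as in the question, means W/C = 1.
        System.out.println("suggested pool size: " + poolSize(cpus, 1.0, 1.0));
    }
}
```

With equal I/O and compute time (W/C = 1) and a target utilization of 1, the formula gives twice the number of CPUs, for example 8 threads on a 4-CPU machine.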

Kevin
+1 this rule of thumb is probably the best you can do without even knowing the requirements
stacker
+2  A: 

I don't think there's a definitive answer to this. I would just suggest trying your application with different numbers of threads and seeing which performs best. One place to start would be one more thread than the number of hardware threads your processor supports, e.g. if you have a dual-core processor with one thread per core, use 3 threads.

Poindexter
A: 

There is really no universal answer to this. The number of threads you spawn is a function of how many tasks you are doing, how they communicate, and how you design your application. I have had extremely large apps with only one thread that ran fine. On the other hand, I have also had small applications that needed multiple threads for performance's sake.

(Sorry for any spelling/formatting issues, I typed this off of my phone)

Luke Cycon
+1  A: 

I found that the best way to deal with this is not to use threads directly but to use the Executor framework. You can experiment with the different configurations but I found I liked CallerRunsPolicy.
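For anyone who hasn't used it, here is a minimal sketch of a bounded ThreadPoolExecutor with CallerRunsPolicy. The class name, the helper names (newBoundedPool, runAll), and the sizes used in main are illustrative assumptions. When both the worker threads and the queue are full, the submitting thread runs the task itself, which slows down submission instead of rejecting work.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class CallerRunsExample {

    // Fixed number of threads, bounded queue, caller-runs on saturation.
    public static ThreadPoolExecutor newBoundedPool(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads,
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    // Submits taskCount trivial tasks and returns how many completed.
    public static int runAll(int threads, int queueCapacity, int taskCount) {
        ThreadPoolExecutor pool = newBoundedPool(threads, queueCapacity);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            // If the pool is saturated, this call runs the task on the current thread.
            pool.execute(completed::incrementAndGet);
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("pool did not terminate in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("completed: " + runAll(2, 4, 100));
    }
}
```

Because the pool size and queue capacity are plain parameters here, it is easy to try different values while load testing.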

+1  A: 

Performance is far from the only reason to use threads.

Basically, any multi-thread program can be simulated with a single more complex thread, so what the threads are actually doing is simplifying your code, not necessarily making it faster.

That said, if your app could make use of multiple cores or multiple disk heads functioning at the same time, then threads could make it easy to exploit that. In that case, you probably don't need any more threads than you have separate cores or heads, because process switching has a definite cost.

Mike Dunlavey