Is there an algorithm that checks whether creating a new thread pays off performance-wise? I'll set a maximum number of threads that can be created anyway, but if I add just one task, starting a new thread for it wouldn't be an advantage. The programming language I use is Python.

Edit 1#

Can this question even be answered, or is it too general because it depends on what the threads work on?

A: 

Testing will tell you.

Basically: try it, and benchmark.
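A minimal sketch of what such a benchmark could look like. The task `fake_io_task` is a hypothetical stand-in (it just sleeps to imitate waiting on I/O), and the helper names are made up for this example; swap in your real workload before trusting any numbers.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_task(n):
    """Hypothetical stand-in for a real task; replace with your own workload."""
    time.sleep(0.01)  # imitates waiting on I/O
    return n * 2


def time_sequential(task, items):
    """Run task on every item in the calling thread; return (results, seconds)."""
    start = time.perf_counter()
    results = [task(item) for item in items]
    return results, time.perf_counter() - start


def time_threaded(task, items, max_workers):
    """Run task on every item in a thread pool; return (results, seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(task, items))  # map keeps the input order
    return results, time.perf_counter() - start


if __name__ == "__main__":
    items = list(range(20))
    _, seq = time_sequential(fake_io_task, items)
    _, thr = time_threaded(fake_io_task, items, max_workers=4)
    print(f"sequential: {seq:.3f}s, threaded: {thr:.3f}s")
```

Run it a few times and with realistic input sizes; a single measurement can be misleading.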

EKS
+3  A: 

Rule of thumb: if a thread is going to do input/output, it may be worth separating it.

If it's doing number-crunching, then the optimum number of threads is the number of CPUs.
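As a rough illustration of that rule of thumb, something like the following could pick a worker count. The function name and the `io_multiplier` default are assumptions made up for this sketch, not established values; tune them by measuring.

```python
import os


def suggested_worker_count(io_bound, io_multiplier=4):
    """Guess a worker count from the rule of thumb above.

    io_multiplier is an arbitrary assumption for this sketch, not a
    recommended constant.
    """
    cores = os.cpu_count() or 1  # os.cpu_count() can return None
    if io_bound:
        # I/O-bound workers spend most of their time waiting, so more
        # workers than cores can still be useful.
        return cores * io_multiplier
    # CPU-bound work: one worker per core.
    return cores
```

For example, `suggested_worker_count(io_bound=False)` returns the core count reported by the OS.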

yu_sha
A: 

There is no general answer for this ... yet. But there is a trend. Since computers are getting more and more CPU cores (try to buy a new single-CPU PC ...), using threads is becoming the de facto standard.

So if you have some work that can be parallelized, then by all means use the threading module and create a pool of workers. That won't make your code much slower in the worst case (just one thread), but it can make it much faster if a user has a more powerful PC.

In the worst case, your program will complete less than 100 ms later, so people won't notice the slowdown, but they will notice the speedup with 8 cores.
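For what it's worth, a pool of workers built on the threading module might look roughly like this. `run_in_pool` and its parameters are names invented for this sketch, and it assumes `items` is a sequence and that `task` does not raise.

```python
import queue
import threading


def run_in_pool(task, items, num_workers=4):
    """Run task(item) for every item on a fixed pool of worker threads.

    Sketch only: if task raises, that worker dies and the affected
    result slot stays None.
    """
    work = queue.Queue()
    results = [None] * len(items)

    def worker():
        while True:
            entry = work.get()
            if entry is None:  # sentinel: no more work for this worker
                break
            index, item = entry
            results[index] = task(item)  # each index is written only once

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for entry in enumerate(items):
        work.put(entry)
    for _ in threads:
        work.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return results
```

With `num_workers=1` this degrades to the single-thread worst case mentioned above; `concurrent.futures.ThreadPoolExecutor` from the standard library offers the same idea ready-made.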

Aaron Digulla
+4  A: 

Python (at least standard CPython) is a special case because it won't run more than one thread of Python code at a time. Therefore, if you are doing number-crunching on multiple cores, pure Python isn't really the best choice.

In CPython, only one thread executes Python code at any given moment; the interpreter is protected by the Global Interpreter Lock (GIL). If you're doing I/O, sleeping, or waiting, on the other hand, then Python threads make sense, because the GIL is released during those operations.

If you are number-crunching, then you probably want to do that in a C extension anyway. Failing that, the multiprocessing library provides a way for pure Python code to take advantage of multiple cores.
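A small sketch of the multiprocessing route, assuming a toy CPU-bound task. `count_primes` and `count_primes_parallel` are hypothetical names for this example; only `multiprocessing.Pool` and its `map` method come from the standard library.

```python
import multiprocessing


def count_primes(limit):
    """CPU-bound toy task: count the primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_primes_parallel(limits, processes=None):
    """Run count_primes for each limit in a pool of worker processes."""
    # processes=None lets Pool pick a default based on the CPU count.
    with multiprocessing.Pool(processes=processes) as pool:
        return pool.map(count_primes, limits)
```

Call `count_primes_parallel` from under an `if __name__ == "__main__":` guard, since on platforms that spawn worker processes the main module is imported again in each worker. The task function also has to live at module level so it can be pickled.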

In the general, non-Python case, the question can't be answered, because it depends on:

  1. Will running tasks on a new thread be faster at all?
  2. What is the cost of starting a new thread?
  3. What sort of work do the tasks contain? (IO-bound, CPU-bound, network-bound, user-bound)
  4. How efficient is the OS at scheduling threads?
  5. How much shared data/locking do the tasks need?
  6. What dependencies exist between tasks?
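Point 2 in the list above is at least easy to measure. The following is a crude sketch, not an established algorithm: it times how long starting and joining an empty thread takes, then compares that with an estimate of the task's run time. Both function names and the `factor` threshold are assumptions invented for this example.

```python
import threading
import time


def average_thread_overhead(samples=200):
    """Average seconds needed to start and join a thread that does nothing."""
    start = time.perf_counter()
    for _ in range(samples):
        t = threading.Thread(target=lambda: None)
        t.start()
        t.join()
    return (time.perf_counter() - start) / samples


def worth_a_thread(estimated_task_seconds, factor=10):
    """Hypothetical heuristic: only use a thread if the task is expected
    to run at least `factor` times longer than the thread overhead.
    The factor of 10 is an arbitrary assumption.
    """
    return estimated_task_seconds > factor * average_thread_overhead()
```

This ignores the other points in the list (kind of work, locking, dependencies), so treat the result as a hint to benchmark further rather than as a decision.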

If your tasks are independent and CPU-bound, then running one per CPU core is probably best, but in Python you'll need multiple processes to take advantage of the extra cores.

Douglas Leeder