views:

59

answers:

3

How do you figure out whether it's worth parallelizing a particular code block based on its code size? Is the following calculation correct?

Assume:

  • Thread pool consisting of one thread per CPU.
  • CPU-bound code block with execution time of X milliseconds.
  • Y = min(number of CPUs, number of concurrent requests)

Therefore:

  • Cost: code complexity, potential bugs
  • Benefit: (X * Y) milliseconds

My conclusion is that it isn't worth parallelizing for small values of X or Y, where "small" depends on how responsive your requests must be.

+4  A: 

One thing that will help you figure that out is Amdahl's Law

The speedup of a program using multiple processors in parallel computing is limited by the time needed for the sequential fraction of the program. For example, if a program needs 20 hours using a single processor core, and a particular portion of 1 hour cannot be parallelized, while the remaining promising portion of 19 hours (95%) can be parallelized, then regardless of how many processors we devote to a parallelized execution of this program, the minimal execution time cannot be less than that critical 1 hour. Hence the speed up is limited up to 20x.

Figure out what you want to achieve in speed up, and how much parallelism you can actually achieve, then see if its worth it.

yx
+1  A: 

It depends on many factors, as the difficulty of parallelize the code, the speedup obtained from it (there are overhead costs on dividing the problem and joining the results) and the amount of time that the code is spending there (Amdahl's Law)

Artur Soler
A: 

Well, the benefit is really more:

(X * (Y-1)) * Tc * Pf

Where Tc is the cost of the threading framework you are using. No threading framework scales perfectly, so using 2x threads will likely be, at best, 1.9x speed.

Pf is some factor for parallization that depends completely on the algorithm (ie: whether or not you'll need to lock, which will slow the process down).

Also, it's Y-1, since single threaded is basically assuming Y==1.

As for deciding, it's also a matter of user frustration/expectation (if they user is annoyed at waiting for something, it'd have a greater benefit than a task that the user doesn't really mind - which is not always just due to wait times, etc - it's partly expectations).

Reed Copsey

related questions