ansaurus

Question

Answer 1

+6 A:

You need to break the problem down to match the number of processors or threads you have. In your case you could have, for example, one processor working on the even terms and another working on the odd terms. Instead of precalculating x^2 and using lastX*(x^2), you use lastX*(x^4) to skip every other term. To use 8 processors, multiply the previous term by x^16 so that each processor advances 8 terms at a time.
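A minimal Python sketch of this striding idea, assuming the series in question is the atan(x) series with terms (-1)^i * x^(2i+1)/(2i+1) counted from i = 0. The function names, the worker count and the number of terms per worker are made up for illustration. Each worker starts at its own term and multiplies its running power by x^(2*num_workers), with the sign folded in, to jump ahead num_workers terms.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def strided_partial_sum(x, worker, num_workers, terms_per_worker):
    """Sum terms i = worker, worker + num_workers, ... of the atan(x) series."""
    # Factor between signed powers that are num_workers terms apart:
    # x^16 for 8 workers, x^4 for 2 workers.
    step = (-1) ** num_workers * x ** (2 * num_workers)
    # Signed power (-1)^i * x^(2i+1) for this worker's first term.
    power = (-1) ** worker * x ** (2 * worker + 1)
    total = 0.0
    i = worker
    for _ in range(terms_per_worker):
        total += power / (2 * i + 1)
        power *= step
        i += num_workers
    return total


def parallel_atan(x, num_workers=8, terms_per_worker=50):
    """Add up the strided partial sums computed by each worker."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = pool.map(
            lambda w: strided_partial_sum(x, w, num_workers, terms_per_worker),
            range(num_workers),
        )
        return sum(partials)
```

Threads are used here only to keep the sketch short. In standard CPython they will not speed up pure-Python arithmetic, so a real attempt would more likely use processes or a compiled language.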

P.S. Most of the time when presented with a problem like this, it's worthwhile to look for a more efficient way of calculating the result. Better algorithms beat more horsepower most of the time.

Mark Ransom 2010-10-07 22:25:05

Answer 2

+1 A:

Well, for this example, you might sum the series whose terms are (if I've got the brackets in the right places):

(-1)^i * (x^(2i + 1))/(2i + 1), for i = 0, 1, 2, ...

Then on processor 1 of 8 compute the sum of the terms for i = 0, 8, 16, 24, ...

Then on processor 2 of 8 compute the sum of the terms for i = 1, 9, 17, 25, ...

and so on, finally adding up the partial sums.

Or, you could do as you (nearly) suggest: give i = 0..15 (say) to processor 1, i = 16..31 to processor 2, and so on, so that each processor can compute each successive power of x from the previous one. If you want more than 8x16 terms of the series, assign more to each processor in the first place.
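The block-wise variant might look roughly like the following Python sketch. It assumes terms are counted from i = 0, and the function names and default sizes are illustrative only. Each worker computes its first power of x directly and then gets every later power from the previous one by multiplying by -x^2.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def block_partial_sum(x, start, stop):
    """Sum terms i = start .. stop-1 of the atan(x) series."""
    neg_x2 = -x * x
    # Signed power (-1)^i * x^(2i+1) for the first term of the block.
    power = (-1) ** start * x ** (2 * start + 1)
    total = 0.0
    for i in range(start, stop):
        total += power / (2 * i + 1)
        power *= neg_x2  # next signed power from the previous one
    return total


def blocked_atan(x, num_workers=8, block_size=16):
    """Give each worker one contiguous block of terms, then add the partial sums."""
    blocks = [(w * block_size, (w + 1) * block_size) for w in range(num_workers)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = [pool.submit(block_partial_sum, x, lo, hi) for lo, hi in blocks]
        return sum(f.result() for f in futures)
```

For |x| well below 1 the later blocks contribute almost nothing, which is one way to see why parallelising this particular series may not pay off.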

I doubt whether, for this example, it is worth parallelising at all. I suspect that you will get to double-precision accuracy on 1 processor while the parallel threads are still waking up. But that's just a guess for this example, and you can probably find many series for which parallelisation is worth the effort.

And, as @Mark Ransom has already said, a better algorithm ought to beat brute-force and a lot of processors every time.

High Performance Mark 2010-10-07 22:39:42

Answer 3

+2 A:

If you're trying to calculate the value of pi to millions of places or something, you first want to pay close attention to choosing a series that converges quickly and is amenable to parallelization. Then, if you have enough digits, it will eventually become cost-effective to split them across multiple processors; you will have to find or write a bignum library that can do this.

Note that you can factor out the variables in various ways; e.g.:

atan(x)= x - x^3/3 + x^5/5 - x^7/7 + x^9/9 ...
       = x*(1 - x^2*(1/3 - x^2*(1/5 - x^2*(1/7 - x^2*(1/9 ...

Although the second line is more efficient than a naive implementation of the first, it still has a linear chain of dependencies from beginning to end. You can improve your parallelism by combining terms in pairs:

       = x*(1-x^2/3) + x^5*(1/5-x^2/7) + x^9*(1/9 ...
       = x*( (1-x^2/3) + x^4*((1/5-x^2/7) + x^4*(1/9 ...
       = [yet more recursive computation...]
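As a rough check of the pairing idea, here is a small Python sketch in ordinary floating point (not bignum arithmetic) that evaluates a truncated series both ways. It assumes pair k combines terms 2k and 2k+1, i.e. x^(4k+1) * (1/(4k+1) - x^2/(4k+3)), so the outer nesting runs in powers of x^4. The function names are illustrative.

```python
import math


def atan_horner(x, n_terms):
    """Evaluate x*(1 - x^2*(1/3 - x^2*(1/5 - ...))) from the inside out."""
    x2 = x * x
    acc = 0.0
    for i in reversed(range(n_terms)):
        acc = 1.0 / (2 * i + 1) - x2 * acc  # each step depends on the previous one
    return x * acc


def atan_paired(x, n_pairs):
    """Evaluate x*(p0 + x^4*(p1 + x^4*(p2 + ...))) with p_k = 1/(4k+1) - x^2/(4k+3)."""
    x2 = x * x
    x4 = x2 * x2
    # The pair values do not depend on one another, so they could be computed in parallel.
    pairs = [1.0 / (4 * k + 1) - x2 / (4 * k + 3) for k in range(n_pairs)]
    acc = 0.0
    for p in reversed(pairs):
        acc = p + x4 * acc  # dependency chain is half as long as in atan_horner
    return x * acc
```

The chain of dependent multiply-adds in atan_paired has n_pairs steps instead of 2 * n_pairs, and the same pairing can be applied again to shorten it further.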

However, this speedup is not as simple as you might think, since the time taken by each computation depends on the precision needed to hold it, and you need to take this into account when designing your algorithm. Your algebra is also intimately involved: for the above case, you'll get infinitely repeating fractions if you do regular divisions by your constant numbers, so you need to figure out some way to deal with that, one way or another.
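One possible way to sidestep the repeating fractions, shown only as an illustration and not necessarily what a real bignum implementation would do, is to keep everything as exact rationals and divide once at the end. The sketch below uses Python's standard `fractions.Fraction`. The function name is made up, and x is assumed to be rational.

```python
from fractions import Fraction


def atan_partial_exact(x, n_terms):
    """Exact sum of the first n_terms of the atan(x) series for rational x."""
    x = Fraction(x)
    neg_x2 = -x * x
    power = x  # signed power (-1)^i * x^(2i+1), starting at i = 0
    total = Fraction(0)
    for i in range(n_terms):
        total += power / (2 * i + 1)  # exact rational division, no rounding
        power *= neg_x2
    return total
```

The numerators and denominators grow with every term, which illustrates the point that the cost of each step depends on the precision needed to hold it.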

comingstorm 2010-10-07 23:14:15

parallel calculation of infinite series