ansaurus

Question

Does this multithreaded program perform better than the non-multithreaded one?

Answer 1

+6 A:

This would depend on how many CPUs you have. With a single CPU core, a computation-bound program will never run faster with multiple threads.

Moreover, since you're doing all the work with the lock held, you'll end up with only a single thread running at any time, so it's effectively single threaded anyway.

bdonlan 2009-05-17 20:42:38

You type faster than me =( I agree.

Imaskar 2009-05-17 20:49:14

55% of the accepted answer is this one and the other 45% is ebo's. :)

2009-05-17 21:04:59

Answer 2

A:

I would measure how long those calculations take. If they account for only a very small fraction of the time, I would probably go for a single-process model, since spawning a thread for each calculation involves some overhead by itself.

Konstantinos 2009-05-17 20:42:54

Answer 3

+1 A:

As your code is serialised by a mutex in the actual calculation, it will be slower than a non-threaded version. Of course, you could easily have tested this for yourself.

anon 2009-05-17 20:43:02

How? This is one of my questions.

2009-05-17 20:44:50

Create a version without all the threading code and compare performance.

anon 2009-05-17 20:46:21

Answer 4

+9 A:

You are acquiring the mutex before the calculations. You should acquire it only immediately before adding the local sum to the total:

pthread_mutex_lock(&mtx);
total_sum += local_sum;
pthread_mutex_unlock(&mtx);
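
For context, here is a minimal sketch of how that fragment might sit inside a worker. The question's original code is not reproduced in this thread, so `struct range`, the two-thread split, and `sum_squares_two_threads` are assumptions made for illustration; `mtx`, `total_sum`, `local_sum` and `do_calc` follow the names used in the answers.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static long long total_sum = 0;

/* Hypothetical helper type: an inclusive range of integers to square. */
struct range { int first, last; };

static void *do_calc(void *arg)
{
    const struct range *r = arg;
    long long local_sum = 0;

    /* The expensive part runs without the lock, so threads can overlap. */
    for (int i = r->first; i <= r->last; i++)
        local_sum += (long long)i * i;

    /* Only the update of the shared total is serialised. */
    pthread_mutex_lock(&mtx);
    total_sum += local_sum;
    pthread_mutex_unlock(&mtx);
    return NULL;
}

/* Sum of squares 1..n, split across two threads (assumed split). */
static long long sum_squares_two_threads(int n)
{
    struct range halves[2] = { { 1, n / 2 }, { n / 2 + 1, n } };
    pthread_t tid[2];

    total_sum = 0;
    for (int t = 0; t < 2; t++) {
        int rc = pthread_create(&tid[t], NULL, do_calc, &halves[t]);
        assert(rc == 0);
        (void)rc;
    }
    for (int t = 0; t < 2; t++)
        pthread_join(tid[t], NULL);
    return total_sum;
}
```

Each thread now takes the lock exactly once, so the time spent holding it no longer grows with n.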

ebo 2009-05-17 20:49:09

Nice catch, thanks!

2009-05-17 21:03:26

If you want bonus points, there are atomic instructions you can use to avoid the lock (they are usually cheaper than taking a mutex) - see http://stackoverflow.com/questions/680097/ive-heard-i-isnt-thread-safe-is-i-thread-safe/680114
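
One way to try the atomic approach is C11's `<stdatomic.h>`, which postdates this thread, so treat the following as a sketch under that assumption and not as what the commenter had in mind. The names `total_sum_atomic`, `do_calc_atomic`, `sum_squares_atomic` and the fixed `NTHREADS` are made up for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define NTHREADS 4               /* assumed thread count for the sketch */

static atomic_llong total_sum_atomic;   /* static storage, starts at 0 */

/* Hypothetical helper type: an inclusive range of integers to square. */
struct range { int first, last; };

static void *do_calc_atomic(void *arg)
{
    const struct range *r = arg;
    long long local_sum = 0;

    for (int i = r->first; i <= r->last; i++)
        local_sum += (long long)i * i;

    /* A single atomic read-modify-write replaces lock/add/unlock. */
    atomic_fetch_add(&total_sum_atomic, local_sum);
    return NULL;
}

static long long sum_squares_atomic(int n)
{
    pthread_t tid[NTHREADS];
    struct range part[NTHREADS];
    int chunk = n / NTHREADS;

    atomic_store(&total_sum_atomic, 0);
    for (int t = 0; t < NTHREADS; t++) {
        part[t].first = t * chunk + 1;
        /* The last thread also takes the remainder. */
        part[t].last = (t == NTHREADS - 1) ? n : (t + 1) * chunk;
        int rc = pthread_create(&tid[t], NULL, do_calc_atomic, &part[t]);
        assert(rc == 0);
        (void)rc;
    }
    for (int t = 0; t < NTHREADS; t++)
        pthread_join(tid[t], NULL);
    return atomic_load(&total_sum_atomic);
}
```

With only one shared update per thread, the saving over a mutex is likely to be too small to measure; the atomic version matters more when the shared value is updated often.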

Tom Leys 2009-05-17 22:51:26

Tom Leys 2009-05-18 04:16:02

Answer 5

A:

To compare performance, record the system time at program start, run it with n=1000, and record the system time at the end. Compare that with the result for the non-threaded program. As bdonlan said, the non-threaded version will run faster.
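
A rough sketch of such a measurement, assuming a POSIX system with `clock_gettime`. The helpers `now_seconds`, `time_runs` and the plain `sum_squares` function are hypothetical names for this example; pass whichever version (threaded or not) you want to time.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* ask for clock_gettime under strict -std */
#endif
#include <assert.h>
#include <time.h>

/* Wall-clock time in seconds from a clock that never jumps backwards. */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Non-threaded reference version. */
static long long sum_squares(int n)
{
    long long total = 0;
    for (int i = 1; i <= n; i++)
        total += (long long)i * i;
    return total;
}

/* Run fn(n) `runs` times; return the mean seconds per run. */
static double time_runs(long long (*fn)(int), int n, int runs,
                        long long *result)
{
    volatile long long sink = 0;  /* discourages the compiler from
                                     dropping the repeated calls */
    assert(runs > 0);

    double start = now_seconds();
    for (int k = 0; k < runs; k++)
        sink = fn(n);
    double elapsed = now_seconds() - start;

    if (result)
        *result = sink;
    return elapsed / runs;
}
```

For example, `time_runs(sum_squares, n, 100, &r)` gives the mean time over 100 runs. Timing both versions with the same n and the same number of runs keeps the comparison fair.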

Imaskar 2009-05-17 20:51:24

1000 will be far too low to get any reasonable results, even if the program were correct.

ebo 2009-05-17 20:52:56

oh no, n=1000 will overflow integer... n=50

Imaskar 2009-05-17 20:53:16

That may be another problem. I was considering the performance viewpoint.

ebo 2009-05-17 20:54:59

Yeah, low n gives poor statistics, but high n overflows the integer. It's better to use at least a long type, find the biggest working 'n', and run the test ~100 times. I'm not sure about caching; it may pollute the results =(

Imaskar 2009-05-17 20:56:28

Answer 6

A:

1) Single threaded would probably perform a bit better than this, because all calculations are done within a lock and the overhead of locking will add to the total time. You are better off only locking when adding the local sums to the total sum, or storing the local sums in an array and calculating the total sum in the main thread.

2) Use timing statements in your code to measure elapsed time during the algorithm. In the multithreaded case, only measure elapsed time on the main thread.

3) Derived from your code:

int i, total_sum = 0;
for (i = 0; i < n; i++)
  total_sum += SQR(i + 1);

Renze de Waal 2009-05-17 20:57:06

3 is a bad idea because then all threads are constantly writing to a shared value. Much better to only write once at the end of the thread once Local_sum is calculated.

Tom Leys 2009-05-17 22:47:39

To elaborate slightly, every time a processor writes to a variable, all other processors have to remove that variable from their local cache and re-read it later. This is very, very expensive.

Tom Leys 2009-05-17 22:48:16

@Tom: 3 answers the third question: what would be the program without using threads. Multithreading is not an issue in 3.

Renze de Waal 2009-05-18 18:20:38

Answer 7

A:

A much larger consideration is scheduling. The easiest way to implement kernel-side threading is for each thread to get equal time regardless. Processes are just threads with their own memory space. If all threads get equal time, adding a thread takes you from 1/n of the time to 2/(n + 1) of the time, which is obviously better given > 0 other threads that aren't yours.

Actual implementations may and do vary wildly though.
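
To spell out the 1/n versus 2/(n + 1) comparison, taking n as the total number of equally scheduled threads (yours included) before you add one:

```latex
\frac{2}{n+1} > \frac{1}{n}
\iff 2n > n + 1
\iff n > 1
```

So the share grows exactly when at least one other thread exists. For example, with n = 4 your share goes from 1/4 = 25% to 2/5 = 40%. This only says how much CPU time the program is given under the equal-time assumption; it does not help if the threads spend that time waiting on a lock.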

Alex Gartrell 2009-05-17 22:29:40

Answer 8

+3 A:

Don't bother with threading etc. In fact, don't do any additions in a loop at all. Just use this formula:

∑_{r=1}^{n} r^2 = (1/6) n (n + 1)(2n + 1) [1]

[1]http://thesaurus.maths.org/mmkb/entry.html?action=entryById&id=1539
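
As a sketch, the formula translates to C like this. The function names are made up for the example, and `long long` is an assumption chosen to postpone overflow; the loop version is included only to check the formula.

```c
#include <assert.h>

/* Closed form: n(n + 1)(2n + 1) / 6. The product is always divisible
   by 6, so the integer division is exact. */
static long long sum_squares_closed(long long n)
{
    assert(n >= 0);
    return n * (n + 1) * (2 * n + 1) / 6;
}

/* Straightforward loop, used here only as a cross-check. */
static long long sum_squares_loop(long long n)
{
    long long total = 0;
    for (long long i = 1; i <= n; i++)
        total += i * i;
    return total;
}
```

The intermediate product is about six times the final result, so with a 64-bit `long long` it overflows once n gets somewhat above a million, well before the sum itself stops fitting.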

Ben Schwehn 2009-05-18 01:14:18

Answer 9

A:

A bit off-topic, but you could avoid the mutex by having each thread write its result into an array element: allocate "results = calloc(p, sizeof(int))" (by the way, "p" is an awful name for the variable holding the number of threads), have each thread set results[thr] = local_sum, and have the joining thread (that is, main()) sum the results. Each thread is then responsible only for calculating its own total, and only main(), which orchestrates the threads, combines their data. Separation of concerns.

For extra credit (:p), use the arg passed to do_calc() as a way to pass the thread ID and the location to write the result to rather than relying on a global array.
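
A possible shape for this, as a sketch only: `struct task`, `sum_squares_threads` and the way the range is split are assumptions, since the question's code is not shown in this thread. Each thread gets a pointer to its own task through the `arg` parameter and writes its result there, so no mutex and no global array are needed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-thread record: input range plus a slot for the result. */
struct task {
    int first, last;      /* inclusive range to process */
    long long result;     /* written by the worker, read after join */
};

static void *do_calc(void *arg)
{
    struct task *t = arg;
    long long local_sum = 0;

    for (int i = t->first; i <= t->last; i++)
        local_sum += (long long)i * i;

    t->result = local_sum;    /* private slot, so no locking */
    return NULL;
}

/* Returns the sum of squares 1..n, or -1 on allocation/creation failure. */
static long long sum_squares_threads(int n, int nthreads)
{
    assert(nthreads > 0);
    pthread_t *tid = calloc((size_t)nthreads, sizeof *tid);
    struct task *tasks = calloc((size_t)nthreads, sizeof *tasks);
    long long total = 0;
    int started = 0, failed = (tid == NULL || tasks == NULL);
    int chunk = n / nthreads;

    for (int t = 0; !failed && t < nthreads; t++) {
        tasks[t].first = t * chunk + 1;
        tasks[t].last = (t == nthreads - 1) ? n : (t + 1) * chunk;
        if (pthread_create(&tid[t], NULL, do_calc, &tasks[t]) != 0)
            failed = 1;
        else
            started++;
    }
    /* pthread_join makes each worker's write visible to this thread. */
    for (int t = 0; t < started; t++) {
        pthread_join(tid[t], NULL);
        total += tasks[t].result;
    }
    free(tid);
    free(tasks);
    return failed ? -1 : total;
}
```

Only the thread that joins the workers ever reads the results, which is the separation of concerns described above.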

araqnid 2009-05-18 01:31:43
