ansaurus

Question

Problem using pthread to utilize multiple cores

Answer 1

+1 A:

I'd take a wild stab in the dark and say your worker threads are spending lots of time waiting on the condition variable. To get good CPU performance in a situation like this, where your code is mostly CPU bound, the usual approach is a task-oriented style of programming: treat the threads as a "pool" and use a queue structure to feed work to them. They should spend very little time pulling work off the queue and most of their time doing the actual work.

What you have right now is a situation where they are probably doing work for a while, then notifying the main thread via the semaphore that they are done. The main thread will not release them until both threads have finished working on the frame they are currently processing.
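The pool-and-queue style described above could look roughly like the sketch below. It is only an illustration, not the asker's code: `WorkQueue`, `render_row` and `render_frame` are hypothetical names, and treating one image row as one task is an assumption. Each worker holds the mutex only long enough to pop a row index, then does the actual work outside the lock.

```cpp
#include <pthread.h>
#include <cassert>
#include <queue>
#include <vector>

// Hypothetical shared state: a queue of row indices waiting to be rendered.
struct WorkQueue {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    std::queue<int> rows;    // pending tasks (one task = one row)
    bool closed;             // true once no more tasks will be added
    std::vector<int>* out;   // one result slot per row
};

// Stand-in for the real per-row rendering work (assumption).
static int render_row(int y) { return y * y; }

static void* worker(void* arg) {
    WorkQueue* q = static_cast<WorkQueue*>(arg);
    for (;;) {
        pthread_mutex_lock(&q->mutex);
        // Sleep only while there is nothing to do and more work may come.
        while (q->rows.empty() && !q->closed)
            pthread_cond_wait(&q->cond, &q->mutex);
        if (q->rows.empty()) {               // closed and fully drained
            pthread_mutex_unlock(&q->mutex);
            return nullptr;
        }
        int y = q->rows.front();
        q->rows.pop();
        pthread_mutex_unlock(&q->mutex);
        // The expensive part runs without the lock; each row has its own slot.
        (*q->out)[y] = render_row(y);
    }
}

std::vector<int> render_frame(int height, int num_threads) {
    std::vector<int> result(height, 0);
    WorkQueue q;
    pthread_mutex_init(&q.mutex, nullptr);
    pthread_cond_init(&q.cond, nullptr);
    q.closed = false;
    q.out = &result;

    std::vector<pthread_t> threads(num_threads);
    for (int i = 0; i < num_threads; ++i) {
        int rc = pthread_create(&threads[i], nullptr, worker, &q);
        assert(rc == 0);
        (void)rc;
    }

    // Feed all rows, then mark the queue closed and wake every worker.
    pthread_mutex_lock(&q.mutex);
    for (int y = 0; y < height; ++y) q.rows.push(y);
    q.closed = true;
    pthread_cond_broadcast(&q.cond);
    pthread_mutex_unlock(&q.mutex);

    for (pthread_t& t : threads) pthread_join(t, nullptr);
    pthread_cond_destroy(&q.cond);
    pthread_mutex_destroy(&q.mutex);
    return result;
}
```

Because a fast thread simply takes the next row from the queue, neither thread has to sit idle waiting for the other to finish its share of the frame.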

Since you are using C++, have you considered using Boost.Threads? It makes working with multithreaded code much easier, and the API is actually kind of similar to pthreads, but in a "modern C++" kind of way.

1800 INFORMATION 2009-04-07 09:24:32

Answer 2

+1 A:

I'm no pthreads guru, but it seems to me that the following code is wrong:

pthread_mutex_lock(&frame_mutex);
pthread_cond_wait(&frame_cond, &frame_mutex);
pthread_mutex_unlock(&frame_mutex);

To quote this article:

pthread_cond_wait() blocks the calling thread until the specified condition is signalled. This routine should be called while mutex is locked, and it will automatically release the mutex while it waits. After signal is received and thread is awakened, mutex will be automatically locked for use by the thread. The programmer is then responsible for unlocking mutex when the thread is finished with it.

So it seems to me that you should be releasing the mutex after the block of code following the pthread_cond_wait.

anon 2009-04-07 09:31:35

Answer 3

+2 A:

This is useless:

pthread_mutex_lock(&frame_mutex);
pthread_cond_wait(&frame_cond, &frame_mutex);
pthread_mutex_unlock(&frame_mutex);

If you want to wait for a new frame, do something like:

int new_frame = 0;

First thread:

pthread_mutex_lock(&mutex); 
new_frame = 1; 
pthread_cond_signal(&cond);
pthread_mutex_unlock(&mutex);

Other thread:

pthread_mutex_lock(&mutex); 
while(new_frame == 0)
  pthread_cond_wait(&cond, &mutex); 
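/* Here new_frame != 0: do things with the frame, then clear the flag
   so that the next wait really blocks until another frame arrives */
new_frame = 0;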
pthread_mutex_unlock(&mutex);

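pthread_cond_wait() actually releases the mutex and deschedules the thread until the condition is signaled. When the condition is signaled, the thread is woken up and the mutex is re-acquired. All of this happens inside the pthread_cond_wait() function. The while loop on new_frame matters because a signal sent while nobody is waiting is not remembered, and POSIX allows pthread_cond_wait() to wake up spuriously.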
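To see the two fragments above working together over several frames, here is a minimal self-contained sketch. The names `FrameSlot`, `consumer` and `run_handoff` are hypothetical, and the integer `frame_id` merely stands in for real frame data. It assumes a single producer and a single consumer sharing one slot, so the producer also waits (with the same predicate-loop pattern) until the previous frame has been taken.

```cpp
#include <pthread.h>
#include <cassert>
#include <vector>

// Hypothetical one-slot handoff between a producer and a consumer.
struct FrameSlot {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int new_frame;           // 1 while a frame is waiting to be consumed
    int frame_id;            // stand-in for the frame data (assumption)
    int frames_total;        // how many frames the consumer should expect
    std::vector<int> seen;   // frames in the order the consumer saw them
};

static void* consumer(void* arg) {
    FrameSlot* s = static_cast<FrameSlot*>(arg);
    for (int i = 0; i < s->frames_total; ++i) {
        pthread_mutex_lock(&s->mutex);
        while (s->new_frame == 0)              // re-check after every wakeup
            pthread_cond_wait(&s->cond, &s->mutex);
        s->seen.push_back(s->frame_id);        // "do things with the frame"
        s->new_frame = 0;                      // slot is free again
        pthread_cond_broadcast(&s->cond);      // let the producer continue
        pthread_mutex_unlock(&s->mutex);
    }
    return nullptr;
}

std::vector<int> run_handoff(int frames) {
    FrameSlot s;
    pthread_mutex_init(&s.mutex, nullptr);
    pthread_cond_init(&s.cond, nullptr);
    s.new_frame = 0;
    s.frame_id = -1;
    s.frames_total = frames;

    pthread_t t;
    int rc = pthread_create(&t, nullptr, consumer, &s);
    assert(rc == 0);
    (void)rc;

    for (int id = 0; id < frames; ++id) {
        pthread_mutex_lock(&s.mutex);
        while (s.new_frame != 0)               // previous frame not taken yet
            pthread_cond_wait(&s.cond, &s.mutex);
        s.frame_id = id;
        s.new_frame = 1;
        pthread_cond_broadcast(&s.cond);       // wake the consumer
        pthread_mutex_unlock(&s.mutex);
    }

    pthread_join(t, nullptr);
    pthread_cond_destroy(&s.cond);
    pthread_mutex_destroy(&s.mutex);
    return s.seen;
}
```

Both sides share one condition variable here for brevity; with more than two threads, separate "slot full" and "slot empty" condition variables would probably be clearer.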

Ben 2009-04-07 09:36:29

This did help. I also discovered that rendering every second line, instead of half the image, made the two threads render in almost the same time... So I did eventually manage to drive both cores to 100%, but it didn't improve my frame rate :) - Or I'm just measuring it wrong... Thanks for the help...

jopsen 2009-04-09 09:52:32
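The difference between the two row splits mentioned in this comment can be sketched as follows. The helper names `block_rows` and `interleaved_rows` are hypothetical, and the sketch only shows which rows each thread would get, not the rendering itself. With interleaving, expensive and cheap regions of the image tend to be shared more evenly between threads.

```cpp
#include <cassert>
#include <vector>

// Block split: thread `tid` gets one contiguous band of rows
// (for two threads, the top half or the bottom half of the image).
std::vector<int> block_rows(int height, int num_threads, int tid) {
    assert(num_threads > 0 && tid >= 0 && tid < num_threads);
    std::vector<int> rows;
    int begin = height * tid / num_threads;
    int end   = height * (tid + 1) / num_threads;
    for (int y = begin; y < end; ++y) rows.push_back(y);
    return rows;
}

// Interleaved split: thread `tid` gets rows tid, tid + n, tid + 2n, ...
// (for two threads, every second line).
std::vector<int> interleaved_rows(int height, int num_threads, int tid) {
    assert(num_threads > 0 && tid >= 0 && tid < num_threads);
    std::vector<int> rows;
    for (int y = tid; y < height; y += num_threads) rows.push_back(y);
    return rows;
}
```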

Haha, the first "optimization" step is always to make the parallel algorithm on n processors as efficient as the sequential one was on a single processor. Keep trying, you will eventually get an improvement.

Ben 2009-04-09 10:06:54
