Hello, I'm writing a simple thread pool for some small jobs (100 to 700 microseconds each). I'm working with only two threads (because there are only two jobs and the processor has only two cores). My problem is that most of the time both jobs are executed by the same thread. The problem does not occur with bigger jobs (a few milliseconds).

The expected behaviour would be the following (in this case the speedup is as expected):

  • Thread 1 after cond_wait
  • Job executed by: 1
  • Thread 0 after cond_wait
  • Job executed by: 0
  • Thread 1 before cond_wait
  • Thread 0 before cond_wait

But sometimes (about 50% of the time) I get this (is the other thread, blocked on the mutex inside the condition wait, not notified?):

  • Thread 1 after cond_wait
  • Job executed by: 1
  • Job executed by: 1
  • Thread 0 after cond_wait
  • Thread 0 before cond_wait
  • Thread 1 before cond_wait

Or even worse (is the signal for the other thread lost?):

  • Thread 0 after cond_wait
  • Job executed by: 0
  • Job executed by: 0
  • Thread 0 before cond_wait

This is the main loop executed by both threads (created with pthread_create):

pthread_mutex_lock(&pl->mutex);
for (;;) {
    /* wait on notification that a new job is available */
    while (pl->queue_head==NULL) {
        //printf("Thread %d before cond_wait\n",threadID);
        pthread_cond_wait(&pl->workcv, &pl->mutex);
        //printf("Thread %d after cond_wait\n",threadID);
    }
    /* get first job */
    job=pl->queue_head;
    if (job!=NULL) {
        /* remove job from the queue */
        pl->queue_head=job->next;
        if (job==pl->queue_tail){ 
            pl->queue_tail=NULL; 
        }
        pthread_mutex_unlock(&pl->mutex);
        /* get job parameter */
        func=job->func;
        arg=job->arg;
        /* Execute job */
        //printf("Job executed by: %d\n",threadID);
        func(arg, threadID);
        /* acquire lock */
        pthread_mutex_lock(&pl->mutex);
    }
}

Before the jobs are submitted, both threads wait in the while loop on the workcv condition. The jobs are submitted by the following lines of code (in both code snippets I removed the code that waits for the completion of both jobs):

pthread_mutex_lock(&pl->mutex);
/* Append job to queue */
if (pl->queue_head==NULL) {
    pl->queue_head=job[numJobs-1];
}else {
    pl->queue_tail->next=job[numJobs-1];
}
pl->queue_tail=job[0];
/* Wake up thread if one is idle */
pthread_cond_broadcast(&pl->workcv);
pthread_mutex_unlock(&pl->mutex);

Default attributes are used for the mutex, the threads, and the condition variable. Environment: GCC 4.2.1, Mac OS X Snow Leopard

What am I doing wrong?

Thanks!

+1  A: 

Instead of pthread_cond_broadcast(), you should use pthread_cond_signal(). Additionally, after a thread has taken a job, it should signal the condition variable:

    /* remove job from the queue */
    pl->queue_head=job->next;
    if (job==pl->queue_tail){ 
        pl->queue_tail=NULL; 
    }
    if (pl->queue_head != NULL)
        pthread_cond_signal(&pl->workcv);
    pthread_mutex_unlock(&pl->mutex);
caf
Thanks. I made the modifications but the problem is still the same. It seems that the thread blocked on the mutex inside the condition wait is not woken up fast enough when the mutex is released by the other thread.
Lenz
@Lenz: Well, yes - the other process might have to be migrated across cores, for one thing. If your jobs are very fast, then you will simply need to keep your queue supplied with more than 2 waiting jobs at a time!
caf
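
For anyone who wants to experiment with this, here is a minimal, self-contained sketch of the approach from the answer: the submitter uses pthread_cond_signal(), and a worker that leaves jobs behind in the queue signals the condition again. It is not the original poster's code. The names pool_t, job_t, donecv, pending, shutdown, mark_done and run_demo are hypothetical and were added only so that the example can wait for completion and shut down cleanly. It shows that every job gets executed; it does not change how quickly the second thread is woken up, which is the latency discussed in the comments.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_WORKERS 2

typedef struct job {
    void (*func)(void *arg, int thread_id);
    void *arg;
    struct job *next;
} job_t;

typedef struct {                 /* hypothetical pool layout */
    pthread_mutex_t mutex;
    pthread_cond_t workcv;       /* signalled when a job is available */
    pthread_cond_t donecv;       /* signalled when pending drops to 0 */
    job_t *queue_head, *queue_tail;
    int pending;                 /* jobs queued or still running */
    int shutdown;                /* set once no more jobs will arrive */
} pool_t;

typedef struct { pool_t *pl; int id; } worker_arg_t;

static void *worker(void *p) {
    worker_arg_t *wa = p;
    pool_t *pl = wa->pl;
    pthread_mutex_lock(&pl->mutex);
    for (;;) {
        while (pl->queue_head == NULL && !pl->shutdown)
            pthread_cond_wait(&pl->workcv, &pl->mutex);
        if (pl->queue_head == NULL)
            break;               /* shutdown requested and queue drained */
        job_t *job = pl->queue_head;
        pl->queue_head = job->next;
        if (job == pl->queue_tail)
            pl->queue_tail = NULL;
        /* More work left: wake another worker before running this job. */
        if (pl->queue_head != NULL)
            pthread_cond_signal(&pl->workcv);
        pthread_mutex_unlock(&pl->mutex);
        job->func(job->arg, wa->id);
        pthread_mutex_lock(&pl->mutex);
        if (--pl->pending == 0)
            pthread_cond_broadcast(&pl->donecv);
    }
    pthread_mutex_unlock(&pl->mutex);
    return NULL;
}

static void pool_submit(pool_t *pl, job_t *job) {
    assert(job != NULL);
    job->next = NULL;
    pthread_mutex_lock(&pl->mutex);
    if (pl->queue_head == NULL)
        pl->queue_head = job;
    else
        pl->queue_tail->next = job;
    pl->queue_tail = job;
    pl->pending++;
    pthread_cond_signal(&pl->workcv);   /* one new job, wake one worker */
    pthread_mutex_unlock(&pl->mutex);
}

/* Example job: sets the int flag it was given. */
static void mark_done(void *arg, int thread_id) {
    (void)thread_id;
    *(int *)arg = 1;
}

/* Runs num_jobs jobs on the pool and returns how many were executed. */
int run_demo(int num_jobs) {
    enum { MAX_JOBS = 64 };
    assert(num_jobs >= 0 && num_jobs <= MAX_JOBS);
    pool_t pl;
    pthread_t tid[NUM_WORKERS];
    worker_arg_t wa[NUM_WORKERS];
    job_t jobs[MAX_JOBS];
    int ran[MAX_JOBS] = {0};
    int i, count = 0;

    pthread_mutex_init(&pl.mutex, NULL);
    pthread_cond_init(&pl.workcv, NULL);
    pthread_cond_init(&pl.donecv, NULL);
    pl.queue_head = pl.queue_tail = NULL;
    pl.pending = 0;
    pl.shutdown = 0;
    for (i = 0; i < NUM_WORKERS; i++) {
        wa[i].pl = &pl;
        wa[i].id = i;
        pthread_create(&tid[i], NULL, worker, &wa[i]);
    }
    for (i = 0; i < num_jobs; i++) {
        jobs[i].func = mark_done;
        jobs[i].arg = &ran[i];
        pool_submit(&pl, &jobs[i]);
    }
    /* Wait until all jobs have finished, then ask the workers to exit. */
    pthread_mutex_lock(&pl.mutex);
    while (pl.pending > 0)
        pthread_cond_wait(&pl.donecv, &pl.mutex);
    pl.shutdown = 1;
    pthread_cond_broadcast(&pl.workcv);
    pthread_mutex_unlock(&pl.mutex);
    for (i = 0; i < NUM_WORKERS; i++)
        pthread_join(tid[i], NULL);
    for (i = 0; i < num_jobs; i++)
        count += ran[i];
    pthread_cond_destroy(&pl.donecv);
    pthread_cond_destroy(&pl.workcv);
    pthread_mutex_destroy(&pl.mutex);
    return count;
}
```

Calling run_demo(2) mirrors the two-job case from the question. To see which thread ran which job, mark_done could record thread_id instead of just setting a flag. With jobs this short it would not be surprising to still see one thread take both, for the reason given in the comment above.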