I'm learning about POSIX threads right now, but I guess this is just a general question on multithreading, so I hope that anyone can help me. I have this example from the book that I am working on which demonstrates a race condition:

void *thread_main(void *thread_number) {
    printf("In thread number %d.\n", *(int *)thread_number);
}

void main() {
    int i = 0;
    pthread_t thread;

    for( i = 0; i < 10; i++ ) {
        printf("Creating thread %d.\n", i);
        pthread_create(&thread, 0, thread_main, &i);
        printf("Created thread %d.\n", i);
    }
}

There are a few things I don't understand about this. First, "In thread number 5." gets printed many times even though it's not supposed to be in thread number 5. In the book, the example shows thread 8 getting printed many times. I also don't understand the part that says *(int *)thread_number. I tried changing this to just thread_number, but that just gave me strange numbers over and over again.

The book doesn't really explain this. Can somebody give me a clear explanation of what's going on here? I don't understand why it doesn't print something like:

> Creating thread 1.
> In thread number 1.
> Created thread 1.
> Creating thread 2.
> In thread number 2.
> Created thread 2.

I know that because it's multithreaded the "In thread number x." part will come at different times, but I really don't get why there isn't exactly 10 "In thread number x" with one line for each thread I created!

~desi

+1  A: 

First, a race condition is a bug: the output depends on the order in which the threads happen to be scheduled, so this program is not supposed to behave the way you expect. The book's example is written to break on purpose.

In this case all of your threads share the variable i. You're passing each of them a pointer to that one variable, and each thread reports whatever value it holds at the moment the thread happens to get scheduled.

S.Lott
+3  A: 

It's possible that the for loop could iterate 10 times before any of the 10 threads that are created have a chance to run. In that case, the value of *thread_number would be 10 for each thread (since it's a pointer to a single memory location with a single value).
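
To make that scenario concrete, here is a minimal sketch that forces it to happen every time instead of leaving it to the scheduler. The mutex named gate, the seen array, and the function name are my own additions for illustration and are not part of the book's example. Main holds the mutex while the loop runs, so no thread can read i until the loop has finished.

```c
#include <pthread.h>
#include <stdio.h>

#define NUM_THREADS 10

/* Hypothetical helpers for this sketch, not from the book. */
static pthread_mutex_t gate = PTHREAD_MUTEX_INITIALIZER;
static int seen[NUM_THREADS];   /* values the threads read through the pointer */
static int seen_count = 0;

static void *thread_main(void *thread_number) {
    pthread_mutex_lock(&gate);                   /* blocks until main opens the gate */
    seen[seen_count++] = *(int *)thread_number;  /* reads the shared i */
    pthread_mutex_unlock(&gate);
    return NULL;
}

/* Returns how many threads read the value i had after the loop ended. */
int count_threads_seeing_final_value(void) {
    pthread_t threads[NUM_THREADS];
    int created = 0;
    int matches = 0;
    int final_value;
    int i;

    seen_count = 0;
    pthread_mutex_lock(&gate);                   /* hold every new thread back */
    for (i = 0; i < NUM_THREADS; i++) {
        /* Every thread gets the same address: &i. */
        if (pthread_create(&threads[created], NULL, thread_main, &i) != 0)
            break;
        created++;
    }
    final_value = i;                             /* 10 if every thread was created */
    pthread_mutex_unlock(&gate);                 /* now let the threads run */

    for (int t = 0; t < created; t++)
        pthread_join(threads[t], NULL);

    for (int k = 0; k < seen_count; k++) {
        printf("In thread number %d.\n", seen[k]);
        if (seen[k] == final_value)
            matches++;
    }
    return matches;
}
```

Calling count_threads_seeing_final_value() from main (built with something like cc -pthread) should report the same number for every thread, because each one dereferences the same address only after the loop is done.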

If you don't pass a pointer to i to pthread_create, then the value of the int winds up being treated as an address, so when you dereference it in thread_main, you're accessing some arbitrary memory location, the contents of which are probably undefined. You're lucky that you're not segfaulting in that case.

If you want to see the right value for *thread_number in each thread, you would want to malloc a new int before calling pthread_create and assign it the current value of i, like so:

for( i = 0; i < 10; i++ ) {
    int *thread_count = malloc(sizeof(int));
    *thread_count = i;
    printf("Creating thread %d.\n", i);
    pthread_create(&thread, 0, thread_main, thread_count);
    printf("Created thread %d.\n", i);
}

Of course, then you'd need to free the memory when the thread was done with it, like so:

void *thread_main(void *thread_number) {
    printf("In thread number %d.\n", *(int *)thread_number);
    free(thread_number);
    return NULL;
}
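
Putting the two fragments above together, a complete version might look like the sketch below. The reported array, the function name, and the pthread_join calls are my additions and are not in the original example. The joins matter for the "exactly 10 lines" part of the question: when main returns, the whole process exits, so any thread that has not printed yet never gets to.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define NUM_THREADS 10

/* Hypothetical bookkeeping for this sketch: reported[n] is written by thread n only. */
static int reported[NUM_THREADS];

static void *thread_main(void *thread_number) {
    int n = *(int *)thread_number;   /* copy the value out first */
    free(thread_number);             /* this thread owns the allocation */
    printf("In thread number %d.\n", n);
    reported[n] = 1;
    return NULL;
}

/* Returns how many distinct thread numbers were reported. */
int run_threads_with_private_copies(void) {
    pthread_t threads[NUM_THREADS];
    int created = 0;
    int distinct = 0;

    for (int n = 0; n < NUM_THREADS; n++)
        reported[n] = 0;

    for (int i = 0; i < NUM_THREADS; i++) {
        int *thread_count = malloc(sizeof *thread_count);
        if (thread_count == NULL)
            break;
        *thread_count = i;           /* private copy of the current i */
        if (pthread_create(&threads[created], NULL, thread_main, thread_count) != 0) {
            free(thread_count);
            break;
        }
        created++;
    }

    /* Wait for every thread so none is cut off when the caller returns. */
    for (int t = 0; t < created; t++)
        pthread_join(threads[t], NULL);

    for (int n = 0; n < NUM_THREADS; n++)
        distinct += reported[n];
    return distinct;
}
```

The lines can still come out in any order, but each number from 0 to 9 should appear exactly once, since every thread reads its own copy rather than the shared i.
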
Zach Hirsch
shank
+3  A: 

First off, *(int *)thread_number is pointer stuff: thread_number holds the address of i in the main function, the (int *) cast says that address points to an int, and the leading * reads the int stored there. When you printed just thread_number, the 'strange numbers' were that address. Unless I'm mistaken, they should all have been the same number within a single run of the program, so each of the 10 threads should have printed the same "In thread number [number]".
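
Here is a small sketch of the difference between the pointer and the value behind it. The two globals and the function name are made up for this illustration. It prints the address with %p, which is the format meant for pointers, and the value with %d.

```c
#include <pthread.h>
#include <stdio.h>

/* Hypothetical globals for this sketch. */
static void *received_address;  /* what the thread was handed */
static int received_value;      /* what that address pointed to */

static void *thread_main(void *thread_number) {
    received_address = thread_number;          /* the raw pointer */
    received_value = *(int *)thread_number;    /* the int it points to */
    printf("address %p holds %d\n", thread_number, received_value);
    return NULL;
}

/* Returns 1 if the thread received &i itself and read i's value through it. */
int thread_sees_address_of_i(void) {
    int i = 5;
    pthread_t thread;

    if (pthread_create(&thread, NULL, thread_main, &i) != 0)
        return 0;
    pthread_join(thread, NULL);   /* i must stay alive until the thread is done */

    return received_address == (void *)&i && received_value == 5;
}
```

The address printed will differ between machines and runs, but within one run it is the same for every thread that is handed &i.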

The repeated 5 only makes sense once you see that these are pointers: each thread was working with the same underlying i from the main function. It wasn't copied for each new thread, so when the main function incremented i, every thread saw the new value through thread_number in thread_main.

The final piece of the puzzle is that setting up a new thread and then context switching (changing which thread is actually running) isn't immediate. In your case the for loop ran 5 times before the newly created threads actually ran (in the book's case, 8 times), and then each of those threads looked at the same underlying i, which by then was 5.
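
One way to see that the timing is the whole story is to take the timing away. The sketch below is my own variation, not the book's code: it waits for each thread with pthread_join before the loop is allowed to increment i, so each thread reads i while it still has the value it had at creation. The seen array and the function name are made up for the illustration.

```c
#include <pthread.h>
#include <stdio.h>

#define NUM_THREADS 10

/* Hypothetical bookkeeping for this sketch. */
static int seen[NUM_THREADS];  /* seen[k] = value read by the k-th thread */
static int next_slot = 0;

static void *thread_main(void *thread_number) {
    int n = *(int *)thread_number;
    printf("In thread number %d.\n", n);
    seen[next_slot++] = n;   /* safe only because threads run one at a time here */
    return NULL;
}

/* Returns 1 if every thread read the value i had when it was created. */
int run_one_thread_at_a_time(void) {
    pthread_t thread;
    int i;

    next_slot = 0;
    for (i = 0; i < NUM_THREADS; i++) {
        printf("Creating thread %d.\n", i);
        if (pthread_create(&thread, NULL, thread_main, &i) != 0)
            return 0;
        pthread_join(thread, NULL);   /* wait here, so i++ cannot run yet */
        printf("Created thread %d.\n", i);
    }

    for (i = 0; i < NUM_THREADS; i++)
        if (seen[i] != i)
            return 0;
    return 1;
}
```

This prints the Creating / In thread / Created pattern from the question (counting from 0), but only because the threads never overlap, which defeats the purpose of having threads. It is meant as a demonstration of the cause, not as a fix.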

Cebjyre
Thank you, this was very helpful.