I get a fault address error when I malloc a struct holding a pthread_t to save a newly created thread's ID and then free it in another thread. The code is as follows:

typedef struct _TaskInfo { 
    // int dummy_int;
    pthread_t tid;
} TaskInfo;

void* dummy_task(void* pArg) {
    free(pArg);
    return NULL;
}

void create_task() {
    TaskInfo *pInfo;
    pthread_attr_t attr;

    // set detached state stuff ...

    pInfo = (TaskInfo*) malloc(sizeof(TaskInfo));
    pthread_create(&pInfo->tid, &attr, dummy_task, pInfo);

    // destroy pthread attribute stuff ...
}

int main() {
    int i = 0;
    while(i < 10000) {
        create_task();
        ++i;
    }
    return 0;
}

When I uncomment the dummy_int member of TaskInfo, it sometimes runs successfully and sometimes fails. My platform is VMware + Ubuntu 9.10 + NDK r3.

Thanks!

+1  A: 

pthread_create() stores the thread ID (TID) of the created thread in the location pointed to by the first parameter; however, it does that after the thread is created (http://opengroup.org/onlinepubs/007908799/xsh/pthread_create.html):

Upon successful completion, pthread_create() stores the ID of the created thread in the location referenced by thread

Since the thread has already been created, it may well get a chance to run and free that block of memory before pthread_create() gets a chance to store the TID in it.

When you don't have the dummy_int member in the struct you're probably corrupting the heap in a way that crashes early. With the dummy_int member included, you happen to be trashing something less sensitive (so the crashes are a bit less frequent). In either case, you're trashing memory that isn't allocated (or might not be allocated - you have a race condition).
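
One way to avoid the race, as a minimal sketch rather than a definitive fix: have pthread_create() write the TID into a local variable instead of into the heap block, and let the new thread fill in its own ID with pthread_self() if it needs it. The detached-state setup and the error handling below are my assumptions about what the elided "stuff" comments in the question do; the names are taken from the question.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

typedef struct TaskInfo {
    pthread_t tid;
} TaskInfo;

/* The new thread owns pArg: it records its own ID, then frees the block. */
static void *dummy_task(void *pArg) {
    TaskInfo *pInfo = (TaskInfo *)pArg;
    assert(pInfo != NULL);
    pInfo->tid = pthread_self();  /* only this thread touches pInfo from here on */
    free(pInfo);
    return NULL;
}

/* Returns 0 on success, or an error number on failure. */
int create_task(void) {
    pthread_t tid;                /* local: pthread_create() stores the ID here,
                                     never in the block the new thread may free */
    pthread_attr_t attr;
    TaskInfo *pInfo;
    int rc;

    pInfo = malloc(sizeof *pInfo);
    if (pInfo == NULL)
        return ENOMEM;

    /* Assumed equivalent of the "set detached state stuff" in the question. */
    rc = pthread_attr_init(&attr);
    if (rc != 0) {
        free(pInfo);
        return rc;
    }
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);

    rc = pthread_create(&tid, &attr, dummy_task, pInfo);
    pthread_attr_destroy(&attr);

    if (rc != 0)
        free(pInfo);              /* the thread never started, so we still own it */
    return rc;
}
```

With this arrangement the creating thread never touches pInfo after a successful pthread_create(), so it no longer matters which thread the scheduler runs first.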

Michael Burr
Thanks very much! But I'm still wondering why the same code worked fine on Linux. I read in Robert Love's "Linux Kernel Development" that the kernel runs the child process first, and that processes and threads are implemented the same way on Linux. Does thread scheduling on Linux differ from Android?
scleung
I couldn't say, but in either case, since you're dealing with a race condition, you're dealing with code that's going to fail at some point. It may be that the failure is more 'certain' on Android because you're likely dealing with a single-processor device, while on Linux you might be running on a multiprocessor machine, so the original thread has a chance to save the TID before the `free()` occurs in the newly created thread. Or Linux may have enough of a difference in its scheduler to let you get away with it. But it's still a bug; you're just not noticing it in some cases.
Michael Burr