I thought pthread uses clone() to spawn a new thread on Linux. If so, each thread should have its own separate pid. Otherwise, if they all share the same pid, the global variables in libc should be shared as well. However, when I ran the following program, I got the same pid but different addresses for errno.

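#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

void *
f(void *arg)
{
    /* Print the process ID and the address errno resolves to in this thread. */
    printf("%d,%p\n", (int)getpid(), (void *)&errno);
    fflush(stdout);
    return NULL;
}

int
main(int argc, char **argv)
{
    pthread_t tid;
    pthread_create(&tid, NULL, f, NULL);
    printf("%d,%p\n", (int)getpid(), (void *)&errno);
    fflush(stdout);
    pthread_join(tid, NULL);
    return 0;
}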

Then, why?

+1  A: 
  1. Global variables: your mistake is that errno is not a global variable but a macro that expands to an lvalue of type int. In practice, it expands to (*__errno_location()) or similar.
  2. getpid is a library function that returns the process id in the POSIX sense of process, not the bogus Linux per-clone pid. Nowadays Linux has the minimal kernel-level functionality necessary to make near-POSIX-compliance possible with respect to threads, but most of it still depends on ugly hacks at the userspace libc level.
R..
Thanks! You helped me out.
dutor
+2  A: 

I'm not sure exactly how clone() is used when pthread_create() is called. That said, the clone() man page documents a flag called CLONE_THREAD, described as follows:

If CLONE_THREAD is set, the child is placed in the same thread group as the calling process. To make the remainder of the discussion of CLONE_THREAD more readable, the term "thread" is used to refer to the processes within a thread group.

Thread groups were a feature added in Linux 2.4 to support the POSIX threads notion of a set of threads that share a single PID. Internally, this shared PID is the so-called thread group identifier (TGID) for the thread group. Since Linux 2.4, calls to getpid(2) return the TGID of the caller.

It then goes on to talk about a gettid() function for getting the unique ID of an individual thread within a process. Modifying your code:

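#include <stdio.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Note: <errno.h> is not included, so this is an ordinary global named
   errno, not the libc's per-thread one; both threads see the same address. */
int errno;

void*
f(void *arg)
{
    /* pid, address of errno, kernel thread id (gettid) */
    printf("%d,%p, %ld\n", (int)getpid(), (void *)&errno, (long)syscall(SYS_gettid));
    fflush(stdout);
    return NULL;
}

int
main(int argc, char **argv)
{
    pthread_t tid;
    pthread_create(&tid, NULL, f, NULL);
    printf("%d,%p, %ld\n", (int)getpid(), (void *)&errno, (long)syscall(SYS_gettid));
    fflush(stdout);
    pthread_join(tid, NULL);
    return 0;
}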

Compiling this (make sure to link with "-lpthread"!) and running it, we can see that the individual thread id is indeed unique, while the pid remains the same.

rascher@coltrane:~$ ./a.out 
4109,0x804a034, 4109
4109,0x804a034, 4110
rascher
Thanks a lot for your help! But what about the errno that is *extern*ed from glibc? It is unique per thread, too (while a regular global variable is shared).
dutor
This is correct - the kernel-space "PID" is the user-space "TID", and the kernel-space "TGID" is the user-space "PID".
caf
@dutor: `errno` isn't a simple `extern` variable - it has special handling to make it per-thread (it is defined with a macro: `#define errno (*__errno_location ())`).
caf