I'm a little new to threading, so you'll have to forgive the naiveté of this question.
How is pthread_join implemented, and how does it affect thread scheduling?
I always pictured pthread_join implemented with a while loop, simply causing the calling thread to yield until the target thread completes. Like this (very approximate pseudocode):
atomic bool done;
thread_run {
    do_stuff();
    done = true;
}
thread_join {
    // basically, make the thread that calls "join" on
    // our thread yield until our thread completes
    while (!done) {
        thread_yield();
    }
}
Is this an accurate depiction, or am I vastly oversimplifying the process?
Cheers!