Hi, I have a couple of questions on threads. Could you please clarify?

  1. Suppose a process has one or more threads. If the process is preempted/suspended, do its threads also get preempted, or do they continue to run?

  2. When the suspended process is rescheduled, do its threads also get scheduled? If the process has multiple threads, which threads will be rescheduled, and on what basis?

  3. If a thread in the process is running and receives a signal (say Ctrl-C) whose default action is to terminate the process, does only the running thread terminate, or does the parent process terminate as well? What happens to the threads if the running process terminates because of a signal?

  4. If a thread does a fork followed by an exec, does the exec'd program overlay the address space of the parent process or of the running thread? If it overlays the parent process, what happens to the threads, their data, and the locks they are holding, and how do they get scheduled once the exec'd process terminates?

  5. Suppose a process has multiple threads. How do the threads get scheduled? If one of the threads blocks on some I/O, how do the other threads get scheduled? Are the threads scheduled only while the parent process is running?

  6. While a thread is running, what does the kernel variable current point to (the parent process's task_struct or the thread's task_struct)?

  7. If a process with a thread is running, does the parent process get preempted when the thread starts, and how does each thread get scheduled?

  8. If a process running on a CPU creates multiple threads, can the threads created by the parent process be scheduled on another CPU of a multiprocessor system?

Thanks, Ganesh

A: 

Assuming your questions are about POSIX threads, then

1a. A process that's preempted by the O/S will have all its threads preempted.

1b. The O/S will suspend all the threads of a process that is sent a SIGSTOP.

  2. The O/S will resume all threads of a suspended process that is sent a SIGCONT.

  3. By default, a SIGINT will terminate all the threads in a process.

  4. If a thread calls fork(), then all its threads are duplicated. If it then calls one of the exec() functions, then all the duplicated threads disappear.

  5. POSIX allows for user-selection of the thread scheduling algorithm.

  6. I don't understand the question.

  7. I don't understand the question.

  8. How threads are mapped to CPUs is implementation-dependent. Many implementations will try to distribute threads amongst the available CPUs to improve performance.
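As a rough illustration of the point about user-selectable scheduling, here is a minimal sketch of how a policy can be requested through thread attributes. The helper name `request_policy` is made up for this example; the `pthread_attr_*` and `sched_get_priority_min` calls are the standard POSIX ones. The sketch only builds the attributes and reads the policy back; it does not create a thread with them.

```c
#include <pthread.h>
#include <sched.h>

/*
 * Hypothetical helper: build thread attributes that explicitly request
 * `policy` (e.g. SCHED_RR) instead of inheriting the creator's policy.
 * Returns the policy read back from the attributes, or -1 on error.
 */
int request_policy(int policy)
{
    pthread_attr_t attr;
    struct sched_param param;
    int stored = -1;
    int ok;

    if (pthread_attr_init(&attr) != 0)
        return -1;

    /* Use the lowest priority that is valid for the chosen policy. */
    param.sched_priority = sched_get_priority_min(policy);

    ok = param.sched_priority >= 0
        && pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED) == 0
        && pthread_attr_setschedpolicy(&attr, policy) == 0
        && pthread_attr_setschedparam(&attr, &param) == 0
        && pthread_attr_getschedpolicy(&attr, &stored) == 0;

    pthread_attr_destroy(&attr);
    return ok ? stored : -1;
}
```

The policy would only take effect when such attributes are passed to `pthread_create()`. On Linux, creating a thread with SCHED_FIFO or SCHED_RR usually requires extra privileges, so that call may fail with EPERM for an ordinary user.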

Steve Emmerson
5. Suppose a process creates two threads, thd1 and thd2. If thd1 sleeps on some I/O, such as reading from disk, or waits for a mutex to be released, how does the scheduler select the other thread? Is the time quantum of the parent process that created the threads shared among them?
Ganesh Kundapur
In the kernel, current always points to the currently running process, i.e. current = the currently running task_struct. When a thread is running, what does it point to?
Ganesh Kundapur
If a process calls `fork()`, only the *calling thread* is duplicated. [POSIX `fork()`](http://www.opengroup.org/onlinepubs/000095399/functions/fork.html) says: *A process shall be created with a single thread.*
caf
@caf: Quite right. My mistake.
Steve Emmerson
+1  A: 
  1. Depends. If a thread is preempted because the OS scheduler decides to give CPU time to some other thread, then other threads in the process will continue running. If the process is suspended (i.e. it gets the SIGSTOP signal) then AFAIK all the threads will be suspended.

  2. When a suspended process is woken up, all its threads are marked either as waiting to run or as blocked (if they are waiting e.g. on a mutex). Then the scheduler at some point runs them. There is no guarantee about any specific order in which the threads are run after waking up the process.

  3. The process will terminate, and with it the threads as well.

  4. When you fork you get a new address space, so there is no "overlay". Note that fork() and the exec() family affect the entire process, not only the thread from which they were called. When you call fork() in a multi-threaded process, the child gets a copy of that process, but with only the calling thread. Then if you call exec() in one or both of the processes (presumably only in the child process, but that's up to you), then the process which calls exec() (and with it, all its threads) is replaced by the exec()'ed program.

  5. The thread scheduling order is decided by the OS scheduler; there is no guarantee given about any particular order.

  6. From the kernel perspective a process is an address space with one or more threads (and some other gunk). There is no concept of threads that somehow exist without a process.

  7. There is no such thing as a process without a single thread. A "plain process" is just a process with a single thread.

  8. Probably yes. This is determined by the OS scheduler. Note that there are APIs and tools (numactl) that one can use to force some thread(s) to run on a specific CPU core.

janneb
4. When a process execs another program, the current process's address space is overwritten by the exec'd program.
Ganesh Kundapur
4. What happens when either the process or one of its threads calls exec? Is only the parent process's address space overwritten by the exec'd program, and what happens to the threads?
Ganesh Kundapur
Suppose a process with multiple threads is preempted because it sleeps or its time quantum expires, and a thread was running before the preemption. When the process resumes execution, does it continue with the thread or with the process? How and where does it remember the thread's context (PC value etc.) so that the thread continues from where it left off before the preemption? During a context switch, only the process context is stored, not the thread's context (user context, hardware registers, etc.). When the thread resumes execution, from where is its context restored?
Ganesh Kundapur
I added some more stuff to the answer for #4 to make clear that fork and exec affect the entire process. Hopefully that clears things up for you.
janneb
Wrt your third comment, as mentioned in the answers to #6 and #7, processes and threads don't exist independently of each other. Also, the OS schedules threads, not entire processes. When it switches context, it stores the registers and PC value in a per-thread data structure in the kernel.
janneb
As you mentioned, the OS schedules threads, not processes, but threads don't have their own address space. A thread shares the address space of the process that created it. How does the thread get scheduled without the process?
Ganesh Kundapur
The OS schedules threads, not processes or address spaces. When at least one of the threads of a process is running, the process itself is considered to be running (even though other threads of the same process might not be running at that point in time). A thread cannot get scheduled "without the process", because without a process the thread wouldn't exist.
janneb
#4 has some incorrect information. `fork()` does *not* duplicate all the threads in the child process - only the thread that called `fork()` is duplicated. Calling `fork()` after you have created additional threads is usually only useful if the child is to call `execve()`.
caf
@caf. Right, fixed.
janneb
A: 

The Linux kernel doesn't distinguish between threads and processes. As far as the kernel is concerned, a thread is simply another process which happens to share its address space with other processes. (You would call the set of "processes" (i.e. threads) which share a single address space a "process".)

So POSIX threads are scheduled exactly as full-blown processes would be. There is no difference in scheduling whether you have one process with five threads, or five separate processes.

There are system calls that provide fine-grained control over what is shared between processes (on Linux, clone() and its flags). The POSIX threads API is built on top of them.
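To make the "shares address space" part concrete, here is a small sketch that contrasts a thread with a forked child. The function names are invented for this example. A write made by a thread lands in the same memory the caller sees, while a write made by a forked child only changes the child's private copy.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int value;   /* lives in the process's address space */

static void *set_value(void *arg)
{
    (void)arg;
    value = 42;
    return NULL;
}

/* Returns 1 if a write made by another thread is visible to the caller. */
int thread_write_is_visible(void)
{
    pthread_t tid;

    value = 0;
    if (pthread_create(&tid, NULL, set_value, NULL) != 0)
        return -1;
    pthread_join(tid, NULL);   /* join orders the thread's write before our read */
    return value == 42;
}

/* Returns 1 if a write made by a forked child is visible to the parent. */
int child_write_is_visible(void)
{
    pid_t pid;

    value = 0;
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        value = 42;            /* changes only the child's private copy */
        _exit(0);
    }
    waitpid(pid, NULL, 0);
    return value == 42;
}
```

Called from `main()` and built with `cc -pthread`, the first function should return 1 and the second 0.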

slacker
+1  A: 

First, I should clear up some terminology that you appear to be confused about. In POSIX, a "process" is a single address space plus at least one thread of control, identified by a process ID (PID). A thread is an individually-scheduled execution context within a process.

All processes start life with just one thread, and all processes have at least one thread. Now, onto the questions:

  1. Suppose a process has one or more threads. If the process is preempted/suspended, do its threads also get preempted, or do they continue to run?

Threads are scheduled independently. If a thread blocks on a function like connect(), then other threads within the process can still be scheduled.

It is also possible to request that every thread in a process be suspended, for example by sending SIGSTOP to the process.

  2. When the suspended process is rescheduled, do its threads also get scheduled? If the process has multiple threads, which threads will be rescheduled, and on what basis?

This only makes sense in the context that an explicit request was made to stop the entire process. If you send the process SIGCONT to restart the process, then any of the threads which are not blocked can run. If more threads are runnable than there are processors available to run them, then it is unspecified which one(s) run first.

  3. If a thread in the process is running and receives a signal (say Ctrl-C) whose default action is to terminate the process, does only the running thread terminate, or does the parent process terminate as well? What happens to the threads if the running process terminates because of a signal?

If a thread receives a signal like SIGINT or SIGSEGV whose action is to terminate the process, then the entire process is terminated. This means that every thread in the process is unceremoniously killed.
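Here is a small sketch of that behaviour, with made-up function names. A child process starts a worker thread that sends SIGINT to itself only, using `pthread_kill()`, while the child's main thread sits in `sleep()`. The parent then checks how the child ended. The sketch assumes SIGINT is not blocked in the inherited signal mask.

```c
#include <pthread.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static void *raise_sigint(void *arg)
{
    (void)arg;
    pthread_kill(pthread_self(), SIGINT);   /* directed at this thread only */
    return NULL;
}

/*
 * Returns the number of the signal that terminated the child process,
 * 0 if the child exited normally, or -1 on error.
 */
int signal_that_killed_child(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        pthread_t tid;

        signal(SIGINT, SIG_DFL);   /* make sure the default action applies */
        if (pthread_create(&tid, NULL, raise_sigint, NULL) != 0)
            _exit(2);
        sleep(5);                  /* main thread is alive and idle */
        _exit(0);                  /* reached only if the process survived */
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

If the signal only killed the worker thread, the child would exit normally after five seconds and the function would return 0. Instead the whole child is expected to die at once, and the function returns `SIGINT`.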

  4. If a thread does a fork followed by an exec, does the exec'd program overlay the address space of the parent process or of the running thread? If it overlays the parent process, what happens to the threads, their data, and the locks they are holding, and how do they get scheduled once the exec'd process terminates?

The fork() call creates a new process by duplicating the address space of the original process, and duplicating just the single thread that called fork() within that new address space.

If that thread in the new process calls execve(), it will replace the new, duplicated address space with the exec'd program. The original process, and all its threads, continue running normally.
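The "just the single thread" part can be checked with a sketch like the following (names invented for this example, 50 ms window chosen arbitrarily). A ticker thread keeps incrementing a counter. After `fork()`, the parent still sees the counter moving, while in the child the copied counter stays frozen because the ticker thread was not duplicated.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static atomic_ulong ticks;
static atomic_int stop;

static void *ticker(void *arg)
{
    (void)arg;
    while (!atomic_load(&stop))
        atomic_fetch_add(&ticks, 1);
    return NULL;
}

/* Returns 1 if `ticks` changes within 50 ms, i.e. a ticker thread is alive. */
static int ticker_is_running(void)
{
    struct timespec ts = { .tv_sec = 0, .tv_nsec = 50 * 1000000L };
    unsigned long before = atomic_load(&ticks);

    nanosleep(&ts, NULL);
    return atomic_load(&ticks) != before;
}

/* Returns 1 if the ticker runs in the parent but not in the forked child. */
int fork_copies_only_caller(void)
{
    pthread_t tid;
    int in_parent, status = 0;
    pid_t pid;

    atomic_store(&stop, 0);
    if (pthread_create(&tid, NULL, ticker, NULL) != 0)
        return -1;

    pid = fork();
    if (pid == 0)
        _exit(ticker_is_running() ? 1 : 0);   /* child reports via exit status */

    in_parent = ticker_is_running();
    if (pid > 0)
        waitpid(pid, &status, 0);

    atomic_store(&stop, 1);                   /* let the ticker finish */
    pthread_join(tid, NULL);

    if (pid < 0)
        return -1;
    return in_parent && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

This is also why locks held by other threads at the time of the fork are a hazard in the child: the memory is copied as it was, but the threads that would release those locks are not there.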

  5. Suppose a process has multiple threads. How do the threads get scheduled? If one of the threads blocks on some I/O, how do the other threads get scheduled? Are the threads scheduled only while the parent process is running?

The threads are scheduled independently. Any of the threads that are not blocked can run.
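A minimal sketch of this, with made-up names: one thread blocks in `read()` on an empty pipe, and a second thread does some arithmetic and finishes while the first is still blocked. Only afterwards does the caller write a byte to release the reader.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <unistd.h>

static int pipe_fds[2];
static atomic_int reader_done;

static void *reader(void *arg)
{
    char byte;

    (void)arg;
    /* Blocks until the caller writes a byte into the pipe. */
    if (read(pipe_fds[0], &byte, 1) == 1)
        atomic_store(&reader_done, 1);
    return NULL;
}

static void *summer(void *arg)
{
    long *sum = arg;

    for (long i = 1; i <= 1000; i++)
        *sum += i;
    return NULL;
}

/* Returns 1 if summer finished while reader was still blocked in read(). */
int other_thread_runs_while_one_blocks(void)
{
    pthread_t r, s;
    long sum = 0;
    int finished_first;

    atomic_store(&reader_done, 0);
    if (pipe(pipe_fds) != 0)
        return -1;
    if (pthread_create(&r, NULL, reader, NULL) != 0)
        return -1;
    if (pthread_create(&s, NULL, summer, &sum) != 0)
        return -1;

    pthread_join(s, NULL);    /* summer completes despite the blocked reader */
    finished_first = (sum == 500500) && !atomic_load(&reader_done);

    if (write(pipe_fds[1], "x", 1) != 1)   /* now unblock the reader */
        return -1;
    pthread_join(r, NULL);

    close(pipe_fds[0]);
    close(pipe_fds[1]);
    return finished_first && atomic_load(&reader_done);
}
```

The reader cannot finish before the write because the pipe is empty, so the check does not depend on timing.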

  6. While a thread is running, what does the kernel variable current point to (the parent process's task_struct or the thread's task_struct)?

Each thread has its own task_struct within the kernel. What userspace calls a "thread" is called a "process" in kernel space. Thus current always points at the task_struct corresponding to the currently executing thread (in the userspace sense of the word).
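This is visible from userspace on Linux. In the kernel, `current->pid` is what userspace calls the thread ID, and `current->tgid` is what `getpid()` returns. The Linux-specific sketch below (function and struct names are made up) reads the thread ID through the raw `SYS_gettid` system call and compares two threads of one process.

```c
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

struct ids {
    pid_t pid;   /* process ID, shared by all threads of the process */
    pid_t tid;   /* kernel thread ID, one per task_struct (Linux-specific) */
};

static void *record_ids(void *arg)
{
    struct ids *out = arg;

    out->pid = getpid();
    out->tid = (pid_t)syscall(SYS_gettid);
    return NULL;
}

/* Returns 1 if two threads report the same PID but different thread IDs. */
int threads_share_pid_not_tid(void)
{
    struct ids caller_ids, worker_ids;
    pthread_t tid;

    record_ids(&caller_ids);   /* IDs of the calling thread */
    if (pthread_create(&tid, NULL, record_ids, &worker_ids) != 0)
        return -1;
    pthread_join(tid, NULL);

    return caller_ids.pid == worker_ids.pid
        && caller_ids.tid != worker_ids.tid;
}
```

For the initial thread of a process the two numbers are equal, which is why a single-threaded program never notices the difference.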

  7. If the process with [a second] thread is running, does the parent process get preempted when the thread starts, and how does each thread get scheduled?

Presumably you mean "the process's main thread" rather than "parent process" here. As before, the threads are scheduled independently. It's unspecified whether one runs before the other - and if you have multiple CPUs, both might run simultaneously.

  8. If a process running on a CPU creates multiple threads, can the threads created by the parent process be scheduled on another CPU of a multiprocessor system?

That's really up to the kernel, but the threads are certainly allowed to execute on other CPUs.

caf
I was looking at the GNU pth implementation, in which pthread_create creates one scheduler thread; the same scheduler thread is used by successive pthread_create calls and maintains a queue of threads. Is this user-level scheduler thread responsible for scheduling the user-level threads? If it is (I assume every thread calls into this scheduler thread on exit), what is the role of the kernel scheduler? While a thread is running, is the process that created the thread sleeping or preempted?
Ganesh Kundapur
@Ganesh: GNU pth is a user-space cooperative (=non-preemptive) threading library. Thus the kernel sees a pth-using program as a single thread. This is fairly different from NPTL which is what one normally uses when using POSIX threads on Linux.
janneb
Thanks. I still have some more doubts. The documentation says threads are independently schedulable. Does this mean a thread will run or get scheduled even though the process that created it is not running?
Ganesh Kundapur
@Ganesh: Your question makes no sense. I suggest you reread the answers by me, caf, and slacker where we explain what is a process, what is a thread, and what is the relationship between them. Until you understand those fundamental concepts, further discussions about various details are unlikely to be fruitful.
janneb
@Ganesh: Processes don't get scheduled. Threads do.
TomMD