If you look at the Processes tab in Task Manager on your Windows machine, you will see the processes that are currently active on the machine. If you add the Threads column to the view, you will see the number of threads that currently exist in each process. The operating system (OS) is the one that determines how all of these threads across all of these processes are scheduled for execution on the processor. So in effect, the OS is constantly determining which threads have work to do and scheduling those threads for execution on the processor.
Let's assume a single processor, single core machine for now.
In this example, your application is the only process that is doing anything. Say your application has two threads of equal priority (more on this below). In this case, the OS will alternate between these two threads, scheduling one for execution and then the other until the work that they are doing is complete. To accomplish this, the OS grants a timeslice to the first scheduled thread. For example purposes, let's say the timeslice is 10 milliseconds (the actual length depends on the Windows version and configuration). So thread A will execute for 10 milliseconds. The OS will then preempt thread A so thread B can execute for its timeslice, also 10 milliseconds.
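To make the alternation concrete, here is a small simulation of it. This is only a sketch of the idea, not how Windows actually schedules threads: it is written in Python rather than .NET, the `round_robin` function and the 20 ms / 35 ms workloads are made up for illustration, and it assumes a fixed 10 ms timeslice with no other processes competing for the processor.

```python
from collections import deque

def round_robin(work_ms, timeslice_ms=10):
    """Simulate one core alternating between threads of equal priority.

    work_ms maps a (hypothetical) thread name to the CPU time in ms it needs.
    Returns the (thread, ms_run) slices in the order they were executed.
    """
    ready = deque(work_ms.items())
    timeline = []
    while ready:
        name, remaining = ready.popleft()
        ran = min(timeslice_ms, remaining)  # run for one timeslice at most
        timeline.append((name, ran))
        if remaining > ran:
            # Preempted with work left: go to the back of the ready queue.
            ready.append((name, remaining - ran))
    return timeline

timeline = round_robin({"A": 20, "B": 35})
for name, ran in timeline:
    print(f"thread {name} runs for {ran} ms")
```

With these numbers the slices come out as A, B, A, B, B, B: the two threads take turns until A has used its 20 ms, and after that B gets every slice.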
This back-and-forth will continue uninterrupted until both threads have finished their work or until certain events occur. For example, let's say that thread A finishes its work before thread B. In this case, thread A has nothing else to do, so the OS will continue to grant timeslices to thread B since it is the only one with work to do. Another thing that can happen is that thread A can wait on an event, such as a System.Threading.ManualResetEvent, or block on a read from a socket. Until that event is signaled or data is received on the socket, thread A is essentially stopped in its tracks, so the OS will continue to grant timeslices to thread B until the event is signaled or the socket data arrives. At that point, thread A is ready to run again, and the OS will resume switching between thread A and thread B for execution.
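Here is a rough sketch of the "waiting on an event" case. It uses Python's `threading.Event` as a stand-in for `ManualResetEvent` (both stay signaled until explicitly reset), and the function names `thread_a` and `thread_b` are just illustrative. Thread A blocks until thread B signals the event, so A needs no processor time in the meantime.

```python
import threading

signal = threading.Event()  # stays set until cleared, like a manual-reset event
log = []

def thread_a():
    signal.wait()           # blocks here until the event is signaled
    log.append("A resumed")

def thread_b():
    for i in range(3):
        log.append(f"B step {i}")
    signal.set()            # signal the event, which wakes thread A

a = threading.Thread(target=thread_a)
b = threading.Thread(target=thread_b)
a.start()
b.start()
a.join()
b.join()
print(log)  # ['B step 0', 'B step 1', 'B step 2', 'A resumed']
```

Thread A is started first, but its entry always comes last in the log because it cannot get past `wait()` until thread B calls `set()`.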
A good example of this is the background printing that most applications do today. An application's main thread is dedicated to processing UI events - button clicks, keyboard presses, drag-and-drop, etc. If you print a document from your favorite word processor, what happens conceptually is that the task of sending the print instructions to the printer is delegated to a secondary thread. So at this point, your application has two threads that are running - one thread servicing the UI and the other thread handling the print job. Since this is on a single processor, single core machine, the OS swaps between the two threads, granting timeslices to each. In this case, the print job thread will end after it finishes sending the print instructions, and then only your UI thread will be left.
A question you may have at this point is this:
Doesn't it take longer to print this way on a single processor, single core machine since the OS is having to swap between the print job thread and the UI thread?
And the answer is YES. It does take longer this way. But consider the alternative. If the print job were executed on the UI thread, the user interface would be unresponsive to your input, i.e., button clicks, keyboard presses, etc., until the print job was complete. And this would frustrate you as the user because the application isn't responding to your input. So, in effect, multithreading is really an illusion of parallelism, at least on a single processor, single core machine. However, you get the satisfaction of being able to interact with your application while the print job is accomplished on another thread, even though the print job takes longer doing it this way.
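The trade-off can be sketched roughly like this. It is a simplified illustration in Python, not real UI or printing code: `print_job` is a hypothetical function that just sleeps in place of sending instructions to a printer, and the loop on the main thread plays the role of the UI thread handling input.

```python
import threading
import time

events_handled = []

def print_job(pages):
    # Stand-in for sending print instructions to the printer, page by page.
    for _ in range(pages):
        time.sleep(0.01)

worker = threading.Thread(target=print_job, args=(5,))
worker.start()

# The main ("UI") thread keeps handling input while the print job runs.
while worker.is_alive():
    events_handled.append("input event")
    time.sleep(0.005)

worker.join()
print(f"handled {len(events_handled)} input events during the print job")
```

If `print_job(5)` were called directly on the main thread instead, no input events would be handled until it returned, which is exactly the unresponsive behavior described above.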
Now let's move to a multicore machine. If your process has the same two threads, A and B, to execute, then each thread can be scheduled on a separate core. In this case, both threads run simultaneously, which is true parallelism rather than the illusion of it. As long as nothing else is competing for those cores, the OS doesn't have to swap between the threads because each thread has its own core to run on. Make sense?
Finally, let's consider the priority associated with threads (assume single processor, single core again). Each thread in a given application has, by default, the same priority. What this means is that the OS will consider all threads equal with regard to scheduling. If you have two threads to be executed, they will get roughly the same amount of time on the processor. You can adjust this, however, by increasing/decreasing the priority of one thread over the other. In this case, the thread with the higher priority is favored for scheduling purposes over the thread with a lower priority, meaning that it gets more timeslices than the other thread. In some limited cases, adjusting the priority of threads can improve your application's performance, but for most applications, it is not necessary. The thing to be cautious of is to not "starve" a thread, especially the UI thread. The OS helps to prevent a thread from being starved altogether by temporarily boosting the priority of a thread that has been ready to run but has not received any processor time for a while. Even so, adjusting the priorities can make your application appear sluggish, if not altogether unresponsive, if the UI thread is "put on a diet," so to speak.
You can read more about thread priorities here and here.
I hope this helps.