A thread is "lightweight" because most of the overhead has already been accomplished through the creation of its process. I found this in a tutorial.

Can somebody elaborate on what exactly it means?

+6  A: 

Process creation is "expensive" because the operating system has to set up a completely new virtual address space for the process. "Expensive" here means it takes a lot of CPU time.

Creating a thread doesn't require this; the system only has to change a few pointers around, so it is much "cheaper" than creating a process. Threads don't need their own setup because they run in the address space and virtual memory of the parent process.

Every process must have at least one thread. So if you think about it, creating a process means creating the process AND creating a thread. Obviously, creating only a thread takes less time and less work for the computer.
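To see the difference on your own machine, here is a rough timing sketch. It assumes a POSIX system (it uses `os.fork`), and the function names are made up for illustration. Treat the numbers as indicative only: modern Unix kernels use copy-on-write for `fork`, and Python adds its own overhead on both sides.

```python
import os
import threading
import time


def time_thread_creation(count):
    """Create, start and join `count` threads that do nothing; return seconds."""
    start = time.perf_counter()
    for _ in range(count):
        t = threading.Thread(target=lambda: None)
        t.start()
        t.join()
    return time.perf_counter() - start


def time_process_creation(count):
    """Fork and reap `count` child processes that exit at once; return seconds."""
    start = time.perf_counter()
    for _ in range(count):
        pid = os.fork()
        if pid == 0:
            # Child process: leave immediately without running cleanup handlers.
            os._exit(0)
        # Parent process: wait for the child so it does not linger as a zombie.
        os.waitpid(pid, 0)
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 200
    if hasattr(os, "fork"):
        print("processes:", time_process_creation(n))
    print("threads:  ", time_thread_creation(n))
```

The child calls `os._exit(0)` so that it does nothing except exist and terminate, which keeps the measurement as close as possible to pure creation cost.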

In addition, threads are "lightweight" because they can interact without the need for inter-process communication. Switching between threads is "cheaper" than switching between processes (again, just moving some pointers around), and inter-process communication is more expensive than communication between threads.

Mystere Man
The answer here is basically the question. Processes are heavy and threads are not. http://tinyurl.com/ycg5h7b
Matt Ball
+6  A: 

Threads within a process share the same virtual memory space, but each has a separate stack, and possibly "thread-local storage" if implemented. They are lightweight because a context switch is simply a case of switching the stack pointer and program counter and restoring other registers, whereas a process context switch involves switching the MMU context as well.

Moreover, communication between threads within a process is lightweight because they share an address space.
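As a minimal sketch of what sharing an address space buys you, the threads below "communicate" simply by writing into the same dictionary. The names `worker` and `run_workers` and the squaring task are hypothetical, chosen only for illustration.

```python
import threading


def worker(worker_id, results, lock):
    # Hypothetical unit of work: square the worker id.
    value = worker_id * worker_id
    # Storing into the shared dictionary is the entire communication step.
    # The lock only keeps concurrent updates well-defined.
    with lock:
        results[worker_id] = value


def run_workers(count):
    results = {}  # one object, visible to every thread in the process
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(i, results, lock))
        for i in range(count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_workers(4))
```

Separate processes would need a pipe, socket, or explicitly shared memory segment to exchange the same values.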

Clifford
It's not lightweight if you have a multi-core system that needs to synchronise data between the L1 caches for multiple threads - in fact that's anything but lightweight.
zebrabox
@zebrabox: No doubt true. As ever such terms become irrelevant over time. Most of my work is on single core embedded systems with RTOS kernels, so such complexities have seldom occurred - everything's a thread - even when protected by an MMU (contradicting what I said earlier!).
Clifford
@Clifford. Yep I know where you're coming from. It's obviously much less of an issue on a single core machine - especially if you have access to a couple of hardware threads :)
zebrabox
+7  A: 

The claim that threads are "lightweight" is - depending on the platform - not necessarily reliable.

An operating system thread has to support the execution of native code, e.g. written in C. So it has to provide a decent-sized stack, usually measured in megabytes. So if you started 1000 threads (perhaps in an attempt to support 1000 simultaneous connections to your server), then at 1 MB of stack per thread you would have a memory requirement of about 1 GB in your process before you even start to do any real work.
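The arithmetic can be sketched as follows, assuming 1 MiB of stack per thread (the real default varies by platform, and on many systems the stack is reserved address space rather than memory touched right away). The helper names are hypothetical; `threading.stack_size` is the standard-library call for requesting a different stack size for threads created afterwards.

```python
import threading

MIB = 1024 * 1024


def reserved_stack_bytes(thread_count, stack_bytes_per_thread):
    """Total stack space set aside for `thread_count` threads."""
    return thread_count * stack_bytes_per_thread


def run_with_small_stacks(thread_count, stack_bytes):
    """Start `thread_count` trivial threads with a reduced stack size.

    Returns how many threads actually ran.
    """
    ran = []
    # stack_size() returns the previous setting (0 means "platform default").
    # It can raise ValueError or RuntimeError if the size is not acceptable.
    previous = threading.stack_size(stack_bytes)
    try:
        threads = [
            threading.Thread(target=lambda: ran.append(1))
            for _ in range(thread_count)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    finally:
        # Restore the earlier setting for any threads created later.
        threading.stack_size(previous)
    return len(ran)


if __name__ == "__main__":
    total = reserved_stack_bytes(1000, MIB)
    print(total / (1024 * MIB), "GiB for 1000 threads at 1 MiB each")
    print(run_with_small_stacks(8, 256 * 1024), "threads ran with 256 KiB stacks")
```

Shrinking the stack is only safe if the code running on those threads never recurses deeply or puts large buffers on the stack.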

This is a real problem in highly scalable servers, so they don't use threads as if they were lightweight at all. They treat them as heavyweight resources. They might instead create a limited number of threads in a pool, and let them take work items from a queue.
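A minimal sketch of that pool-and-queue arrangement, with a hypothetical `handle` function standing in for the real work on each item:

```python
import queue
import threading

_STOP = object()  # sentinel telling a worker to exit


def handle(item):
    # Hypothetical work item handler: double the input.
    return item * 2


def run_pool(items, worker_count=4):
    work = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        # Each long-lived thread keeps pulling items until it sees the sentinel.
        while True:
            item = work.get()
            if item is _STOP:
                break
            out = handle(item)
            with results_lock:
                results.append(out)

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for item in items:
        work.put(item)
    # One sentinel per worker, queued after all the real work items.
    for _ in threads:
        work.put(_STOP)
    for t in threads:
        t.join()
    # Completion order is not deterministic, so sort before returning.
    return sorted(results)


if __name__ == "__main__":
    print(run_pool(range(10), worker_count=3))
```

The number of threads stays fixed at `worker_count` no matter how many items arrive. Python's standard library packages the same idea as `concurrent.futures.ThreadPoolExecutor`.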

As this means that the threads are long-lived and small in number, it might be better to use processes instead. That way you get address space isolation and there isn't really an issue with running out of resources.

In summary: be wary of "marketing" claims made on behalf of threads. Parallel processing is great (increasingly it's going to be essential), but threads are only one way of achieving it.

Daniel Earwicker
+1 Spot on - couldn't have put it better myself
zebrabox
I love the summary. I tend to prefer using multiple single-threaded processes, and I *hate* it when multi-threaded fan-boyers accuse me of failing to take advantage of easy multi-core parallelism because they don't grok the idea of multiple processes.
William Pursell
Thanks. There are many reasons why processes can sometimes be better than threads. One that surprised me was that you might get a free performance boost! The reason is that separate processes have separate C memory heaps. In a thread-safe standard library, all the `malloc`/`free` calls have to be synchronised, which could mean a ton of locking depending on your use of memory. I've seen an app using threads stall at 4 cores, but when switched to separate processes it would scale to 8 (and probably more, I didn't have more cores to test on). The less memory you share, the less you have to sync.
Daniel Earwicker
@earwicker - indeed. I parallelised a very data-intensive task not by multi-threading but by kicking off multiple processes, each acting on a subset of non-dependent data.
zebrabox
The term is "lightweight process", not merely "lightweight". The point is that threads are lighter in resource usage than a process is, not that threads are "lightweight" in and of themselves. I guarantee you, on every platform that supports real threads, a thread will use fewer resources than a process. Every time, because every process requires at least one thread. Thus 1000 threads will use fewer resources than 1000 processes.
Mystere Man
@Mystere Man - It's true, N threads will use less memory, etc. than N processes, but there are other kinds of resource, such as wall (real) time. A multi-process design may spend less time waiting in locks to use the C heap, and so may run faster.
Daniel Earwicker
@Earwicker - Perhaps, but then IPC between processes will eat up more resources as well. It all depends on the problem being solved.
Mystere Man
Exactly - this is why my answer is simply that neither is absolutely more lightweight than the other in all circumstances. It depends.
Daniel Earwicker