Why are threads called lightweight processes?

Process creation is "expensive" because the operating system has to set up a complete new virtual memory space for the process, with its own address space. "Expensive" here means that it takes a lot of CPU time.

Threads don't need to do this; creating one just changes a few pointers around, so it's much "cheaper" than creating a process. Threads can skip that setup because they run in the address space and virtual memory of the parent process.
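A minimal Python sketch of that shared address space (the names `shared`, `bump` and `run_in_thread` are made up for illustration): a thread started by the process writes to an object that the main thread then reads, with no copying and no inter-process communication.

```python
import threading

# Hypothetical shared state; it lives in the single address space of the process.
shared = {"counter": 0}

def bump():
    # Runs in the new thread, but touches the same object the main thread sees.
    shared["counter"] += 1

def run_in_thread():
    t = threading.Thread(target=bump)
    t.start()
    t.join()  # wait for the thread to finish before reading the value
    return shared["counter"]
```

A separate process would work on its own copy of `shared`, so the parent would not see the change without some explicit form of IPC.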

Every process must have at least one thread. So if you think about it, creating a process means creating the process AND creating a thread. Obviously, creating only a thread takes less time and less work for the computer.

In addition, threads are "lightweight" because they can interact without inter-process communication, which is more expensive than communication between threads in the same address space. Switching between threads is also "cheaper" than switching between processes (again, just moving some pointers around).

The claim that threads are "lightweight" is - depending on the platform - not necessarily reliable.

An operating system thread has to support the execution of native code, e.g. written in C. So it has to provide a decent-sized stack, usually measured in megabytes. So if you started 1000 threads (perhaps in an attempt to support 1000 simultaneous connections to your server), then at 1 MB of stack per thread you would have a memory requirement of about 1 GB in your process before you even start to do any real work.
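The arithmetic behind that figure, as a small Python sketch. The 1 MiB stack size is an assumption for illustration; the real default varies by platform and can usually be configured. `stack_reservation_bytes` is a made-up helper name.

```python
MIB = 1024 * 1024

def stack_reservation_bytes(num_threads, stack_bytes_per_thread=MIB):
    # Total stack space set aside if every thread gets its own stack.
    # The 1 MiB default is an assumed value, not a platform constant.
    return num_threads * stack_bytes_per_thread

total = stack_reservation_bytes(1000)
print(f"{total / MIB:.0f} MiB")  # prints "1000 MiB", i.e. roughly 1 GB
```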

This is a real problem in highly scalable servers, so they don't use threads as if they were lightweight at all. They treat them as heavyweight resources. They might instead create a limited number of threads in a pool, and let them take work items from a queue.
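A pool like that can be sketched in Python as follows. This is only one way to do it, not how any particular server does it; `run_pool`, `handler` and the `None` sentinel are hypothetical choices, and the sketch assumes `None` is never a real work item.

```python
import queue
import threading

def run_pool(work_items, handler, num_workers=4):
    """Process work_items with a fixed number of long-lived worker threads."""
    tasks = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            item = tasks.get()
            if item is None:  # sentinel: no more work for this worker
                break
            value = handler(item)
            with results_lock:  # results is shared, so guard the append
                results.append(value)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for item in work_items:
        tasks.put(item)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker so each one exits
    for t in threads:
        t.join()
    return results  # completion order, not input order
```

For example, `run_pool(range(1000), handle_connection, num_workers=8)` would serve 1000 work items with only 8 threads, where `handle_connection` stands in for whatever per-item work the server does.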

As this means that the threads are long-lived and few in number, it might be better to use processes instead. That way you get address space isolation, and with so few of them, running out of resources isn't really an issue.

In summary: be wary of "marketing" claims made on behalf of threads. Parallel processing is great (increasingly it's going to be essential), but threads are only one way of achieving it.

It's not lightweight if you have a multi-core system that needs to synchronise data between the L1 caches for multiple threads - in fact that's anything but lightweight.

zebrabox 2010-02-15 22:19:56

@zebrabox: No doubt true. As ever such terms become irrelevant over time. Most of my work is on single core embedded systems with RTOS kernels, so such complexities have seldom occurred - everything's a thread - even when protected by an MMU (contradicting what I said earlier!).

Clifford 2010-02-16 09:22:18

@Clifford. Yep I know where you're coming from. It's obviously much less of an issue on a single core machine - especially if you have access to a couple of hardware threads :)

zebrabox 2010-02-16 13:46:26

+1 Spot on - couldn't have put it better myself

zebrabox 2010-02-15 17:49:47

I love the summary. I tend to prefer using multiple single-threaded processes, and I *hate* it when multi-threaded fan-boyers accuse me of failing to take advantage of easy multi-core parallelism because they don't grok the idea of multiple processes.

William Pursell 2010-02-15 18:43:37

Thanks. There are many reasons why processes can sometimes be better than threads. One that surprised me was that you might get a free performance boost! The reason is that separate processes have separate C memory heaps. In a thread-safe standard library, all the `malloc`/`free` calls have to be synchronised, which could mean a ton of locking depending on your use of memory. I've seen an app using threads stall at 4 cores, but when switched to separate processes it would scale to 8 (and probably more; I didn't have more cores to test on). The less memory you share, the less you have to sync.

Daniel Earwicker 2010-02-15 20:45:09

@earwicker - indeed. I parallelised a very data-intensive task not by multi-threading but by kicking off multiple processes, each acting on a subset of independent data.

zebrabox 2010-02-15 22:16:59

The term is "light weight process", not merely "light weight". The point is that threads are lighter in resource usage than a process is, not that threads are "light weight" in and of themselves. I guarantee you, on every platform that supports real threads, a thread will use fewer resources than a process. Every time, because every process requires at least one thread. Thus 1000 threads will use fewer resources than 1000 processes.

Mystere Man 2010-02-16 19:26:18

@Mystere Man - It's true, N threads will use less memory, etc. than N processes, but there are other kinds of resources, such as wall (real) time. A multi-process design may spend less time waiting on locks to use the C heap, and so may run faster.

Daniel Earwicker 2010-02-16 19:35:32

@Earwicker - Perhaps, but then IPC between processes will eat up more resources as well. It all depends on the problem being solved.

Mystere Man 2010-02-17 03:42:56

Exactly - this is why my answer is simply that neither is absolutely more lightweight than the other in all circumstances. It depends.

Daniel Earwicker 2010-02-17 06:23:22
