What is the technical difference between a process and a thread? I get the feeling a word like 'process' is over used and there is also hardware and software threads. How about light-weight processes in languages like Erlang? Is there a definitive reason to use one term over the other?
Both processes and threads are independent sequences of execution. The typical difference is that threads (of the same process) run in a shared memory space, while processes run in separate memory spaces.
I'm not sure what "hardware" vs "software" threads might be referring to. Threads are an operating environment feature, rather than a CPU feature (though the CPU typically has operations that make threads efficient).
Erlang uses the term "process" because it does not expose a shared-memory multiprogramming model. Calling them "threads" would imply that they have shared memory.
An application consists of one or more processes. A process, in the simplest terms, is an executing program. One or more threads run in the context of the process. A thread is the basic unit to which the operating system allocates processor time. A thread can execute any part of the process code, including parts currently being executed by another thread. A fiber is a unit of execution that must be manually scheduled by the application. Fibers run in the context of the threads that schedule them.
Stolen from here.
Both threads and processes are atomic units of OS resource allocation (i.e. there is a concurrency model describing how CPU time is divided between them, and the model of owning other OS resources). There is a difference in:
- Shared resources (threads are sharing memory by definition, they do not own anything except stack and local variables; processes could also share memory, but there is a separate mechanism for that, maintained by OS)
- Allocation space (kernel space for processes vs. user space for threads)
Greg Hewgill above was correct about the Erlang meaning of the word "process", and here there's a discussion of why Erlang could do processes lightweight.
A process is a collection of code, memory, data and other resources. A thread is a sequence of code that is executed within the scope of the process. You can (usually) have multiple threads executing concurrently within the same process.
Process
Each process provides the resources needed to execute a program. A process has a virtual address space, executable code, open handles to system objects, a security context, a unique process identifier, environment variables, a priority class, minimum and maximum working set sizes, and at least one thread of execution. Each process is started with a single thread, often called the primary thread, but can create additional threads from any of its threads.
Thread
A thread is the entity within a process that can be scheduled for execution. All threads of a process share its virtual address space and system resources. In addition, each thread maintains exception handlers, a scheduling priority, thread local storage, a unique thread identifier, and a set of structures the system will use to save the thread context until it is scheduled. The thread context includes the thread's set of machine registers, the kernel stack, a thread environment block, and a user stack in the address space of the thread's process. Threads can also have their own security context, which can be used for impersonating clients.
Found this on MSDN here: http://msdn.microsoft.com/en-us/library/ms681917(VS.85).aspx
Microsoft Windows supports preemptive multitasking, which creates the effect of simultaneous execution of multiple threads from multiple processes. On a multiprocessor computer, the system can simultaneously execute as many threads as there are processors on the computer.
Process:
- An executing instance of a program is called a process.
- Some operating systems use the term ‘task‘ to refer to a program that is being executed.
- A process is always stored in the main memory also termed as the primary memory or random access memory.
- Therefore, a process is termed as an active entity. It disappears if the machine is rebooted.
- Several process may be associated with a same program.
- On a multiprocessor system, multiple processes can be executed in parallel.
- On a uni-processor system, though true parallelism is not achieved, a process scheduling algorithm is applied and the processor is scheduled to execute each process one at a time yielding an illusion of concurrency.
- Example: Executing multiple instances of the ‘Calculator’ program. Each of the instances are termed as a process.
Thread:
- A thread is a subset of the process.
- It is termed as a ‘lightweight process’, since it is similar to a real process but executes within the context of a process and shares the same resources allotted to the process by the kernel (See kquest.co.cc/2010/03/operating-system for more info on the term ‘kernel’).
- Usually, a process has only one thread of control – one set of machine instructions executing at a time.
- A process may also be made up of multiple threads of execution that execute instructions concurrently.
- Multiple threads of control can exploit the true parallelism possible on multiprocessor systems.
- On a uni-processor system, a thread scheduling algorithm is applied and the processor is scheduled to run each thread one at a time.
- All the threads running within a process share the same address space, file descriptor, stack and other process related attributes.
- Since the threads of a process share the same memory, synchronizing the access to the shared data withing the process gains unprecedented importance.
I borrowed the above info from the Knowledge Quest! blog available at: http://kquest.co.cc/2010/03/program-process-task-thread/
Thanks.
For a simpler explanation, I liked: Difference between a thread and process