In many programs and man pages of Linux, I have seen code using fork(). Why do we need to use fork() and what is its purpose?

+1  A: 

fork() will create a new child process identical to the parent. So everything you run in the code after that will be run by both processes — very useful if you have for instance a server, and you want to handle multiple requests.
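
To make "run by both processes" concrete, here is a minimal sketch in C. The function name count_runs_after_fork is made up for this illustration, and the pipe is only there so the parent can count how many processes reached the line after fork().

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Returns how many processes executed the code after fork(), or -1 on error.
// count_runs_after_fork is a hypothetical name used only for this sketch.
int count_runs_after_fork(void) {
    int fds[2];
    if (pipe(fds) == -1) {
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    // From here on, the parent and the child both execute the same code.
    char mark = 'x';
    ssize_t written = write(fds[1], &mark, 1);  // runs once in each process
    (void)written;

    if (pid == 0) {
        _exit(0);  // the child has nothing else to do
    }

    // Only the parent gets here: reap the child, then count the marks.
    close(fds[1]);
    waitpid(pid, NULL, 0);

    int runs = 0;
    char c;
    while (read(fds[0], &c, 1) == 1) {
        runs++;
    }
    close(fds[0]);
    return runs;
}
```

Calling count_runs_after_fork() from main should give 2: one mark written by the parent and one by the child.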

cloudhead
Why do you create a child that is identical to the parent? What is the use?
It's just like building an army vs a single soldier. You fork so that your program can handle more requests at the same time, instead of one by one.
cloudhead
fork() returns 0 on the child and the pid of the child on the parent. The child can then use a call like exec() to replace its state with a new program. This is how programs are launched.
tgamblin
The processes are very close to identical, but there are a lot of subtle differences. The blatant differences are current PID and parent PID. There are issues related to locks held and semaphores held. The fork() manual page for POSIX lists 25 differences between the parent and the child.
Jonathan Leffler
@kar: Once you have two processes, they can continue separately from there, and one of them could replace itself (using exec()) with another program entirely.
Vatine
+2  A: 

fork() is how Unix creates new processes. At the point where you call fork(), your process is cloned, and two different processes continue execution from there. One of them, the child, will have fork() return 0. The other, the parent, will have fork() return the PID (process ID) of the child.

For example, if you type the following in a shell, the shell program will call fork(), and then execute the command you passed (telnetd, in this case) in the child, while the parent will display the prompt again, as well as a message indicating the PID of the background process.

$ telnetd &
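
What the shell does for a command ending in & can be sketched roughly as follows. This is a simplified illustration, not how any particular shell is implemented: run_in_background and wait_for_job are hypothetical helper names, and the "[1]" job number is hard-coded.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Start argv[0] as a background job, the way a shell handles "command &".
// Returns the child's PID without waiting for it, or -1 if fork() failed.
pid_t run_in_background(char *const argv[]) {
    pid_t pid = fork();
    if (pid == -1) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        // Child: become the requested program. execvp only returns on error.
        execvp(argv[0], argv);
        perror("execvp");
        _exit(127);
    }
    // Parent: report the job and go straight back to the prompt.
    printf("[1] %ld\n", (long)pid);
    return pid;
}

// Later (for example on "wait" or "fg"), the shell collects the job's status.
// Returns the exit code, or -1 if the job did not exit normally.
int wait_for_job(pid_t pid) {
    int status;
    if (waitpid(pid, &status, 0) == -1) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The important part is that the parent returns right after fork() instead of calling waitpid() immediately, which is what lets the prompt come back while the child keeps running.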

As for the reason you create new processes, that's how your operating system can do many things at the same time. It's why you can run a program and, while it is running, switch to another window and do something else.

Daniel
+26  A: 

fork() is how you create new processes in Unix. When you call fork, you're creating a copy of your own process that has its own address space. This allows multiple tasks to run independently of one another as though they each had the full memory of the machine to themselves.
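
A small sketch of what "its own address space" means in practice, assuming a POSIX system. The function name parent_value_after_child_write is invented for this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Returns the parent's copy of `value` after the child has overwritten
// its own copy, or -1 if fork() failed.
int parent_value_after_child_write(void) {
    int value = 1;

    pid_t pid = fork();
    if (pid == -1) {
        return -1;
    }
    if (pid == 0) {
        // Child: this changes only the child's copy of the variable.
        value = 99;
        _exit(value == 99 ? 0 : 1);
    }

    // Parent: wait until the child has definitely done its write.
    waitpid(pid, NULL, 0);
    return value;  // still 1 in the parent
}
```

The child sees 99, but the parent still gets 1 back, because after fork() each process works on its own copy of the memory.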

Here are some example usages of fork:

  1. Your shell uses fork to run the programs you invoke from the command line.
  2. Web servers like Apache use fork to create multiple server processes, each of which handles requests in its own address space. If one dies or leaks memory, the others are unaffected, so it functions as a mechanism for fault tolerance.
  3. Google Chrome uses fork to handle each page within a separate process. This will prevent client-side code on one page from bringing your whole browser down.
  4. fork is used to spawn processes in some parallel programs (like those written using MPI). Note this is different from using threads, which don't have their own address space and exist within a process.
  5. Scripting languages use fork indirectly to start child processes. For example, every time you use a command like subprocess.Popen in Python, you fork a child process and read its output. This enables programs to work together.
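
To illustrate the last item, here is a rough C sketch of the "start a child and read its output" pattern. It is not Python's actual implementation, which may differ between versions and platforms; capture_output is a hypothetical helper built from pipe(), fork(), dup2() and execvp().

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Run argv[0] with its stdout connected to a pipe and copy what it prints
// into buf (always NUL-terminated). Returns the byte count, or -1 on error.
ssize_t capture_output(char *const argv[], char *buf, size_t cap) {
    int fds[2];
    if (cap == 0 || pipe(fds) == -1) {
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        // Child: make the pipe's write end our stdout, then exec the program.
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
        execvp(argv[0], argv);
        _exit(127);  // only reached if execvp failed
    }

    // Parent: read until the child closes its end of the pipe (EOF)
    // or the buffer is full.
    close(fds[1]);
    size_t total = 0;
    ssize_t n;
    while (total < cap - 1 &&
           (n = read(fds[0], buf + total, cap - 1 - total)) > 0) {
        total += (size_t)n;
    }
    close(fds[0]);
    buf[total] = '\0';

    waitpid(pid, NULL, 0);  // reap the child so it does not stay a zombie
    return (ssize_t)total;
}
```

For example, passing the argument vector {"echo", "hello", NULL} should leave "hello" followed by a newline in buf.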

Typical usage of fork in a shell might look something like this:

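#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t child_process_id = fork();
    if (child_process_id == -1) {
        // Fork returns -1 on failure; no child process was created.
        perror("fork");
        return EXIT_FAILURE;
    } else if (child_process_id > 0) {
        // Fork returns the child's pid in the parent process.  Parent executes this.

        // wait for the child process to complete
        int status;
        waitpid(child_process_id, &status, 0);

        // child process finished!
    } else {
        // Fork returns 0 in the child process.  Child executes this.

        // new argv array for the child process (argv[0] is the program name by convention)
        char *argv[] = {"program", "arg1", "arg2", "arg3", NULL};

        // now start executing some other program; execv returns only on failure
        execv("/path/to/a/program", argv);
        perror("execv");
        _exit(127);
    }
    return EXIT_SUCCESS;
}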

The shell spawns a child process using fork, replaces the child's program image using exec, and waits for the child to complete before continuing with its own execution. Note that you don't have to use fork this way. You can always spawn off lots of child processes, as a parallel program might do, and each might run a program concurrently. Basically, any time you're creating new processes in a Unix system, you're using fork(). For the Windows equivalent, take a look at CreateProcess.
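
As a sketch of the "lots of child processes" case, the following forks several workers, lets them run side by side, and then collects their results. Everything here is illustrative: sum_of_squares_parallel and MAX_WORKERS are made-up names, and passing a result back through the exit status only works for small values (0 to 255).

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_WORKERS 15  // keeps every result (i * i) within an 8-bit exit status

// Fork n workers that each compute i * i and report it as their exit status.
// Returns the sum collected by the parent, or -1 on error.
int sum_of_squares_parallel(int n) {
    pid_t pids[MAX_WORKERS];
    if (n < 0 || n > MAX_WORKERS) {
        return -1;
    }

    int started = 0;
    for (int i = 1; i <= n; i++) {
        pid_t pid = fork();
        if (pid == -1) {
            break;  // stop launching, but still reap what was started
        }
        if (pid == 0) {
            _exit(i * i);  // child: do the "work" and finish
        }
        pids[started++] = pid;  // parent: remember the child and keep going
    }

    // Every worker has been launched by now; collect their results.
    int sum = 0;
    for (int k = 0; k < started; k++) {
        int status;
        if (waitpid(pids[k], &status, 0) == pids[k] && WIFEXITED(status)) {
            sum += WEXITSTATUS(status);
        }
    }
    return started == n ? sum : -1;
}
```

With n = 4 the workers report 1, 4, 9 and 16, so the parent should end up with 30. A real parallel program would normally return results through pipes, files or shared memory instead of exit codes.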

If you want more examples and a longer explanation, Wikipedia has a decent summary. And here are some slides on how processes, threads, and concurrency work in modern operating systems.

tgamblin
Bullet 5: 'often'? Only 'often'? Which ones don't use it, or under what circumstances is fork() not used - on systems that support fork(), that is.
Jonathan Leffler
"often" is basically me being cautious because I don't know off the top of my head what the equivalent of fork() is on Windows. Obviously they need to use some sort of process creation, I just don't know offhand what it's called outside of Unix.
tgamblin
Strangely enough, it's called CreateProcess() - those crazy Windows guys :-)
paxdiablo
Thanks -- edited for more precise wording :-).
tgamblin
never realized up-till now that "shell uses fork to run the programs you invoke from the command line"!
Lazer
+4  A: 

fork() is used to create a child process. When fork() is called, a new process is spawned, and the call returns a different value in the child and in the parent.

If the return value is 0, you know you're the child process, and if the return value is a positive number (which happens to be the child's process ID), you know you're the parent. (If it's a negative number, the fork failed and no child process was created.)

http://www.yolinux.com/TUTORIALS/ForkExecProcesses.html

Wadih M.
Unless the return value is -1, in which case the fork() failed.
Jonathan Leffler
Jonathan, I've updated my answer to mention that case too.
Wadih M.
A: 

fork() is used to spawn a child process. Typically it's used in similar sorts of situations as threading, but there are differences. Unlike threads, fork() creates whole separate processes. The child and the parent are direct copies of each other at the point that fork() is called, but they are completely separate: neither can access the other's memory space (without going to the normal troubles you go to in order to access another program's memory).
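
One example of those "normal troubles": if parent and child really do need to see the same memory, it has to be set up explicitly. The sketch below uses an anonymous shared mapping, assuming a system such as Linux where MAP_ANONYMOUS is available; parent_value_via_shared_memory is a name invented for this example.

```c
#define _DEFAULT_SOURCE  // assumption: glibc, where this exposes MAP_ANONYMOUS
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Returns the value the parent sees after the child writes to a page that
// was explicitly mapped as shared, or -1 on error.
int parent_value_via_shared_memory(void) {
    // MAP_SHARED keeps this one page common to both processes after fork().
    int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED) {
        return -1;
    }
    *shared = 1;

    pid_t pid = fork();
    if (pid == -1) {
        munmap(shared, sizeof *shared);
        return -1;
    }
    if (pid == 0) {
        *shared = 99;  // child: write to the shared page
        _exit(0);
    }

    waitpid(pid, NULL, 0);  // parent: wait for the child's write to be done
    int seen = *shared;
    munmap(shared, sizeof *shared);
    return seen;
}
```

Here the parent should see 99, but only because the page was mapped as shared before the fork. Ordinary variables stay private to each process.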

fork() is still used by some server applications, mostly ones that run as root on a *NIX machine and drop permissions before processing user requests. There are still some other use cases, but mostly people have moved to multithreading now.

Matthew Scharley
I don't understand the perception that "most people" have moved to multithreading. Processes are here to stay, and so are threads. No one has "moved on" from either. In parallel programming, the largest and most concurrent codes are distributed-memory multi-process programs (e.g. MapReduce and MPI). Still, most people would opt for OpenMP or some shared-memory paradigm for a multicore machine, and GPUs are using threads these days, but there is lots beyond that. I bet, though, that more coders on this site encounter process parallelism on the server side than anything multithreaded.
tgamblin
+1  A: 

Multiprocessing is central to computing. For example, your IE or Firefox can create a process to download a file for you while you are still browsing the internet. Or, while you are printing out a document in a word processor, you can still look at different pages and still do some editing with it.

動靜能量
+1  A: 

fork() is basically used to create a child process of the process in which you call it. Whenever you call fork(), it returns zero in the child process and the child's process ID in the parent.

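pid_t pid = fork();
if (pid == 0) {
    // this is the child process
} else if (pid > 0) {
    // this is the parent process; pid holds the child's process ID
} else {
    // fork failed; no child process was created
}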

In this way you can provide different actions for the parent and the child and make use of multiprocessing (the child is a separate process, not a thread).

Nave
A: 

You probably don't need to use fork in day-to-day programming if you are writing applications.

Even if you do want your program to start another program to do some task, there are other, simpler interfaces which use fork behind the scenes, such as "system" in C and Perl.

For example, if you wanted your application to launch another program such as bc to do some calculation for you, you might use 'system' to run it. System does a 'fork' to create a new process, then an 'exec' to turn that process into bc. Once bc completes, system returns control to your program.
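
The fork, exec, wait sequence described above can be sketched as a stripped-down stand-in for system(). This is only an approximation: run_command is a hypothetical name, and the real system() also takes care of signal handling (SIGINT, SIGQUIT and SIGCHLD), which is left out here.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Simplified stand-in for system(): run `command` through /bin/sh and wait.
// Returns the wait status (decode with WIFEXITED/WEXITSTATUS), or -1 on error.
int run_command(const char *command) {
    pid_t pid = fork();
    if (pid == -1) {
        return -1;
    }
    if (pid == 0) {
        // Child: turn into a shell that runs the command.
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127);  // only reached if execl failed
    }

    // Parent: block until the child is done, retrying if a signal interrupts.
    int status;
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR) {
            return -1;
        }
    }
    return status;  // control returns to the caller, as with system()
}
```

For instance, run_command("exit 3") should return a status for which WIFEXITED is true and WEXITSTATUS is 3. In application code the library's own system() is the better choice.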

You can also run other programs asynchronously, but I can't remember how.

If you are writing servers, shells, viruses or operating systems, you are more likely to want to use fork.

Alex Brown