views:

314

answers:

6

I've read about fork and from what I understand, the process is cloned but which process? The script itself or the process that launched the script?

For example:

I'm running rTorrent on my machine and when a torrent completes, I have a script run against it. This script fetches data from the web, so it takes a few seconds to complete. During this time, my rTorrent process is frozen. So I made the script fork using the following:

my $pid = fork();
if ($pid == 0) { blah blah blah; exit 0; }

If I run this script from the CLI, it comes back to the shell within a second while it runs in the background, exactly as I intended. However, when I run it from rTorrent, it seems to be even slower than before. So what exactly was forked? Did the rtorrent process clone itself and my script ran in that, or did my script clone itself? I hope this makes sense.

+6  A: 

The fork() function returns TWICE! Once in the parent process, and once in the child process. In general, both processes are IDENTICAL in every way, as if EACH one had just returned from fork(). The only difference is that in one, the return value from fork() is 0, and in the other it is non-zero (the PID of the child process).
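A minimal sketch of the "returns twice" behaviour, assuming a Unix-like system. The sub name `fork_demo` is made up for illustration, and the parent waits here only to keep the demo tidy (a script that wants to return immediately would not):

```perl
use strict;
use warnings;

$| = 1;   # unbuffered STDOUT, so pending output is not duplicated into the child

# Fork once and print which side of the fork each process is on.
# Returns the child's PID (in the parent only; the child exits inside).
sub fork_demo {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;   # undef means no child was created

    if ($pid == 0) {
        # Child: fork() returned 0 here.
        print "child:  pid $$, fork() returned 0\n";
        exit 0;
    }

    # Parent: fork() returned the child's PID here.
    print "parent: pid $$, fork() returned $pid\n";
    waitpid($pid, 0);   # reap the child so it does not linger as a zombie
    return $pid;
}

my $child_pid = fork_demo();
```

Both `print` lines run, one in each process, and the two lines report different PIDs.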

So whatever process was running your Perl script (if it is an embedded Perl interpreter inside rTorrent then rTorrent would be the process) would be duplicated at exactly the point that the fork() happened.

Adam Batkin
I don't think this is really addressing his question...
jdizzle
@jdizzle - Probably because the question doesn't make much sense, because `somebody` doesn't understand the process and forking ideas. Explaining some facts might help :)
viraptor
@viraptor - I feel somebody has a good enough grasp of fork()ing. The question is really about rTorrent's implementation.
jdizzle
The question asks "which one is cloned" and the answer is "whatever process the `fork()` runs in" (plus some extra explanation of how `fork()` works which should help understanding of why it all happens that way)
Adam Batkin
+3  A: 

The entire process containing the interpreter forks. Fortunately memory is copy-on-write so it doesn't need to copy all the process memory in order to fork. However, things such as file descriptors remain open. This allows child processes to handle them, but may cause issues if they aren't closed appropriately. In general, fork() should not be used in an embedded interpreter except under extreme duress.

Ignacio Vazquez-Abrams
Meh. It's not like this is the end of the world to fork() in perl on an end-user's machine. I agree that it is probably bad practice to use often (as it's a ripe point for a bottleneck).
jdizzle
If it's a bad practice, is there an alternative method to prevent blocking?
somebody
+2  A: 

My advice would be "don't do that".

If the Perl interpreter is embedded within the rtorrent process, you've almost certainly forked an entire rtorrent process, the effects of which are probably ill-defined at best. It's generally a bad idea to play with process-level stuff in an embedded interpreter regardless of language.

There's an excellent chance that some sort of lock is not being properly released, or that threads within the processes are proceeding in unintended and possibly competing ways.

Nicholas Knight
How common is it to actually link against the perl interpreter? Wouldn't it be much more practical (and safe) to system() these kinds of calls?
jdizzle
True that calling `fork()` in a multi-threaded program is asking for trouble. If you restrict what happens in the child process, though, it's not so bad. For example, call nothing that needs to acquire a user-mode lock. But the typical usage of following `fork()` with `dup2()`, `close()`, `execve()` etc. should be safe.
asveikau
@jdizzle: Linking against an external interpreter, Perl or otherwise, is very common, but it's not entirely clear from the question whether that is the case or not with this program. On re-reading though, you may be right that system() is being used.
Nicholas Knight
It's not an embedded interpreter. I'm making it launch an external script, so it can be perl, python, whatever.
somebody
+2  A: 

I believe I found the problem by looking through rTorrent's source. For some processes, it will read all of the output sent to stdout before continuing. If this is happening to your process, rTorrent will block until the stdout pipe is closed. Because you're forking, your child process shares the same stdout as the parent. Your parent process will exit, but the pipe remains open (because your child process is still running). If you did an strace of rTorrent, I'd bet that it'd be blocked on this read() call while executing your command.

Try closing/redirecting stdout in your perl script before the fork().
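Something along these lines should do it. This is only a sketch: `run_detached` is a name I made up, the `sleep` stands in for your real work, and I'm assuming only stdout is being captured (if stderr is captured too, redirect `STDERR` the same way):

```perl
use strict;
use warnings;

# Run $work in a forked child after pointing STDOUT away from the caller's pipe.
# Returns the child's PID in the parent.
sub run_detached {
    my ($work) = @_;

    # Redirect *before* forking, so neither process keeps the caller's
    # pipe open once this point is passed.
    open(STDOUT, '>', '/dev/null') or die "cannot redirect STDOUT: $!";

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        $work->();   # child: the slow part runs here
        exit 0;
    }
    return $pid;     # parent: returns immediately and is free to exit
}

# Placeholder for the real work (e.g. the web fetch).
my $child = run_detached(sub { sleep 2 });
```

With stdout redirected, the process reading the pipe sees end-of-file as soon as the parent exits, even though the child is still running.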

jdizzle
Solves the problem, but doesn't answer the nominal question. I'd like that.
darch
@darch - the title of the question is actually relevant to the problem somebody's trying to solve
jdizzle
+2  A: 

To answer the nominal question, since you commented that the accepted answer fails to do so, fork affects the process in which it is called. In your example of rTorrent spawning a Perl process which then calls fork, it is the Perl process which is duplicated, since it was the Perl process which called fork.

In the general case, there is no way for a process to fork any process other than itself. If it were possible to tell another arbitrary process to go fork itself, that would open up no end of security and performance issues.

Dave Sherohman
Besides opening up the possibility for lots of jokes: "hey you! go fork yourself!" "no, fork you!"
Ether
+1  A: 

When we create a process using fork, the child process gets a copy of the parent's address space, so the child can use that memory too. It can also access the files that were opened by the parent. The parent keeps control over the child: to get the child's complete exit status, it can use wait.
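As a rough illustration of collecting the child's status in Perl, assuming a Unix-like system (`child_status` is a hypothetical helper name, and the exit code 3 is arbitrary):

```perl
use strict;
use warnings;

$| = 1;   # avoid duplicating buffered output into the child

# Fork a child that exits with $exit_code, then collect its status in the parent.
sub child_status {
    my ($exit_code) = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    exit $exit_code if $pid == 0;   # child: exit straight away

    waitpid($pid, 0);               # parent: block until that child exits
    return {
        exit_code => $? >> 8,       # high byte: the value passed to exit
        signal    => $? & 127,      # low 7 bits: signal that ended it, if any
    };
}

my $status = child_status(3);
print "exit code $status->{exit_code}, signal $status->{signal}\n";
```

`waitpid` fills in `$?`, which packs the exit code and any terminating signal into one number, hence the shift and mask.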

karthi_ms