Hi guys,

Various Perl scripts (Server Side Includes) on a website are calling a Perl module with many functions. EDIT: The scripts use use lib to reference the libraries from a folder. During busy periods the scripts (not the libraries) become zombies and overload the server.
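For illustration only (the real folder path and module name are different), each script pulls the shared module in roughly like this:

use strict;
use warnings;
use lib '/path/to/our/modules';   # hypothetical folder holding the shared libraries
use SiteFunctions;                # hypothetical name for the module of shared functions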

The server lists:

319 ?        Z      0:00 [scriptname1.pl] <defunct>    
320 ?        Z      0:00 [scriptname2.pl] <defunct>    
321 ?        Z      0:00 [scriptname3.pl] <defunct>

I have hundreds of instances of each.

EDIT: We are not using fork, system or exec, apart from the SSI directive

<!--#exec cgi="/cgi-bin/scriptname.pl"-->

As far as I know, in this case httpd itself will be the owner of the process. MaxRequestsPerChild is set to 0, which should not let the parent processes die before their child processes have finished.

So far we have found that temporarily suspending some of the scripts helps the server cope with the defunct processes and keeps it from falling over; however, zombie processes are undoubtedly still forming. gbacon seems to be closest to the truth with his theory that the server is simply unable to cope with the load.

What could lead to httpd abandoning these processes? Is there any best practice to prevent these from happening?

Thanks

Answer: The point goes to Rob. As he says, CGI scripts that generate SSI's will not have those SSI's handled. The evaluation of SSI's happens before the running of CGI's in the Apache 1.3 request cycle. This was fixed with Apache 2.0 and later so that CGI's can generate SSI commands.

Since we were running on Apache 1.3, every page view turned the SSI's into defunct processes. Although the server was trying to clear them, it was far too busy with its running tasks to succeed. As a result, the server fell over and became unresponsive. As a short-term solution we reviewed all SSI's and moved some of the processing to the client side to free up server resources and give it time to clean up. Later we upgraded to Apache 2.2.

+7  A: 

More Band-Aid than best practice, but sometimes you can get away with simple

$SIG{CHLD} = "IGNORE";

According to the perlipc documentation

On most Unix platforms, the CHLD (sometimes also known as CLD) signal has special behavior with respect to a value of 'IGNORE'. Setting $SIG{CHLD} to 'IGNORE' on such a platform has the effect of not creating zombie processes when the parent process fails to wait() on its child processes (i.e., child processes are automatically reaped). Calling wait() with $SIG{CHLD} set to 'IGNORE' usually returns -1 on such platforms.

If you care about the exit statuses of child processes, you need to collect them (commonly referred to as "reaping") by calling wait or waitpid. Despite the creepy name, a zombie is merely a child process that has exited but whose status has not yet been reaped.
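If the parent is your own code, a common idiom is a CHLD handler that reaps in a non-blocking loop. This is only a sketch (the handler name is arbitrary), but it shows the shape of it:

use POSIX ":sys_wait_h";            # provides WNOHANG

sub reap_children {                 # arbitrary name for the handler
    # Collect every child that has exited, without blocking.
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        my $status = $? >> 8;       # exit status of the reaped child
        warn "reaped child $pid with status $status\n";
    }
}
$SIG{CHLD} = \&reap_children;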

If your Perl programs themselves are the child processes becoming zombies, that means their parents (the ones that are forking-and-forgetting your code) need to clean up after themselves. A process cannot stop itself from becoming a zombie.

Greg Bacon
Thanks very much G. I'm not saying I understand how it works, but I will read more about it. I assume CHLD goes into the calling script. Is that right?
G Berdal
waitpid(-1, WNOHANG) won't block, so it can be called periodically to collect child exit status. Use a loop like this to reap all your zombies: while (($pid = waitpid(-1, WNOHANG)) > 0) ...
Ken Fox
@G Berdal How are your scripts being started? Do you control that code?
Greg Bacon
@gbacon They are server side includes on pages of a website. Most of them call a library for functions. The funny thing is that the SSI scripts are the ones becoming zombies according to the logs.
G Berdal
Which web server are you running? (I assume apache, which I would expect to reap children's exit statuses correctly!) Please provide representative samples of the log messages you're seeing related to zombies. You might want to edit your question to include that information. The log messages will be much more readable there than in a comment.
Greg Bacon
I'm using apache 1.3. - I have added the logs.
G Berdal
You wrote that this happens under heavy load. When traffic backs off, does apache finally reap the zombies, or do they hang around until you restart apache? If the former, the machine could be too busy serving requests to get around to reaping. The latter is likely a bug somewhere out of your control. Either way, the kind folks over at Server Fault will probably be more help to you, and we could move your question there if you'd like.
Greg Bacon
+1 That is a very good question. I will ask the Administrator to check that out for me. I think it is a bit premature to say that it is solely a server issue. I wish it was, then I could simply pass it on to the Administrator. :)
G Berdal
A: 

As you have all the bits yourself, I'd suggest running the individual scripts one at a time from the command line to see if you can spot the ones that are hanging.

Does a ps listing show an inordinate number of instances of one particular script running?

Are you running the CGI's using mod_perl?

Edit: Just saw your comments regarding SSI's. Don't forget that SSI directives can run Perl scripts themselves. Have a look to see what the CGI's are trying to run.

Are they dependent on yet another server or service?

Rob Wells
I have spotted the ones that are hanging. Nearly all of the SSI's on the webpage become multiple zombie instances. They are calling a library for functions. What I am not sure about is who counts as the parent for these.
G Berdal
The process which called the SSI or CGI is its parent. You could try using `ps` to look up the ppid (parent process id) and then seeing what that process is, but I'm not positive offhand whether `ps` will return a ppid for zombies. (Seems like it should, since the zombie has to know who's supposed to reap it, I just haven't verified that it does work.) For me, the output of `ps -l` includes ppid; check your local man page if your `ps` behaves differently.
Dave Sherohman
@Dave, when a process is a zombie it has no ppid by definition.
Rob Wells
@Dave, my bad. An orphan process has no ppid by definition. A zombie process has exited but has not yet been reaped.
Rob Wells
+2  A: 

I just saw your comment that you are running Apache 1.3 and that may be associated with your problem.

SSI's can run CGI's. But CGI scripts that generate SSI's will not have those SSI's handled. The evaluation of SSI's happens before the running of CGI's in the Apache 1.3 request cycle. This was fixed with Apache 2.0 and later so that CGI's can generate SSI commands.

As I'd suggested above, try running your scripts on their own and have a look at the output. Are they generating SSI's?

Edit: Have you tried launching a trivial Perl CGI script to simply printout a Hello World type HTTP response?
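Something along these lines is enough (only a sketch; adjust the interpreter path for your system and make the file executable):

#!/usr/bin/perl
use strict;
use warnings;

# Minimal CGI: print a valid header, then a trivial body.
print "Content-type: text/html\n\n";
print "<html><body>Hello World</body></html>\n";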

Then if this works add a trivial SSI directive such as

<!--#printenv -->

and see what happens.

Edit 2: Just realised what is probably happening. Zombies occur when a child process exits and isn't reaped. These processes are hanging around and slowly using up resources within the process table. A process without a parent is an orphaned process.

Are you forking off processes within your Perl script? If so, have you added a waitpid() call to the parent?
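If so, a pattern like the following in the parent keeps the children from being left unreaped (just a sketch; do_child_work() stands in for whatever the child really does):

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: do the work and exit explicitly.
    do_child_work();   # hypothetical placeholder for the child's task
    exit 0;
}

# Parent: wait for the child so it never lingers as a zombie.
waitpid($pid, 0);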

Have you also got the correct exit within the script?

CORE::exit(0);
Rob Wells
I ran all scripts through the debugger and eliminated all errors and warnings. They are generating output properly. We were about to upgrade to 2.0 anyway. Do you think that would help?
G Berdal
Ok. Good work with the debugger to eliminate all errors and warnings. Are any of your Perl CGIs running successfully to completion at all?
Rob Wells
As I said they are running just fine. Apart from the fact that they become zombies they are running perfectly.
G Berdal
@George, they are not running fine if you are getting zombie processes. These processes will hang around and slowly consume the resources of your server.
Rob Wells
@Rob, could using POSIX::_exit(0); help? According to http://perldoc.perl.org/functions/exit.html that avoids END routines and destruction processing. I'm thinking gbacon might be right and the server simply doesn't have time to do garbage collection during busy periods...
G Berdal