This is a shared hosting environment. I control the server, but not necessarily the content. I've got a client with a Perl script that seems to run out of control every now and then and suck down 50% of the processor until the process is killed.

With ASP scripts, I'm able to restrict the amount of time the script can run, and IIS will simply shut it down after, say, 90 seconds. This doesn't work for Perl scripts, since they run as CGI processes (IIS actually launches an external process to execute the script).

Similarly, techniques that look for excess resource consumption in a worker process will likely not catch this, since the resource being consumed (the processor) is being chewed up by a child process rather than the worker process itself.

Is there a way to make IIS abort a Perl script (or other CGI-type process) that's running too long? If so, how?

+1  A: 

On a UNIX-style system, I would install a signal handler that traps SIGALRM, then use the alarm function to start a timer before any action I expected might time out. If the action completes, I'd call alarm(0) to turn off the alarm and exit normally; otherwise the signal handler picks it up and closes everything down gracefully.

I haven't worked with Perl on Windows in a while, and although Windows is somewhat POSIXy, I can't guarantee this will work; you'll have to check the Perl documentation to see whether, and to what extent, signals are supported on your platform.

More detailed information on signal handling and this sort of self-destruct programming using alarm() can be found in the Perl Cookbook. Here's a brief example lifted from another post and modified a little:

eval {
    # Install the SIGALRM handler locally so it falls out of scope
    # outside the eval block.
    local $SIG{ALRM} = sub {
        print "Print this if we time out, then die.\n";
        die "alarm\n";
    };

    # Set the alarm, take your chance running the routine, and turn off
    # the alarm if it completes in time.
    alarm(90);
    routine_that_might_take_a_while();
    alarm(0);
};
if ($@) {
    die $@ unless $@ eq "alarm\n";  # propagate any unexpected error
    # Timed out: clean up and exit gracefully here.
}
arclight
A: 

Googling for "iis cpu limit" gives these hits:

http://www.microsoft.com/technet/prodtechnol/WindowsServer2003/Library/IIS/38fb0130-b14b-48d5-a0a2-05ca131cf4f2.mspx?mfr=true

"The CPU monitoring feature monitors and automatically shuts down worker processes that consume large amounts of CPU time. CPU monitoring is enabled for individual application pools."

http://technet.microsoft.com/en-us/library/cc728189.aspx

"By using CPU monitoring, you can monitor worker processes for CPU usage and optionally shut down the worker processes that consume large amounts of CPU time. CPU monitoring is only available in worker process isolation mode."

piCookie
+1  A: 

The ASP script timeout applies to all scripting languages. If the script is running in an ASP page, the script timeout will close the offending page.
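
If the Perl code can be run under ASP via ActiveState's PerlScript engine rather than as plain CGI (an assumption on my part; the routine name below is made up), the timeout can even be set from within the page itself using the usual Win32::OLE-style property syntax. A minimal sketch:

<%@ Language=PerlScript %>
<%
    # Allow this page at most 90 seconds of execution time; IIS
    # terminates the page with a script timeout error after that.
    $Server->{ScriptTimeout} = 90;

    generate_calendar_page();    # hypothetical long-running routine
%>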

jwmiller5
+1  A: 

An update on this one...

It turns out that this particular script is apparently a little buggy, and that Googlebot has the uncanny ability to "press its buttons" and drive it crazy. The script is an older, commercial calendaring application. It displays "next month" and "previous month" links, and if you follow the "next month" link too many times, you fall off a cliff. The resulting page, however, still includes a "next month" link, so Googlebot would beat the script to death continuously and chew up the processor.

Curiously, adding a robots.txt with Disallow: / didn't solve the problem. Either Googlebot had already gotten hold of the script and wouldn't let go, or it was simply disregarding the robots.txt.
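
(For reference, the exact file isn't quoted above; a blanket block is normally the standard two-line form:

User-agent: *
Disallow: /
)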

Anyway, Microsoft's Process Explorer (http://technet.microsoft.com/en-us/sysinternals/bb896653.aspx) was a huge help: it let me inspect the environment of the perl.exe process in detail, and from that I was able to determine that Googlebot was causing my problems.

Once I knew that (and determined that robots.txt wouldn't solve the problem), I was able to use IIS directly to block all traffic to this site from *.googlebot.com, which worked well in this case, since we don't care if Google indexes this content.

Thanks much for the other ideas that everyone posted!

Eric Longman