views: 66
answers: 3

I'm writing a Perl CGI script right now, but it's becoming a resource hog and keeps getting killed by my web host because I keep hitting my process memory limit. I was wondering if there is a way to split the script into multiple scripts and have the first script call the next one and then exit, so the entire script isn't in memory at once. I saw there is an Exporter module, but I don't know how to use it yet as I'm just learning Perl, and I don't think it will solve my memory problem, but I might be wrong.

A: 

Yes, you can start another Perl script from a Perl script and then exit the calling script:

http://perldoc.perl.org/functions/fork.html

Example Code:

#!/usr/bin/perl
use strict;
use warnings;

my $pid = fork();
if (not defined $pid) {
    # fork failed (e.g. no resources / process limit reached)
    print "resources not available.\n";
} elsif ($pid == 0) {
    # child process
    print "IM THE CHILD\n";
    sleep 5;
    print "IM THE CHILD2\n";
    exit(0);
} else {
    # parent process: wait for the child to finish
    print "IM THE PARENT\n";
    waitpid($pid, 0);
}
print "HIYA\n";

But this won't work if you want the second script to be able to use CGI to communicate with your web server/user. If you are running the Perl script as CGI, it has to return the result to the user.

So you have two ways of dealing with this problem:

  • Try to find out why you are using so much memory and improve the script.

  • If there really is no way to reduce memory consumption, you can use a daemonized Perl script as a worker process that does the calculations and returns the results to your CGI script, which waits for the result before terminating (see the sketch below).
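
As a rough sketch of that second option (not from the original post): the CGI script could drop a request file into a spool directory and poll for the worker's result, while a separate long-running worker script loops over that directory and writes answers back. The /tmp/jobs path, the file naming, and the 30-second timeout below are all made up for illustration:

#!/usr/bin/perl
# Sketch of the CGI side of a hypothetical spool-directory hand-off.
# The worker daemon (not shown) is a separate persistent script that
# watches $spool, does the heavy computation, and writes *.out files.
use strict;
use warnings;

my $spool  = '/tmp/jobs';                 # hypothetical spool directory
my $job_id = $$ . '-' . time();

# Hand the heavy work to the worker daemon.
open my $req, '>', "$spool/$job_id.req" or die "cannot write request: $!";
print {$req} "input for the worker\n";
close $req;

# Poll for the worker's answer, with a crude timeout.
my $result_file = "$spool/$job_id.out";
for (1 .. 30) {
    last if -e $result_file;
    sleep 1;
}

print "Content-type: text/plain\n\n";
if (open my $res, '<', $result_file) {
    print while <$res>;                   # stream the result to the browser
    close $res;
} else {
    print "worker did not answer in time\n";
}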

Erik
With `fork()` you have now loaded two copies of the program (including all the data the program loaded before the fork). How is that going to reduce the memory footprint of the script?
mobrule
Actually it does work, because the child inherits open file descriptors from the parent, so when the child writes to STDOUT, it goes to the same STDOUT the parent had, i.e. the pipe the web server has opened with the CGI script.
miedwar
fork() creates a self-running, independent process ( http://en.wikipedia.org/wiki/Fork_%28operating_system%29 ). You can start different scripts with fork, so each part is smaller than the complete program. You then have to manage the exchange of data between all the processes - no one said that this is simple :)
Erik
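
To make the file-descriptor point from the comments above concrete, here is a minimal, hypothetical CGI that forks and lets both processes write to the same STDOUT; the output lines are invented for illustration:

#!/usr/bin/perl
# After fork() the child shares the STDOUT the web server gave the CGI
# process, so its output reaches the browser too.
use strict;
use warnings;

$| = 1;                              # flush as we go, so the header is not
                                     # duplicated from a shared stdio buffer
print "Content-type: text/plain\n\n";

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: inherited file descriptor, same pipe to the web server.
    print "line written by the child\n";
    exit 0;
}

waitpid($pid, 0);                    # let the child finish first
print "line written by the parent\n";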
+2  A: 

See Watching long processes through CGI.

On the other hand, just managing memory better might also solve your problem. For example, if you are reading entire files into memory at once, try writing the script so that it handles data line by line or in fixed-size chunks. Declare your variables in the smallest possible scope.
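
For instance, a streaming read might look like the sketch below; the file name and the ERROR-counting are hypothetical stand-ins for whatever the real script does:

#!/usr/bin/perl
# Reading line by line keeps one line in memory at a time instead of
# slurping the whole file.
use strict;
use warnings;

my $count = 0;
open my $fh, '<', 'big_input.log' or die "cannot open big_input.log: $!";
while (my $line = <$fh>) {           # holds one line at a time
    $count++ if $line =~ /ERROR/;
}
close $fh;
print "matched $count lines\n";

# The memory-hungry pattern to avoid:
#   my @all_lines = <$fh>;           # loads the entire file at once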

Try to identify what part of your script is creating the largest memory footprint and post the relevant excerpt in a separate question for more memory management suggestions.

Sinan Ünür
A: 

If applicable, do the computation/generation offline.

Create a daemon or a scheduled job that builds a static version of the results; the daemon can generate a new version on events (e.g. files being modified) or at set intervals.
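
A minimal sketch of such a scheduled generator, with hypothetical paths and content, meant to be run from cron (e.g. */15 * * * * /path/to/build_report.pl):

#!/usr/bin/perl
# Writes a static page offline so the CGI (or the web server directly)
# only has to serve a file.
use strict;
use warnings;

my $tmp    = '/var/www/html/report.html.tmp';
my $target = '/var/www/html/report.html';

open my $out, '>', $tmp or die "cannot write $tmp: $!";
print {$out} "<html><body><h1>Report</h1><p>generated at ",
             scalar localtime, "</p></body></html>\n";
close $out;

# Rename so the web server never serves a half-written file.
rename $tmp, $target or die "rename failed: $!";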

If the page depends on client input, look into separating the logic so that at least parts of the application can be cached.

Side note: unless plain CGI really suits your needs, I'd move away from it altogether and look into mod_perl or FastCGI, where you have persistent Perl processes handling requests, which saves the overhead of forking a new Perl interpreter, loading modules, etc.
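
For example, under FastCGI a script typically becomes a request loop; the sketch below assumes the CGI::Fast module is installed and the web server is configured to talk FastCGI to the script:

#!/usr/bin/perl
# Expensive setup (use statements, config loading) happens once per process;
# the loop then serves many requests from the same interpreter.
use strict;
use warnings;
use CGI::Fast;

while (my $q = CGI::Fast->new) {
    print $q->header('text/plain');
    print "served by persistent process $$\n";
}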

miedwar