Hi
I'm wondering if anyone can help me out with a little cron issue I am experiencing.
The problem is that the load can spike up to 5 and CPU usage can jump to 40% on a dual-core 'Xeon L5410 @ 2.33GHz' with 356MB RAM, and I'm not sure where or how I should tweak the code to prevent that. Code sample below:
//Note: $productFile can be 40MB .gz compressed, 700MB uncompressed (XML text file)
if (file_exists($productFile)) {
    $fResponse = gzopen($productFile, "r");
    if ($fResponse) {
        $new_page = "";
        while (!gzeof($fResponse)) {
            // Read roughly $chunkSize bytes into the buffer per pass
            $sResponse = "";
            $chunkSize = 10000;
            while (!gzeof($fResponse) && (strlen($sResponse) < $chunkSize)) {
                $sResponse .= gzgets($fResponse, 4096);
            }
            $new_page .= $sResponse;
            $sResponse = "";
            $thisOffset = 0;
            unset($matches);
            if (strlen($new_page) > 0) {
                // Empty the buffer if it can't contain a product yet
                if (!(strstr($new_page, "<product "))) {
                    $new_page = "";
                }
                // Pull each complete <product>...</product> block out of the buffer
                while (preg_match("/<product [^>]*>.*<\/product>/Uis", $new_page, $matches, PREG_OFFSET_CAPTURE, $thisOffset)) {
                    $thisOffset = $matches[0][1];
                    $thisLength = strlen($matches[0][0]);
                    $thisOffset = $thisOffset + $thisLength;
                    // Keep only the unprocessed tail for the next pass
                    $new_page = substr($new_page, $thisOffset - 1);
                    $thisOffset = 0;
                    $new_page_match = $matches[0][0];
                    //- Save collected data here -//
                }
            }
        } // End chunk-reading loop
        gzclose($fResponse);
    }
}
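One alternative I've been looking at is streaming the feed through XMLReader instead of regex-matching buffered chunks, so the 700MB document is never held in memory. A rough, untested sketch (assuming the XMLReader extension and the compress.zlib:// stream wrapper are available on the box):

$reader = new XMLReader();
if ($reader->open('compress.zlib://' . $productFile)) {
    while ($reader->read()) {
        // Stop on each <product ...> start tag
        if ($reader->nodeType == XMLReader::ELEMENT && $reader->localName == 'product') {
            // Grab the complete <product>...</product> block as a string
            $new_page_match = $reader->readOuterXml();
            //- Save collected data here -//
        }
    }
    $reader->close();
}

Would that be gentler on the CPU than the chunk-and-regex approach above, or does XMLReader have overheads of its own on files this size?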
$chunkSize - should it be as small as possible to keep memory usage down and make the regular expression cheaper, or should it be larger so the script doesn't take too long to run?
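To compare settings, I was thinking of just instrumenting a test run, something like this (a sketch; microtime() and memory_get_peak_usage() are standard, but the log format is made up):

$start = microtime(true);
// ... run the import loop above with the chosen $chunkSize ...
error_log(sprintf(
    "chunkSize=%d time=%.1fs peak_mem=%.1fMB",
    $chunkSize,
    microtime(true) - $start,
    memory_get_peak_usage(true) / 1048576
));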
With 40,000 matches the load/CPU spikes. So does anyone have any advice on how to manage large feed uploads like this via cron?
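I could also lower the job's priority and pause between chunks at the process level, along these lines (a sketch; proc_nice() needs Unix, and the 100ms pause is just a guess):

// At the top of the cron script: drop CPU priority so interactive work wins
proc_nice(19);

// Inside the outer read loop, after each chunk is processed:
usleep(100000); // sleep 100ms so the import doesn't monopolise a core

Would that just stretch the spike out over a longer run, or actually keep the load down?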
Thanks in advance for your help