I'm using a simple unzip function (shown below) so I don't have to decompress my files manually before they are processed further.

function uncompress($srcName, $dstName) {
    // gzfile() reads and decompresses the whole file into an array of lines,
    // which implode() then joins into one big string held in memory.
    $string = implode("", gzfile($srcName));
    $fp = fopen($dstName, "w");
    fwrite($fp, $string, strlen($string));
    fclose($fp);
}

The problem is that if the gzip file is large (e.g. 50 MB), unzipping it takes a large amount of RAM.

The question: can I parse a gzipped file in chunks and still get the correct result? Or is there another, better way to handle extracting large gzip files (even if it takes a few seconds more)?

A: 

Try with:

function uncompress($srcName, $dstName) {
    $fp = fopen($dstName, "w");
    fwrite($fp, implode("", gzfile($srcName)));
    fclose($fp);
}

The $length parameter is optional.

andres descalzo
It seems as if this approach does the same as the original one and uses just as much memory: the whole file is still read and held in memory.
Luke
The data file is not loaded into a variable (it works similarly to streaming); it is not an object model where the whole string gets loaded as an object. This example does not affect "php_value memory_limit"; your example is the one that runs into that variable from the "php.ini" file.
andres descalzo
+1  A: 

Hi,

If you are on a Linux host, have the required privileges to run commands, and the gzip command is installed, you could try calling it with something like shell_exec.

Something a bit like this, I guess, would do:

shell_exec('gzip -d your_file.gz');

This way, the file wouldn't be unzipped by PHP.


As a sidenote:

  • Take care where the command is run from (or use a switch to say "decompress to that directory")
  • You might want to take a look at escapeshellarg too ;-) (see the sketch below)
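
A minimal sketch combining those points (the paths are placeholders, and gzip is assumed to be installed and on the PATH):

    // Minimal sketch, not the answer's exact code: paths are placeholders.
    $src = '/path/to/your_file.gz';
    $dstDir = '/path/to/extract/here';

    // -d decompresses, -c writes to stdout so we control where the output goes
    // and the original .gz file is left in place.
    $cmd = 'gzip -dc ' . escapeshellarg($src)
         . ' > ' . escapeshellarg($dstDir . '/your_file');
    shell_exec($cmd);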
Pascal MARTIN
Thank you, I do have shell access, but have yet to learn how to use it.
Luke
+2  A: 

gzfile() is a convenience function that calls gzopen(), gzread(), and gzclose() for you.

So, yes, you can call gzopen() manually and gzread() the file in chunks.

This will uncompress the file in 4kB chunks:

function uncompress($srcName, $dstName) {
    $sfp = gzopen($srcName, "rb");
    $fp = fopen($dstName, "w");

    // Read the compressed file 4kB at a time, so only one chunk
    // is ever held in memory.
    while ($string = gzread($sfp, 4096)) {
        fwrite($fp, $string, strlen($string));
    }
    gzclose($sfp);
    fclose($fp);
}
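
For reference, calling it then looks like this (the file names are just placeholders):

    // Hypothetical paths, only to show the call.
    uncompress('uploads/data.gz', 'uploads/data.txt');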
R. Bemrose
Sweet! Tested on a 1 MB gzip file that extracts to 48 MB. Before: process time 12.1447s, peak memory use 96512kB. Your solution: process time 0.6705s, peak memory use 256kB. Thank you :)
Luke
You may get better performance by tweaking the number at the end of the gzread call. I haven't tried it though.
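
A rough sketch of that idea might look like this (the extra parameter, its name, and the 32kB default are illustrative guesses, not benchmarked values):

    // Hypothetical variation of the answer's function with the chunk size
    // exposed as a parameter. Larger chunks mean fewer gzread()/fwrite() calls
    // at the cost of slightly more memory per iteration.
    function uncompressChunked($srcName, $dstName, $chunkSize = 32768) {
        $sfp = gzopen($srcName, "rb");
        $fp = fopen($dstName, "w");

        while ($string = gzread($sfp, $chunkSize)) {
            fwrite($fp, $string, strlen($string));
        }
        gzclose($sfp);
        fclose($fp);
    }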
R. Bemrose
20 times better is good enough, and will remain good enough for a very long time. I would have to be very desperate or using huge files to try and tweak this thing further :)
Luke