views: 505
answers: 2

I'm working with PHP and need to parse a number of fairly large XML files (50-75MB uncompressed). The issue, however, is that these XML files are stored remotely and will need to be downloaded before I can parse them.

Having thought about the issue, I think using a system() call in PHP in order to initiate a cURL transfer is probably the best way to avoid timeouts and PHP memory limits.

Has anyone done anything like this before? Specifically, what should I pass to cURL to download the remote file and ensure it's saved to a local folder of my choice?
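One way to answer the question directly, assuming the `curl` binary is available on the server's PATH, is to build the command string with `-o` (write output to a named local file) and pass it to `system()`. This is only a sketch; the URL and destination path are placeholders:

```php
<?php
// Sketch: build a curl command line that saves a remote file to a
// chosen local path. Assumes the curl binary is installed on the server.
function curl_download_cmd($url, $dst) {
    // -s: silent, -L: follow redirects, -o: write the body to $dst
    return 'curl -s -L -o ' . escapeshellarg($dst) . ' ' . escapeshellarg($url);
}

// Run the transfer in a separate process, outside PHP's memory limit:
// system(curl_download_cmd('http://example.com/feed.xml', '/tmp/feed.xml'), $rc);
// $rc is 0 when curl succeeds.
```

`escapeshellarg()` matters here: without it, a URL containing shell metacharacters could break (or exploit) the command line.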

+1  A: 

You can try this:

function download($src, $dst) {
    $f = fopen($src, 'rb');
    if ($f === false) {
        return 1;
    }
    $o = fopen($dst, 'wb');
    if ($o === false) {
        fclose($f);
        return 1;
    }
    // Copy in small chunks so the whole file is never held in memory.
    while (!feof($f)) {
        if (fwrite($o, fread($f, 8192)) === false) {
            fclose($f);
            fclose($o);
            return 1;
        }
    }
    fclose($f);
    fclose($o);
    return 0;
}
if (download($url, $target) === 0) {
   # do your stuff
}
ghostdog74
This works, but is obviously subject to PHP timeouts - which is no good in this situation.
ndg
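If the download is only blocked by the execution-time limit (streaming in chunks, as above, already keeps memory use flat), one possible mitigation, assuming PHP is not running in safe mode, is to lift the limit for the duration of the transfer:

```php
<?php
// Remove the script execution-time limit for a long-running download.
// Note: this has no effect when PHP runs in safe mode.
set_time_limit(0);
```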
A: 

There is no reason why you cannot use the cURL API from inside your PHP code; you don't need to spawn a separate process. You can even use fopen(), because PHP provides an HTTP URL wrapper. In other words, fopen() can be used on remote files in exactly the same way that you use it on local files.
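A minimal sketch of this in-process approach with the cURL extension, using `CURLOPT_FILE` to stream the response straight to disk so a 50-75MB payload never sits in PHP's memory (the timeout value is an assumption you should tune):

```php
<?php
// Stream a remote file directly to a local file with the cURL extension.
// Returns true on success, false on any failure.
function curl_save($url, $dst) {
    $fp = fopen($dst, 'wb');
    if ($fp === false) {
        return false;
    }
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $fp);      // write the body straight to $fp
    curl_setopt($ch, CURLOPT_TIMEOUT, 600);   // assumed ceiling: 10 minutes
    $ok = curl_exec($ch);
    curl_close($ch);
    fclose($fp);
    return $ok !== false;
}
```

Because the body is written to the file handle as it arrives, peak memory stays at the size of cURL's internal buffer, not the size of the download; only the time limit remains a concern.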

e4c5