I want to retrieve the first 10k bytes from a URL with curl (using PHP in my case). Is there a way to specify this? I thought CURLOPT_BUFFERSIZE would do this, but it just appears to determine the size of a buffer that is reused until all of the content is retrieved.

+2  A: 

This is how I do it in C++:

int offset = 0;
int size = 10*1024;

char range[256];
struct curl_slist *pHeaders = NULL;
snprintf(range, sizeof(range), "Range: bytes=%d-%d", offset, offset+size-1);

pHeaders = curl_slist_append(pHeaders, range);
curl_easy_setopt(pCurlHandle, CURLOPT_HTTPHEADER, pHeaders);

// libcurl does not copy the list, so run the transfer before freeing it
curl_easy_perform(pCurlHandle);

curl_slist_free_all(pHeaders);
pHeaders = NULL;

Edit: Just found out you meant PHP. I'll see if I can find out how to port it.

I think this should work in PHP:

$offset = 0;
$size = 10*1024;

$a = $offset;
$b = $offset + $size-1;

curl_setopt($curlHandle, CURLOPT_HTTPHEADER, array("Range: bytes=$a-$b"));
Lodle
Aha! A Range: header. Of course! Thanks so much.
Doug Kaye
Unfortunately testing shows that some servers ignore Range: and return the whole object. I'm surprised there isn't a way to tell curl to actually stop receiving after some max size.
Doug Kaye
It is only supported in HTTP 1.1, so HTTP 1.0 servers will ignore it (but they really should upgrade to 1.1!)
Lodle
Another hacky method is to use the curl write callback (not sure if this is available in PHP) and stop the transfer yourself after x bytes have been downloaded.
Lodle
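A minimal sketch of that callback idea in C++, for illustration only. The names CappedBuffer and capped_write are made up for this example. The function has the signature libcurl expects for CURLOPT_WRITEFUNCTION, and it relies on the documented behaviour that returning a byte count different from size * nmemb makes libcurl abort the transfer (curl_easy_perform then returns CURLE_WRITE_ERROR).

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper state: collects at most `limit` bytes of the body.
struct CappedBuffer {
    std::string data;
    std::size_t limit;
};

// Same signature as a libcurl write callback.
// Stores as much of the chunk as still fits under the cap and returns the
// number of bytes it kept. Once a chunk does not fit completely, the return
// value is smaller than size * nmemb, which tells libcurl to stop.
extern "C" std::size_t capped_write(char *ptr, std::size_t size,
                                    std::size_t nmemb, void *userdata) {
    CappedBuffer *buf = static_cast<CappedBuffer *>(userdata);
    std::size_t incoming = size * nmemb;
    std::size_t room = buf->limit - buf->data.size();
    std::size_t take = std::min(incoming, room);
    buf->data.append(ptr, take);
    return take;
}

// Possible wiring (assumes an initialised CURL *handle):
//   CappedBuffer buf{std::string(), 10 * 1024};
//   curl_easy_setopt(handle, CURLOPT_WRITEFUNCTION, capped_write);
//   curl_easy_setopt(handle, CURLOPT_WRITEDATA, &buf);
//   curl_easy_perform(handle);  // CURLE_WRITE_ERROR here means the cap was hit
```

With this approach the cap holds even when the server ignores the Range: header, although a little more than the cap may already have crossed the network before the transfer is cut off.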
As far as I can see, there is a CURLOPT_RANGE option.
Milen A. Radev
A: 

CURLOPT_RANGE doesn't appear to work in PHP, although it's there. At least it had no effect when I tried it, and a Google search turns up many reports of the same.

I think it's because the "Range: bytes" header isn't honored by all servers out there.
Doug Kaye
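For reference, libcurl documents CURLOPT_RANGE as taking the range as a string of the form "X-Y", without the "bytes=" prefix used in the raw header. A small C++ sketch of building that string follows. The helper name make_range_spec is invented for this example. A server that honours the range answers with status 206, while one that ignores it answers 200 with the full body.

```cpp
#include <string>

// Hypothetical helper: builds the "X-Y" string that CURLOPT_RANGE expects
// for `size` bytes starting at `offset` (both ends inclusive).
std::string make_range_spec(long offset, long size) {
    return std::to_string(offset) + "-" + std::to_string(offset + size - 1);
}

// Possible wiring (assumes an initialised CURL *handle):
//   std::string spec = make_range_spec(0, 10 * 1024);   // "0-10239"
//   curl_easy_setopt(handle, CURLOPT_RANGE, spec.c_str());
//   curl_easy_perform(handle);
//   long status = 0;
//   curl_easy_getinfo(handle, CURLINFO_RESPONSE_CODE, &status);
//   // status == 206: the range was honoured; 200: the server sent everything
```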
A: 

If you use fread instead of curl, although I prefer curl, you can specify the size of the data you want to receive, for example:

$fp = @fopen($url, "r") ;

$data = "" ;
if ($fp) {
    while (!feof($fp)) {
        $data .= fread($fp, $size);
    }
    fclose($fp);
}

where $size is the number of bytes to read on each pass through the loop.

Khriz
A: 

CURLOPT_RANGE does not work for me either. I ran it with verbose output and found that it sends a "Content-Range:" header instead of "Range:", and hence the whole page is returned. I guess Content-Range: is a new header that many servers don't support.

Vicky