How can I read a file from a server starting at some offset (similar to the behavior of wget -c)? What headers must I send to the server? What features must the server support?

+3  A: 

From the wget manual (http://www.gnu.org/software/wget/manual/wget.html):

Note that ‘-c’ only works with ftp servers and with http servers that support the Range header.

From RFC 2616 (http://tools.ietf.org/html/rfc2616):

Examples of byte-ranges-specifier values (assuming an entity-body of
length 10000):

  - The first 500 bytes (byte offsets 0-499, inclusive):
    bytes=0-499

  - The second 500 bytes (byte offsets 500-999, inclusive):
    bytes=500-999

  - The final 500 bytes (byte offsets 9500-9999, inclusive):
    bytes=-500

  - Or bytes=9500-

  - The first and last bytes only (bytes 0 and 9999):  bytes=0-0,-1

  - Several legal but not canonical specifications of the second 500
    bytes (byte offsets 500-999, inclusive):
    bytes=500-600,601-999
    bytes=500-700,601-999
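The RFC examples above can be checked mechanically. Here is a minimal sketch that resolves a byte-ranges-specifier to inclusive offsets for an entity of known length. The name `resolve_byte_ranges` is hypothetical, and the function only handles the `bytes` unit and skips ranges that cannot be satisfied; it is an illustration, not a full implementation of the RFC grammar.

```python
def resolve_byte_ranges(specifier, length):
    """Resolve e.g. 'bytes=0-0,-1' to a list of inclusive (first, last) offsets."""
    unit, _, spec = specifier.partition("=")
    if unit.strip() != "bytes":
        raise ValueError("only the 'bytes' range unit is handled here")
    resolved = []
    for part in spec.split(","):
        first, dash, last = part.strip().partition("-")
        if not dash:
            raise ValueError("malformed range: %r" % part)
        if first == "":
            # Suffix form "-N": the final N bytes of the entity.
            count = int(last)
            start, end = max(length - count, 0), length - 1
            if count == 0:
                continue
        else:
            start = int(first)
            # Open-ended form "N-" runs to the last byte of the entity.
            end = int(last) if last else length - 1
            end = min(end, length - 1)
        # Skip ranges that start past the end of the entity.
        if start <= end:
            resolved.append((start, end))
    return resolved
```

For an entity of length 10000, both `bytes=-500` and `bytes=9500-` resolve to the same pair of offsets, 9500 and 9999.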

So, to resume a download from byte offset 9500, you should send

Range: bytes=9500-
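As an illustration, here is a minimal sketch of building such a request with Python's standard `urllib.request`. The helper name `build_resume_request` and the example URL are assumptions made for this sketch; the request is only constructed here, not sent.

```python
import urllib.request


def build_resume_request(url, offset):
    """Build a GET request asking for everything from `offset` to the end."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    # Open-ended range: from `offset` through the last byte of the entity.
    return urllib.request.Request(url, headers={"Range": "bytes=%d-" % offset})


# Hypothetical URL, used only to show the resulting header.
request = build_resume_request("http://example.com/file.bin", 9500)
print(request.get_header("Range"))
```

Passing the request to `urllib.request.urlopen` would send it. A 206 status on the response means the range was honored, while a 200 means the server sent the whole entity from the start.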

To test whether a server supports it, you can check the Accept-Ranges response header. The RFC says:

Origin servers that accept byte-range requests MAY send

Accept-Ranges: bytes

but are not required to do so. Clients MAY generate byte-range requests without having received this header for the resource involved. Range units are defined in section 3.12.

Servers that do not accept any kind of range request for a resource MAY send

Accept-Ranges: none

to advise the client not to attempt a range request.
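Because the header is optional, a check on it has three outcomes rather than two. A minimal sketch, assuming the response headers are available as a plain mapping of names to values (the function name `advertised_range_support` is hypothetical):

```python
def advertised_range_support(headers):
    """Return True, False, or None (not advertised) from Accept-Ranges."""
    value = None
    for name, header_value in headers.items():
        # Header names are case-insensitive.
        if name.lower() == "accept-ranges":
            value = header_value
    if value is None:
        # The server said nothing; a range request may still work.
        return None
    units = [unit.strip().lower() for unit in value.split(",")]
    if "bytes" in units:
        return True
    if units == ["none"]:
        return False
    return None
```

A `None` result means the only way to find out is to send a range request and look at the response status.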

Xavier Combelle
+4  A: 

You should use the Range header in the request. But you may use it only if the server informs you that it accepts range requests via the Accept-Ranges response header.

This is an example session. Suppose we are interested in getting a part of this picture. First, we send an HTTP HEAD request to determine: a) whether the server supports byte ranges, b) the content length:

> HEAD /2238/2758537173_670161cac7_b.jpg HTTP/1.1
> Host: farm3.static.flickr.com
> Accept: */*
> 
< HTTP/1.1 200 OK
< Date: Thu, 08 Jul 2010 12:22:12 GMT
< Content-Type: image/jpeg
< Connection: keep-alive
< Server: Apache/2.0.52 (Red Hat)
< Expires: Mon, 28 Jul 2014 23:30:00 GMT
< Last-Modified: Wed, 13 Aug 2008 06:13:54 GMT
< Accept-Ranges: bytes
< Content-Length: 350015

Next, we send a GET request with the Range header asking for the first 11 bytes of the picture:

> GET /2238/2758537173_670161cac7_b.jpg HTTP/1.1
> Host: farm3.static.flickr.com
> Accept: */*
> Range: bytes=0-10
> 
< HTTP/1.1 206 Partial Content
< Date: Thu, 08 Jul 2010 12:26:54 GMT
< Content-Type: image/jpeg
< Connection: keep-alive
< Server: Apache/2.0.52 (Red Hat)
< Expires: Mon, 28 Jul 2014 23:30:00 GMT
< Last-Modified: Wed, 13 Aug 2008 06:13:54 GMT
< Accept-Ranges: bytes
< Content-Range: bytes 0-10/350015
< Content-Length: 11
< 

This is a hex dump of the first 11 bytes:

00000000  ff d8 ff e0 00 10 4a 46  49 46 00                 |......JFIF.|
0000000b
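The Content-Range header of the 206 response in the session above tells the client which bytes it received and how long the full entity is. A minimal sketch of parsing it, assuming a single byte range (the name `parse_content_range` is hypothetical, and multipart responses are not handled):

```python
import re

# Matches "bytes FIRST-LAST/TOTAL", where TOTAL may be "*" if unknown.
_CONTENT_RANGE = re.compile(r"bytes\s+(\d+)-(\d+)/(\d+|\*)$")


def parse_content_range(value):
    """Parse e.g. 'bytes 0-10/350015' into (first, last, total)."""
    match = _CONTENT_RANGE.match(value.strip())
    if match is None:
        raise ValueError("unsupported Content-Range: %r" % value)
    first, last = int(match.group(1)), int(match.group(2))
    total = None if match.group(3) == "*" else int(match.group(3))
    return first, last, total


first, last, total = parse_content_range("bytes 0-10/350015")
# The body length should equal last - first + 1, matching Content-Length.
print(first, last, total, last - first + 1)
```

For the header in the session, the computed body length is 11, which agrees with the Content-Length of the partial response.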

For more info see the Range header specification in HTTP RFC 2616.

Andrey Vlasovskikh
In the Range header specification it is mentioned that Accept-Ranges is optional: a server may accept ranges without advertising it. It is a bit convoluted, but the only true test is to try.
Xavier Combelle
@Xavier Yep. If the server doesn't accept ranges, it may simply ignore the Range header and respond with 200 and the full body; a range it cannot satisfy gets a 416 status code.
Andrey Vlasovskikh
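Since the only sure test is to try, a resuming client has to decide what to do based on the response status. A minimal sketch of that decision, using the status codes defined in RFC 2616 (the function name and the returned labels are hypothetical):

```python
def resume_action(status):
    """Decide how to treat the response to a 'Range: bytes=N-' request."""
    if status == 206:
        # Partial Content: the body starts at the requested offset.
        return "append"
    if status == 200:
        # The server ignored Range and is sending the whole entity.
        return "restart"
    if status == 416:
        # Requested Range Not Satisfiable: the offset is at or past the end,
        # which for a resumed download usually means nothing is left to fetch.
        return "nothing-to-fetch"
    raise ValueError("unexpected status for a range request: %d" % status)
```

Treating a 200 as an append would corrupt the local file, so the sketch keeps the two cases separate.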