This is a continuation of my previous question, which I posted before I was a registered user. As a refresher, I'm trying to resume the download of a large file from my Yahoo! web site server when the download gets interrupted. I previously thought the interruption was due to a 100-second timeout (Yahoo! enforces that limit on user-written scripts). However, when I timed the interruptions, I saw that they vary a lot: sometimes the download runs uninterrupted for less than 100 seconds and sometimes for up to seven minutes. So I don't know the reason for the timeouts and I'm just trying to work around them.

I tried the suggestion by naikus (thank you) and, according to the dump of the HTTP header fields, my Yahoo! web site server does appear to recognize the Range request header, which should allow the download to resume at the offset of the interruption. Unfortunately, although the byte range in the response headers of the resumed connections looks correct, the transferred content always restarts at the beginning of the file. (My test file is an array of 50,000 4-byte integers that increments starting at 0. My downloaded file starts recounting at 0 at every offset at which a download interrupt occurred.)
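A check like the one described for the test file can be sketched in a few lines. This is only an illustration: the class and method names are hypothetical, and the byte order of the counters is an assumption the caller has to supply, since the post does not state it.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class CounterFileCheck {
    /**
     * Scans a file image made of consecutive 4-byte counters (0, 1, 2, ...)
     * and returns the byte offset of the first 4-byte word whose value differs
     * from its index, or -1 if every complete word matches.
     * The byte order is an assumption supplied by the caller.
     */
    static long firstMismatch(byte[] file, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.wrap(file).order(order);
        int expected = 0;
        while (buf.remaining() >= 4) {
            int offset = buf.position();
            if (buf.getInt() != expected) {
                return offset;
            }
            expected++;
        }
        return -1;
    }
}
```

If the reported offset is at or just before an offset logged as "new download seek", that suggests the resumed response body started again from byte 0 of the file.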

Is there some other HTTP request property I should be setting to get the Yahoo! server to actually skip to the file offset specified in the header's byte range? Here's the code and what it dumps:

         // Setup connection.
         URL url = new URL(strUrl[0]);
         URLConnection connection = url.openConnection();
         downloaded = Integer.parseInt(strUrl[3]);
         if (downloaded > 0) {
             connection.setRequestProperty("Range", "bytes="+downloaded+"-");
             connection.connect();
             fileLength = mDownloadFileLength;
             Log.d("AsyncDownloadFile", 
                 "new download seek: " + downloaded +
                 "; lengthFile: " + fileLength);
         }
         else {
             connection.connect();
             downloaded = 0;
             fileLength = connection.getContentLength();
             mDownloadFileLength = fileLength;
         }
         Map<String, List<String>> map = connection.getHeaderFields();
         Log.d("AsyncDownloadFile", "header fields: " + map.toString());

         // Setup streams and buffers.
         input = new BufferedInputStream(url.openStream(), 8192);
         outFile = new RandomAccessFile(strUrl[1], "rw");
         if (downloaded > 0)  
             outFile.seek(downloaded);
         byte data[] = new byte[1024];

         // Download file.
         for (int count=0, i=0; (count=input.read(data, 0, 1024)) != -1; i++) { 
             outFile.write(data, 0, count);
             downloaded += count; 
             if (downloaded >= fileLength)
                 break;

             // Display progress.
             Log.d("AsyncDownloadFile", "bytes: " + downloaded);
             if ((i%10) == 0)
                 publishProgress((int)(downloaded*100/fileLength));
             if (mFlagDisableAsyncTask) {
                 downloaded = 0;
                 break;
             }
         }

         // Close streams.
         outFile.close();
         input.close();

dump:

@ 4:08:24
D/AsyncDownloadFile( 2372): header fields: {p3p=[policyref="http://info.yahoo.com/w3c/p3p.xml", CP="CAO DSP COR CUR ADM DEV TAI PSA PSD IVAi IVDi CONi TELo OTPi OUR DELi SAMi OTRi UNRi PUBi IND PHY ONL UNI PUR FIN COM NAV INT DEM CNT STA POL HEA PRE LOC GOV"], content-type=[application/zip], connection=[close], last-modified=[Fri, 06 Aug 2010 14:47:50 GMT], content-length=[2000000], age=[0], server=[YTS/1.17.13], accept-ranges=[bytes], date=[Fri, 06 Aug 2010 20:08:33 GMT]}
D/AsyncDownloadFile( 2372): bytes: 1024
D/AsyncDownloadFile( 2372): bytes: 1033
D/AsyncDownloadFile( 2372): bytes: 2057
D/AsyncDownloadFile( 2372): bytes: 2493
D/AsyncDownloadFile( 2372): bytes: 3517
D/AsyncDownloadFile( 2372): bytes: 3953

. . .

@ 4:13:25
D/AsyncDownloadFile( 2372): bytes: 386473
D/AsyncDownloadFile( 2372): bytes: 387497
D/AsyncDownloadFile( 2372): bytes: 387933
D/AsyncDownloadFile( 2372): new download seek: 387933; lengthFile: 2000000
D/AsyncDownloadFile( 2372): header fields: {p3p=[policyref="http://info.yahoo.com/w3c/p3p.xml", CP="CAO DSP COR CUR ADM DEV TAI PSA PSD IVAi IVDi CONi TELo OTPi OUR DELi SAMi OTRi UNRi PUBi IND PHY ONL UNI PUR FIN COM NAV INT DEM CNT STA POL HEA PRE LOC GOV"], content-type=[application/zip], connection=[close], last-modified=[Fri, 06 Aug 2010 14:47:50 GMT], content-length=[1612067], age=[0], server=[YTS/1.17.13], accept-ranges=[bytes], date=[Fri, 06 Aug 2010 20:13:29 GMT], content-range=[bytes 387933-1999999/2000000]}
D/AsyncDownloadFile( 2372): bytes: 388957
D/AsyncDownloadFile( 2372): bytes: 389981
D/AsyncDownloadFile( 2372): bytes: 390409
D/AsyncDownloadFile( 2372): bytes: 391433
D/AsyncDownloadFile( 2372): bytes: 391869
D/AsyncDownloadFile( 2372): bytes: 392893

. . .

@ 4:18:45
D/AsyncDownloadFile( 2372): bytes: 775413
D/AsyncDownloadFile( 2372): bytes: 775849
D/AsyncDownloadFile( 2372): bytes: 776873
D/AsyncDownloadFile( 2372): bytes: 777309
D/AsyncDownloadFile( 2372): new download seek: 777309; lengthFile: 2000000
D/AsyncDownloadFile( 2372): header fields: {p3p=[policyref="http://info.yahoo.com/w3c/p3p.xml", CP="CAO DSP COR CUR ADM DEV TAI PSA PSD IVAi IVDi CONi TELo OTPi OUR DELi SAMi OTRi UNRi PUBi IND PHY ONL UNI PUR FIN COM NAV INT DEM CNT STA POL HEA PRE LOC GOV"], content-type=[application/zip], connection=[close], last-modified=[Fri, 06 Aug 2010 14:47:50 GMT], content-length=[1222691], age=[0], server=[YTS/1.17.13], accept-ranges=[bytes], date=[Fri, 06 Aug 2010 20:18:54 GMT], content-range=[bytes 777309-1999999/2000000]}
D/dalvikvm( 2372): GC_FOR_MALLOC freed 11019 objects / 470560 bytes in 155ms
D/AsyncDownloadFile( 2372): bytes: 778333
D/AsyncDownloadFile( 2372): bytes: 779357
D/AsyncDownloadFile( 2372): bytes: 779790
D/AsyncDownloadFile( 2372): bytes: 780814
D/AsyncDownloadFile( 2372): bytes: 781250
D/AsyncDownloadFile( 2372): bytes: 782274

. . .

@ 4:23:45
D/AsyncDownloadFile( 2372): bytes: 1163334
D/AsyncDownloadFile( 2372): bytes: 1163770
D/AsyncDownloadFile( 2372): bytes: 1164794
D/AsyncDownloadFile( 2372): bytes: 1165230
D/AsyncDownloadFile( 2372): new download seek: 1165230; lengthFile: 2000000
D/AsyncDownloadFile( 2372): header fields: {p3p=[policyref="http://info.yahoo.com/w3c/p3p.xml", CP="CAO DSP COR CUR ADM DEV TAI PSA PSD IVAi IVDi CONi TELo OTPi OUR DELi SAMi OTRi UNRi PUBi IND PHY ONL UNI PUR FIN COM NAV INT DEM CNT STA POL HEA PRE LOC GOV"], content-type=[application/zip], connection=[close], last-modified=[Fri, 06 Aug 2010 14:47:50 GMT], content-length=[834770], age=[0], server=[YTS/1.17.13], accept-ranges=[bytes], date=[Fri, 06 Aug 2010 20:23:47 GMT], content-range=[bytes 1165230-1999999/2000000]}
D/AsyncDownloadFile( 2372): bytes: 1166246
D/AsyncDownloadFile( 2372): bytes: 1167270
D/AsyncDownloadFile( 2372): bytes: 1167706
D/AsyncDownloadFile( 2372): bytes: 1168730
D/AsyncDownloadFile( 2372): bytes: 1169754
D/AsyncDownloadFile( 2372): bytes: 1170778

. . .

@ 4:30:25
D/AsyncDownloadFile( 2372): bytes: 1551255
D/AsyncDownloadFile( 2372): bytes: 1551691
D/AsyncDownloadFile( 2372): bytes: 1552715
D/AsyncDownloadFile( 2372): bytes: 1553151
D/AsyncDownloadFile( 2372): new download seek: 1553151; lengthFile: 2000000
D/AsyncDownloadFile( 2372): header fields: {p3p=[policyref="http://info.yahoo.com/w3c/p3p.xml", CP="CAO DSP COR CUR ADM DEV TAI PSA PSD IVAi IVDi CONi TELo OTPi OUR DELi SAMi OTRi UNRi PUBi IND PHY ONL UNI PUR FIN COM NAV INT DEM CNT STA POL HEA PRE LOC GOV"], content-type=[application/zip], connection=[close], last-modified=[Fri, 06 Aug 2010 14:47:50 GMT], content-length=[446849], age=[0], server=[YTS/1.17.13], accept-ranges=[bytes], date=[Fri, 06 Aug 2010 20:30:44 GMT], content-range=[bytes 1553151-1999999/2000000]}
D/AsyncDownloadFile( 2372): bytes: 1554167
D/AsyncDownloadFile( 2372): bytes: 1554184
D/AsyncDownloadFile( 2372): bytes: 1555208
D/AsyncDownloadFile( 2372): bytes: 1555644
D/AsyncDownloadFile( 2372): bytes: 1556668
D/AsyncDownloadFile( 2372): bytes: 1557104

. . .

@ 4:37:10
D/AsyncDownloadFile( 2372): bytes: 1939188
D/AsyncDownloadFile( 2372): bytes: 1939624
D/AsyncDownloadFile( 2372): bytes: 1940648
D/AsyncDownloadFile( 2372): bytes: 1941084
D/AsyncDownloadFile( 2372): new download seek: 1941084; lengthFile: 2000000
D/dalvikvm( 2372): GC_FOR_MALLOC freed 13701 objects / 604600 bytes in 128ms
D/AsyncDownloadFile( 2372): header fields: {p3p=[policyref="http://info.yahoo.com/w3c/p3p.xml", CP="CAO DSP COR CUR ADM DEV TAI PSA PSD IVAi IVDi CONi TELo OTPi OUR DELi SAMi OTRi UNRi PUBi IND PHY ONL UNI PUR FIN COM NAV INT DEM CNT STA POL HEA PRE LOC GOV"], content-type=[application/zip], connection=[close], last-modified=[Fri, 06 Aug 2010 14:47:50 GMT], content-length=[58916], age=[0], server=[YTS/1.17.13], accept-ranges=[bytes], date=[Fri, 06 Aug 2010 20:37:16 GMT], content-range=[bytes 1941084-1999999/2000000]}
D/AsyncDownloadFile( 2372): bytes: 1942108
D/AsyncDownloadFile( 2372): bytes: 1942117
D/AsyncDownloadFile( 2372): bytes: 1943141
D/AsyncDownloadFile( 2372): bytes: 1943577
D/AsyncDownloadFile( 2372): bytes: 1944601
D/AsyncDownloadFile( 2372): bytes: 1945037

. . .

@ 4:38:30
D/AsyncDownloadFile( 2372): bytes: 1993217
D/AsyncDownloadFile( 2372): bytes: 1994241
D/AsyncDownloadFile( 2372): bytes: 1994677
D/AsyncDownloadFile( 2372): bytes: 1995701
D/AsyncDownloadFile( 2372): bytes: 1996137
D/AsyncDownloadFile( 2372): bytes: 1997161
D/AsyncDownloadFile( 2372): bytes: 1997597
D/AsyncDownloadFile( 2372): bytes: 1998621
D/AsyncDownloadFile( 2372): bytes: 1999057
D/onPostExecute( 2372): download: unsuccessful


After the tip from BalusC (thanks), I modified the connection setup but the Yahoo! server continues to reset to the start of the file at each interrupt. Here's the changed code and the resulting dumps:

            // Setup connection.
            URL url = new URL(strUrl[0]);
            URLConnection connection = url.openConnection();
            downloaded = Integer.parseInt(strUrl[3]);
            if (downloaded == 0) {
                connection.connect();
                strLastModified = connection.getHeaderField("Last-Modified");
                fileLength = connection.getContentLength();
                mDownloadFileLength = fileLength;
            }
            else {
                connection.setRequestProperty("Range", "bytes=" + downloaded + "-");
                connection.setRequestProperty("If-Range", strLastModified);
                connection.connect();
                fileLength = mDownloadFileLength;
                Log.d("AsyncDownloadFile", 
                        "new download seek: " + downloaded +
                        "; lengthFile: " + fileLength);
            }
            map = connection.getHeaderFields();
            Log.d("AsyncDownloadFile", "header fields: " + map.toString());

dump:
@12:36:40 started
D/AsyncDownloadFile( 413): header fields: {p3p=[policyref="http://info.yahoo.com/w3c/p3p.xml", CP="CAO DSP COR CUR ADM DEV TAI PSA PSD IVAi IVDi CONi TELo OTPi OUR DELi SAMi OTRi UNRi PUBi IND PHY ONL UNI PUR FIN COM NAV INT DEM CNT STA POL HEA PRE LOC GOV"], content-type=[application/zip], connection=[close], last-modified=[Fri, 06 Aug 2010 14:47:50 GMT], content-length=[2000000], age=[0], server=[YTS/1.17.13], accept-ranges=[bytes], date=[Sat, 07 Aug 2010 04:36:56 GMT]}
D/AsyncDownloadFile( 413): bytes: 1024
D/AsyncDownloadFile( 413): bytes: 2048
D/AsyncDownloadFile( 413): bytes: 2476
D/AsyncDownloadFile( 413): bytes: 3500
D/AsyncDownloadFile( 413): bytes: 3936

...

@12:39:20 interrupted
D/AsyncDownloadFile( 413): bytes: 388068
D/AsyncDownloadFile( 413): bytes: 389092
D/AsyncDownloadFile( 413): bytes: 389376
D/AsyncDownloadFile( 413): new download seek: 389376; lengthFile: 2000000
D/AsyncDownloadFile( 413): header fields: {p3p=[policyref="http://info.yahoo.com/w3c/p3p.xml", CP="CAO DSP COR CUR ADM DEV TAI PSA PSD IVAi IVDi CONi TELo OTPi OUR DELi SAMi OTRi UNRi PUBi IND PHY ONL UNI PUR FIN COM NAV INT DEM CNT STA POL HEA PRE LOC GOV"], content-type=[application/zip], connection=[close], last-modified=[Fri, 06 Aug 2010 14:47:50 GMT], content-length=[1610624], age=[0], server=[YTS/1.17.13], accept-ranges=[bytes], date=[Sat, 07 Aug 2010 04:39:21 GMT], content-range=[bytes 389376-1999999/2000000]}
D/AsyncDownloadFile( 413): bytes: 390400
D/AsyncDownloadFile( 413): bytes: 390409
D/AsyncDownloadFile( 413): bytes: 391433
D/AsyncDownloadFile( 413): bytes: 391869

...

@12:44:10 interrupted
D/AsyncDownloadFile( 413): bytes: 775413
D/AsyncDownloadFile( 413): bytes: 775849
D/AsyncDownloadFile( 413): bytes: 776873
D/AsyncDownloadFile( 413): bytes: 777309
D/AsyncDownloadFile( 413): new download seek: 777309; lengthFile: 2000000
D/AsyncDownloadFile( 413): header fields: {p3p=[policyref="http://info.yahoo.com/w3c/p3p.xml", CP="CAO DSP COR CUR ADM DEV TAI PSA PSD IVAi IVDi CONi TELo OTPi OUR DELi SAMi OTRi UNRi PUBi IND PHY ONL UNI PUR FIN COM NAV INT DEM CNT STA POL HEA PRE LOC GOV"], content-type=[application/zip], connection=[close], last-modified=[Fri, 06 Aug 2010 14:47:50 GMT], content-length=[1222691], age=[0], server=[YTS/1.17.13], accept-ranges=[bytes], date=[Sat, 07 Aug 2010 04:44:20 GMT], content-range=[bytes 777309-1999999/2000000]}
D/dalvikvm( 413): GC_FOR_MALLOC freed 10869 objects / 465664 bytes in 122ms
D/AsyncDownloadFile( 413): bytes: 778333
D/AsyncDownloadFile( 413): bytes: 778342
D/AsyncDownloadFile( 413): bytes: 779366
D/AsyncDownloadFile( 413): bytes: 779802

...

@12:49:30 interrupted
D/AsyncDownloadFile( 413): bytes: 1163782
D/AsyncDownloadFile( 413): bytes: 1164806
D/AsyncDownloadFile( 413): bytes: 1165242
D/AsyncDownloadFile( 413): new download seek: 1165242; lengthFile: 2000000
D/AsyncDownloadFile( 413): header fields: {p3p=[policyref="http://info.yahoo.com/w3c/p3p.xml", CP="CAO DSP COR CUR ADM DEV TAI PSA PSD IVAi IVDi CONi TELo OTPi OUR DELi SAMi OTRi UNRi PUBi IND PHY ONL UNI PUR FIN COM NAV INT DEM CNT STA POL HEA PRE LOC GOV"], content-type=[application/zip], connection=[close], last-modified=[Fri, 06 Aug 2010 14:47:50 GMT], content-length=[834758], age=[0], server=[YTS/1.17.13], accept-ranges=[bytes], date=[Sat, 07 Aug 2010 04:49:43 GMT], content-range=[bytes 1165242-1999999/2000000]}
D/AsyncDownloadFile( 413): bytes: 1166266
D/AsyncDownloadFile( 413): bytes: 1167290
D/AsyncDownloadFile( 413): bytes: 1167718
D/AsyncDownloadFile( 413): bytes: 1168742

...

@12:55:30 interrupted
D/AsyncDownloadFile( 413): bytes: 1552722
D/AsyncDownloadFile( 413): bytes: 1553158
D/AsyncDownloadFile( 413): bytes: 1554182
D/AsyncDownloadFile( 413): bytes: 1554618
D/AsyncDownloadFile( 413): new download seek: 1554618; lengthFile: 2000000
D/AsyncDownloadFile( 413): header fields: {p3p=[policyref="http://info.yahoo.com/w3c/p3p.xml", CP="CAO DSP COR CUR ADM DEV TAI PSA PSD IVAi IVDi CONi TELo OTPi OUR DELi SAMi OTRi UNRi PUBi IND PHY ONL UNI PUR FIN COM NAV INT DEM CNT STA POL HEA PRE LOC GOV"], content-type=[application/zip], connection=[close], last-modified=[Fri, 06 Aug 2010 14:47:50 GMT], content-length=[445382], age=[0], server=[YTS/1.17.13], accept-ranges=[bytes], date=[Sat, 07 Aug 2010 04:55:39 GMT], content-range=[bytes 1554618-1999999/2000000]}
D/AsyncDownloadFile( 413): bytes: 1555642
D/AsyncDownloadFile( 413): bytes: 1556666
D/AsyncDownloadFile( 413): bytes: 1557094
D/AsyncDownloadFile( 413): bytes: 1558118

...

@12:57:20 interrupted
D/AsyncDownloadFile( 413): bytes: 1941834
D/AsyncDownloadFile( 413): bytes: 1942858
D/AsyncDownloadFile( 413): bytes: 1943882
D/AsyncDownloadFile( 413): bytes: 1943994
D/AsyncDownloadFile( 413): new download seek: 1943994; lengthFile: 2000000
D/AsyncDownloadFile( 413): header fields: {p3p=[policyref="http://info.yahoo.com/w3c/p3p.xml", CP="CAO DSP COR CUR ADM DEV TAI PSA PSD IVAi IVDi CONi TELo OTPi OUR DELi SAMi OTRi UNRi PUBi IND PHY ONL UNI PUR FIN COM NAV INT DEM CNT STA POL HEA PRE LOC GOV"], content-type=[application/zip], connection=[close], last-modified=[Fri, 06 Aug 2010 14:47:50 GMT], content-length=[56006], age=[0], server=[YTS/1.17.13], accept-ranges=[bytes], date=[Sat, 07 Aug 2010 04:57:15 GMT], content-range=[bytes 1943994-1999999/2000000]}
D/dalvikvm( 413): GC_FOR_MALLOC freed 13617 objects / 602200 bytes in 165ms
D/AsyncDownloadFile( 413): bytes: 1945018
D/AsyncDownloadFile( 413): bytes: 1946042
D/AsyncDownloadFile( 413): bytes: 1946470
D/AsyncDownloadFile( 413): bytes: 1947494

...

@12:58:10 finished
D/AsyncDownloadFile( 413): bytes: 1996103
D/AsyncDownloadFile( 413): bytes: 1997127
D/AsyncDownloadFile( 413): bytes: 1997563
D/AsyncDownloadFile( 413): bytes: 1998587
D/AsyncDownloadFile( 413): bytes: 1999023
D/onPostExecute( 413): downloaded: unsuccessful

+1  A: 

To resume a download safely, you should send not only the Range request header but also the If-Range request header, which should contain either the unique file identifier (the entity tag) or the file modification timestamp.

If the server returns an ETag response header on the initial download, then you should use it in the If-Range header of the subsequent resume requests. Or if it returns a Last-Modified response header, then you should use it in the If-Range request header instead.

Looking at your logs, the server has sent a Last-Modified response header. So you should send it back along in an If-Range header of the resume request.

// Initial download.
String lastModified = connection.getHeaderField("Last-Modified");

// ...

// Resume download.
connection.setRequestProperty("If-Range", lastModified); 

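The server will use this information to verify that you're requesting exactly the same file. If the file is unchanged, the server should answer with 206 Partial Content and only the requested range; if it has changed, the server should answer with 200 and the whole file.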
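Put together, a resume request along these lines might look like the sketch below. It is a minimal illustration under stated assumptions, not a drop-in replacement for the code in the question: the class name `ResumeSketch` and its methods are hypothetical, and it assumes a server that honours Range. One detail worth ruling out when a resumed transfer restarts at byte 0 is which object the body is read from. `URL.openStream()` is shorthand for `openConnection().getInputStream()`, so it opens a fresh connection that carries none of the request properties set on an earlier `URLConnection`. The sketch therefore reads from the same connection it configured.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.net.HttpURLConnection;
import java.net.URL;

class ResumeSketch {
    /** Builds the Range value for "everything from offset to the end". */
    static String rangeFrom(long offset) {
        return "bytes=" + offset + "-";
    }

    /** Prefers a strong ETag for If-Range, otherwise falls back to Last-Modified (may be null). */
    static String chooseValidator(String etag, String lastModified) {
        if (etag != null && !etag.startsWith("W/")) {
            return etag; // weak entity tags must not be sent in If-Range
        }
        return lastModified;
    }

    /** Downloads from the given offset into path and returns the offset reached. */
    static long resume(URL url, String path, long offset, String validator) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        if (offset > 0) {
            conn.setRequestProperty("Range", rangeFrom(offset));
            if (validator != null) {
                conn.setRequestProperty("If-Range", validator);
            }
        }
        int status = conn.getResponseCode();
        if (status != HttpURLConnection.HTTP_OK && status != HttpURLConnection.HTTP_PARTIAL) {
            conn.disconnect();
            throw new IOException("Unexpected HTTP status " + status);
        }
        // Read the body from the SAME connection that carried the Range header.
        try (InputStream in = conn.getInputStream();
             RandomAccessFile out = new RandomAccessFile(path, "rw")) {
            if (status == HttpURLConnection.HTTP_OK) {
                // The server sent the whole file, so start over.
                offset = 0;
                out.setLength(0);
            }
            out.seek(offset);
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
                offset += n;
            }
        } finally {
            conn.disconnect();
        }
        return offset;
    }
}
```

If `resume` throws partway through, the length of the partial file on disk can serve as the offset for the next attempt.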

BalusC
Thanks for the suggestion. On a whim, I also tried changing the extension of the downloaded filename from .zip (i.e., content-type=[application/zip]) to .dat (i.e., content-type=[application/octet-stream]), and I tried changing the request property keys to lower case. Unfortunately, the Yahoo! server continues to resume downloads at the beginning of the file.
gregS
To no avail, I also tried changing connection.setRequestProperty("Range", "bytes="+downloaded+"-"); to connection.setRequestProperty("Range", "bytes=-"+(mDownloadFileLength - downloaded));
gregS
I received a reply from Yahoo! tech support saying that the Yahoo! servers do not support byte range requests: "Yahoo! Web Hosting does not support Accept-range header since we work with a pool of servers and each request potentially reaches a different server. You will see connection=[close] in the response header indicating this."
gregS
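For anyone comparing a server's response with what was requested: the dumps in the question show a content-range field on every resumed connection, so one simple client-side check is to compare the first byte position in Content-Range with the offset that was asked for. The helper below is a hypothetical sketch (the class and method names are not from the thread) and only handles the plain "bytes first-last/total" form.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ContentRangeCheck {
    // Matches e.g. "bytes 387933-1999999/2000000" (total may also be "*").
    private static final Pattern BYTES =
            Pattern.compile("bytes (\\d+)-(\\d+)/(\\d+|\\*)");

    /** Returns the first byte position, or -1 if the header is missing or not in the expected form. */
    static long firstBytePos(String contentRange) {
        if (contentRange == null) {
            return -1;
        }
        Matcher m = BYTES.matcher(contentRange.trim());
        return m.matches() ? Long.parseLong(m.group(1)) : -1;
    }

    /** True when the response claims to start at the offset the client requested. */
    static boolean matchesRequest(String contentRange, long requestedOffset) {
        return firstBytePos(contentRange) == requestedOffset;
    }
}
```

A matching header only shows what the response claims to contain. Whether the bytes that follow really start at that offset still depends on reading the body of that same response.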