When using the Perl module LWP::Simple, is there a simple way to determine the speed and the amount downloaded by a single getstore() invocation? This would be useful for monitoring the progress of large file downloads.

Off the top of my head, one approach would be to:

  1. store the current time (time0)
  2. run getstore in a new process
  3. poll the known destination file
    • the amount downloaded would be the current file size (size)
    • the download speed would be size / (current_time - time0)
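
Those steps might look roughly like the following. This is only a sketch, assuming a Unix-like system where fork and a non-blocking waitpid are available; the average_speed helper and the command-line handling are made up for illustration.

```perl
use strict;
use warnings;
use POSIX qw(WNOHANG);
use Time::HiRes qw(time sleep);

# Average speed in bytes per second. Note the parentheses around the
# elapsed time: size / (now - time0), not (size / now) - time0.
sub average_speed {
    my ($size, $time0, $now) = @_;
    my $elapsed = $now - $time0;
    return $elapsed > 0 ? $size / $elapsed : 0;
}

# Only download when a URL and a destination file are given as arguments.
if (@ARGV == 2) {
    my ($url, $file) = @ARGV;
    my $time0 = time;                        # step 1: remember the start time

    my $pid = fork;                          # step 2: getstore in a new process
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        require LWP::Simple;                 # loaded at run time, in the child only
        my $status = LWP::Simple::getstore($url, $file);
        exit( $status =~ /^2\d\d$/ ? 0 : 1 );   # exit 0 on an HTTP 2xx status
    }

    # step 3: poll the destination file until the child exits
    while (waitpid($pid, WNOHANG) == 0) {
        my $size = (-s $file) // 0;          # the file may not exist yet
        printf "%d bytes, %.0f bytes/s\n",
            $size, average_speed($size, $time0, time);
        sleep 1;
    }
    my $exit = $? >> 8;                      # child's exit code from the final waitpid
    printf "done: %d bytes, child exit status %d\n", (-s $file) // 0, $exit;
}
```

The speed here is an average since time0, so it also counts the time spent connecting.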

I'm wondering if there's a simpler way.

Alternative suggestions welcome (perhaps I should use a different module?)

+8  A: 

Instead of using LWP::Simple, use LWP::UserAgent directly. For a starting point, look at how LWP::Simple::getstore initializes a $ua and invokes request. You'll want to call $ua->add_handler to register a response_data handler that does whatever you want. By default (at least for the HTTP protocol), LWP::UserAgent reads chunks of up to 4 KB and calls the response_data handler for each chunk, but you can suggest a different size in the request method parameters.

You may want to specify other handlers too, if you want to differentiate between header data and actual data that will be stored in the file or do something special if there are redirects.
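
A minimal sketch of that approach, assuming all you need is the byte count and the average rate. make_progress_handler and the %stats hash are hypothetical names, not part of LWP; add_handler, the response_data phase, and the :content_file and :read_size_hint options to get come from LWP::UserAgent.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Build a response_data callback that keeps running totals in %$stats.
# The clock starts when the handler is created, so the rate includes
# connection time.
sub make_progress_handler {
    my ($stats) = @_;
    $stats->{start} //= time;
    $stats->{bytes} //= 0;
    return sub {
        my ($response, $ua, $handler, $data) = @_;
        $stats->{bytes} += length $data;
        my $elapsed = time - $stats->{start};
        $stats->{rate} = $elapsed > 0 ? $stats->{bytes} / $elapsed : 0;
        printf STDERR "%d bytes, %.0f bytes/s\n", @{$stats}{qw(bytes rate)};
        return 1;    # a true value keeps the handler active for later chunks
    };
}

# Only download when a URL and a destination file are given as arguments.
if (@ARGV == 2) {
    my ($url, $file) = @ARGV;
    require LWP::UserAgent;    # loaded at run time, only when downloading
    my $ua = LWP::UserAgent->new;
    my %stats;
    $ua->add_handler( response_data => make_progress_handler(\%stats) );

    my $response = $ua->get(
        $url,
        ':content_file'   => $file,        # save the body to this file
        ':read_size_hint' => 64 * 1024,    # suggest larger chunks than the default
    );
    die $response->status_line, "\n" unless $response->is_success;
}
```

Because the handler is a closure over %stats, the caller can read the totals after get returns without any global variables.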

ysth
Thanks! I knew there had to be a better way.
vlee
+3  A: 

Unless you have other requirements (such as watching the rate and size during the download), the steps that you outlined are the easiest to think about and implement.

You can import the underlying user-agent object from LWP::Simple. If you just want to watch a one-off download, you can turn on the user agent's show_progress flag:

 use LWP::Simple qw($ua getstore);

 $ua->show_progress(1);

 getstore(
     'http://www.theperlreview.com/Issues/subscribers.html',
     'subscribers.html'
 );

To do more work, you can use LWP::Simple and still do the same thing ysth suggests:

 use LWP::Simple qw($ua);

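 # Register a callback that runs for each chunk of the response body
 $ua->add_handler(
     response_data => sub {
         my($response, $ua, $h, $data) = @_;
         ...
         return 1;   # a true value keeps the handler active for later chunks
     }
 );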

In that subroutine, you read the data and do whatever you like with it, including keeping a timer. Once you get your answer, you can delete that bit of code and go back to just getstore.

Flavio Poletti wrote "Watching LWP's Activity" for The Perl Review (Spring 2009), which shows many uses of this technique.

brian d foy
Thanks for the great response and examples. Flavio Poletti's one page write-up was also useful and concise.
vlee