views: 30
answers: 1

I am trying to download a file from a site using Perl. I chose not to use wget so that I can learn how to do it this way. I am not sure whether my script is failing to connect to the page or whether something is wrong in my syntax somewhere. Also, what is the best way to check whether you are getting a connection to the page?

#!/usr/bin/perl -w
use strict;
use LWP;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new();
$mech->credentials( '********' , '********'); # if you do need to supply server and realms, use credentials as in the LWP doc
$mech->get('http://datawww2.wxc.com/kml/echo/MESH_Max_180min/');
if (!$mech->success()) {
    print "cannot connect to page\n";
    exit;
}
$mech->follow_link( n => 8);
$mech->save_content('C:/Users/********/Desktop/');
+3  A: 

I'm sorry but almost everything is wrong.

  • You use a mix of LWP::UserAgent and WWW::Mechanize in the wrong way. You can't call $mech->follow_link() after a $browser->get(), because you are mixing functions from two modules: $mech doesn't know that you made a request.
  • The arguments to credentials are not right, see the doc (a sketch of the host/realm form follows this list).
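
The credentials call with just a username and password is the WWW::Mechanize two-argument form; the "server and realms" variant mentioned in the code comment is the four-argument LWP::UserAgent form. A minimal sketch of both, assuming the four-argument call is passed through to LWP::UserAgent; the host, realm, and login values are placeholders, not taken from the question:

use strict;
use warnings;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new();

# Two-argument WWW::Mechanize form: offer this login for any
# authentication challenge the site sends.
$mech->credentials( 'username', 'password' );

# Four-argument LWP::UserAgent form: restrict the login to one
# host:port and realm (placeholder values).
$mech->credentials( 'datawww2.wxc.com:80', 'Realm Name', 'username', 'password' );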

You most probably want to do something like this:

use WWW::Mechanize;
my $mech = WWW::Mechanize->new();

$mech->credentials( '************' , '*************'); # if you do need to supply server and realms, use credentials as in the LWP doc
$mech->get('http://datawww2.wxc.com/kml/echo/MESH_Max_180min/');
$mech->follow_link( n => 8);

You can check the result of get() and follow_link() by testing $mech->success(): if (!$mech->success()) { warn "error"; ... }
After follow_link(), the data is available through $mech->content(); if you want to save it to a file, use $mech->save_content('/path/to/a/file').

A full example could be:

use strict;
use WWW::Mechanize;
my $mech = WWW::Mechanize->new();

$mech->credentials( '************' , '*************'); #
$mech->get('http://datawww2.wxc.com/kml/echo/MESH_Max_180min/');
die "Error: failled to load the web page" if (!$mech->success());
$mech->follow_link( n => 8);
die "Error: failled to download content" if (!$mech->success());
$mech->save_content('/tmp/mydownloadedfile')
radius
Aren't you still using $browser->get?
shinjuo
But now how does it know which page to go to?
shinjuo
No, he is using the `credentials` method of `WWW::Mechanize`. See `http://search.cpan.org/perldoc/WWW::Mechanize#$mech-%3Ecredentials%28_$username,_$password_%29`
Sinan Ünür
I see that but he said not to use browser->get(). So where do I put the URL now?
shinjuo
I corrected it; of course get is needed, but on $mech and not on $browser.
radius
That works, I think; at least there are no errors. What is the best way to check whether it is connecting to the page, and is there any way to tell it where to download the file from the link it is following? If not, where is it downloading that file?
shinjuo
You can check whether get() succeeded by testing $mech->success() - if (!$mech->success()) { warn "error" } - After your follow_link(), the content is available using $mech->content(); if you want to save it to a file, use $mech->save_content('/path/to/a/file').
radius
So if I follow a link that is a file-download link, can I access it using $mech->content() or save it using $mech->save_content(), or is that just for saving an HTML page?
shinjuo
When I try to use save_content it tells me "unable to create: No such file or directory". I will post my updated code above. I appreciate your help.
shinjuo
Even when I write my code like yours, it says it is unable to create /tmp/mydownloadedfile: no such file or directory.
shinjuo
Okay, I got it to work, but not correctly. All this seems to be doing is downloading the page it is on, not downloading the file from the link. Also, is there any way to retain the name of the file it is downloading rather than making up a name of my own?
shinjuo
I guess I could just date/time stamp it myself and use that as the name so it changes with each download, but it is still not following the link; it is just saving the main page's content. That does mean it is authenticating correctly, though, which I greatly appreciate.
shinjuo
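
A minimal sketch of one way to address the last two comments: look up the link first, fetch its target directly, and save it under the file name taken from the link's own URL. The link index (n => 8), site URL, and credentials placeholders come from the thread; the use of File::Basename and the derived file name are assumptions, not something the original answer specified:

use strict;
use warnings;
use WWW::Mechanize;
use File::Basename qw(basename);

my $mech = WWW::Mechanize->new();
$mech->credentials( 'username', 'password' );    # placeholders

$mech->get('http://datawww2.wxc.com/kml/echo/MESH_Max_180min/');
die "Error: failed to load the web page" if !$mech->success();

# Find the 8th link instead of following it blindly, so its URL can
# be reused to name the local file.
my $link = $mech->find_link( n => 8 )
    or die "Error: link not found";

my $name = basename( $link->url_abs->path );     # remote file name
$mech->get( $link->url_abs );
die "Error: failed to download the linked file" if !$mech->success();

$mech->save_content($name);                      # saved in the current directory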