Is there a way, with WWW::Mechanize or any other Perl module, to read a file that a website returns after a form is submitted? For example, when I click a 'Receive' button, a .txt file containing a message appears. How can I read its content? I've been working on this for days and have tried everything I can think of. Any ideas would be much appreciated. :)

Here is a part of my code:

...

my $username = "admin";
my $password = "12345";

my $url = "http://...do_gsm_sms.cgi";

my $mech = WWW::Mechanize->new(autocheck => 1, quiet => 0, agent_alias => $login_agent, cookie_jar => $cookie_jar);

$mech->credentials($username, $password);
$mech->get($url);

$mech->success() or die "Can't fetch the Requested page";

print "OK! \n"; #This works

$mech->form_number(1);

$mech->click();

After this, a 'Downloads' dialog box appears so I can save the file (I can also set the default to open it immediately instead of saving). The question is: how can I read the content of this file?

..

+2  A: 

After the click (assuming that's doing what it's supposed to), the returned data should be stored in your $mech object. You should be able to get the file data with $mech->content(), perhaps after verifying success with $mech->status() and the type of response with $mech->content_type().

You may find it helpful to remember that WWW::Mechanize replaces the browser; anything a browser would have done, like bringing up a download window and saving a file, doesn't actually happen, but all the information the browser would have had is accessible through WWW::Mechanize's methods.
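As a rough illustration of the check described above, here is a minimal sketch. The helper name `text_body_or_undef` is hypothetical, and the sketch assumes the server labels the file as `text/plain`; it only uses the `success`, `content_type`, and `content` methods already mentioned.

```perl
use strict;
use warnings;

# Hypothetical helper: return the response body only when the last
# request succeeded and the server labeled the body as plain text.
# Returns undef otherwise (for example, when HTML came back instead).
sub text_body_or_undef {
    my ($mech) = @_;
    return unless $mech->success;
    my $type = $mech->content_type // '';
    return unless $type =~ m{^text/plain\b}i;
    return $mech->content;
}
```

After `$mech->click()` or `$mech->submit()`, calling `text_body_or_undef($mech)` and getting undef back would suggest that the request did not reach the file, which matches the 'text/html' symptom discussed in this thread.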

ysth
Thanks! I tried $mech->content(), and the content type is 'text/html'. The content is still the HTML code, not the text file itself. :(
Suezy
@Suezy: then your click isn't doing what you want. You may need to do one of `click_button( name => 'somename' )`, `click_button( number => somenumber )`, or `click_button( value => 'somevalue' )` or change the form number you are using. At some point, you may need to share some of the html of the form you are trying to automate to get better answers.
ysth
Maybe I'm just having a problem clicking the button; I don't know whether it opens the file at all. The page contains an input of type 'submit' without a name, not a button (and not inside a form), so I used $mech->submit() instead. The content_type still indicates HTML. Hmm... what am I missing?
Suezy
@Suezy: you need to call the right method(s) to do what clicking Receive does. You should be able to figure out what that is if you carefully read the WWW::Mechanize doc. If you can't, you are going to need to show us the html of the page to get help.
ysth
@Suezy: Thanks for accepting my answer; hope that means you got it working! If not, *please* ask for more help.
ysth
He might get HTML that then refreshes or uses javascript to redirect to the actual file. It's hard to say without seeing what he's seeing.
brian d foy
+1  A: 

Dare I ask... have you tried this?

my $content = $mech->content();
pioto
I've tried, but the content displayed is the HTML code instead of the content of the downloaded text file. :(
Suezy
What sort of "HTML codes" is it giving you?
pioto
Exactly the same as what I see in 'Page Source'.
Suezy
Hrm. Sometimes web forms will just show the same form again if they don't get all the input they expect. You may find this useful to see what Mechanize sees, and compare that to what you see in the browser: `$mech->dump_forms();` I take it there's no error message included in the HTML returned from the form?
pioto
+1  A: 

I take it you mean that the web site responds to the form submission by returning a non-HTML response (say, a 'text/plain' file) that you wish to save.

I believe you want $mech->save_content( $filename )

Added:

First you need to submit the form with WWW::Mechanize, and only then save the resulting (text) file. click() is for clicking a button, whereas you want to submit a form, using $mech->submit() or $mech->submit_form( ... ).

#!/usr/bin/perl

use strict;
use warnings;

use WWW::Mechanize;

my $username = "admin";
my $password = "12345";
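my $login_agent = 'WWW::Mechanize login-agent';
my $cookie_jar = {};    # an empty hashref gives an in-memory cookie jar

#my $url = "http://localhost/cgi-bin/form_mech.pl";
my $url = "http://localhost/form_mech.html";

# 'agent' sets the User-Agent string; agent_alias() is a method that
# only accepts predefined aliases such as 'Windows IE 6'.
my $mech = WWW::Mechanize->new(autocheck => 1, quiet => 0,
               agent => $login_agent, cookie_jar => $cookie_jar
           );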

$mech->credentials($username, $password);
$mech->get($url);

$mech->success() or die "Can't fetch the Requested page";

print "OK! \n"; #This works

$mech->submit_form(
   form_number => 1,
);
die "Submit failed" unless $mech->success;

$mech->save_content('out.txt');
mctylr
I also tried this, but the saved content is the HTML code, not the content of the downloaded file I wanted. Hmm...
Suezy
You have **read** the saved contents, right? Two possibilities are a) there is a `meta` `refresh` tag or http `redirect` in the http headers, or b) an error message which can help you focus on correcting your submit request (via `click`).
mctylr
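To check the `meta` `refresh` possibility raised in the comment above, a small sketch like the following could pull the target URL out of the returned HTML. The function name `meta_refresh_url` is hypothetical, and the regex is only a rough heuristic, not a real HTML parser.

```perl
use strict;
use warnings;

# Hypothetical helper: return the target URL of the first
# <meta http-equiv="refresh" content="N; url=..."> tag, or undef.
# Regex matching is a heuristic and will miss unusual markup.
sub meta_refresh_url {
    my ($html) = @_;
    return unless defined $html;
    while ($html =~ m{<meta\b([^>]*)>}gis) {
        my $attrs = $1;
        next unless $attrs =~ m{http-equiv\s*=\s*["']?refresh["']?}i;
        if ($attrs =~ m{content\s*=\s*(["'])\s*\d+\s*;\s*url\s*=\s*(.*?)\s*\1}is) {
            my $url = $2;
            $url =~ s/^['"]|['"]$//g;    # strip quotes nested inside content
            return $url;
        }
    }
    return;
}
```

If `meta_refresh_url($mech->content)` returns a URL, passing it to `$mech->get(...)` may fetch the actual file. A redirect done in JavaScript would not be detected this way.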
+1  A: 

Open the file (not 'Downloads' window) as if you were viewing it within your browser; you can save it later with a few lines of code.

Provided you have HTML::TreeBuilder installed:

my $textFile = $mech->content(format => "text");

should get you the text of the resulting window that opens.

Then open a filehandle to write your results in:

open my $fileHandle, ">", "results.txt" or die "Can't open results.txt: $!";
print {$fileHandle} $textFile;
close $fileHandle or die "Can't close results.txt: $!";
Zaid
s/he said it was downloading a .txt file already (when done through the browser), so that isn't going to help.
ysth
@ysth: s/he said the file can be opened if need be. I'm assuming that the text file will open in a browser window.
Zaid
The message will open as a .txt file. When I click the button(type=submit), a dialog box will appear for the msg, this can also be saved or can be set to default to open (just like downloading a file).
Suezy
In that case, save the file, then open it with Perl and do whatever you want to do with it: `open my $sms, "<", "message.txt";` `# Do Something` `close $sms;`
Zaid
+1  A: 

I do this all the time with LWP, but I'm sure it's equally possible with Mech.

I think where you might be going wrong is using Mech to request the page that has the button on it when you actually want to request the content from the page that the button causes to be sent to the browser when clicked.

What you need to do is review the html source of the page with the button that initiates the download and see what the Action associated with the button is. Most likely it will be a POST with some hidden fields or a URL to do a GET.

The Target URL of the Click has the stuff you actually want to get, not the URL of the page with the button on it.
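One way to review the action and fields without reading the raw HTML by hand is to let HTML::Form (the module Mech uses for forms) list them. This is only a sketch: the function name `describe_forms` and its output format are made up for illustration.

```perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical helper: summarize each form in a page as text lines
# giving its method, its absolute action URL, and every input.
# Inputs without a name attribute are shown as "(unnamed)".
sub describe_forms {
    my ($html, $base_url) = @_;
    my @lines;
    for my $form (HTML::Form->parse($html, $base_url)) {
        push @lines, sprintf 'form: %s %s', $form->method, $form->action;
        for my $input ($form->inputs) {
            push @lines, sprintf '  input: type=%s name=%s value=%s',
                $input->type,
                $input->name  // '(unnamed)',
                $input->value // '';
        }
    }
    return @lines;
}
```

With a Mech object that has already fetched the page, something like `print "$_\n" for describe_forms($mech->content, $mech->base);` should show which URL the button really posts to and which hidden fields go with it.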

Auctionitis
That's how Mech is intended to be used; you use it to go to a web page and it automatically deals with the details of gathering fields and making the appropriate request when you tell it to navigate to a different page.
ysth
Understood - although I expressed it poorly perhaps; I was talking more to the "how you navigate" part and whether the navigation target was correct.
Auctionitis
+1  A: 

For problems like this, you often have to investigate the complete chain of events that the browser handles. Use an HTTP sniffer tool to see everything the browser is doing until it gets to the file. You then have to do the same thing in Mech.
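To compare the sniffer's view with what Mech actually sends, one option is LWP's handler mechanism, which WWW::Mechanize inherits from LWP::UserAgent. The sketch below is an assumption-laden illustration: the helper name `trace_http` and the log format are invented here, while `add_handler` with the `request_send` and `response_done` phases comes from LWP::UserAgent.

```perl
use strict;
use warnings;

# Hypothetical helper: attach logging handlers to an LWP::UserAgent-based
# object (WWW::Mechanize is a subclass). Every request sent and every
# response received appends one line to the array referenced by $log.
sub trace_http {
    my ($ua, $log) = @_;
    $ua->add_handler(request_send => sub {
        my ($request) = @_;
        push @$log, sprintf '> %s %s', $request->method, $request->uri;
        return;    # returning nothing lets the request proceed normally
    });
    $ua->add_handler(response_done => sub {
        my ($response) = @_;
        push @$log, sprintf '< %s %s', $response->status_line,
            scalar($response->header('Content-Type') // '');
        return;
    });
    return $log;
}
```

Calling `trace_http($mech, \my @log)` before the first `get`, then printing `@log` after the submit, gives a list of requests and responses to set beside the sniffer capture from the browser.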

brian d foy