I am trying to use Perl's WWW::Mechanize to log in to my bank and pull transaction information. When I log in to my bank (Wells Fargo) through a browser, the site briefly displays a temporary web page saying something along the lines of "please wait while we verify your identity". After a few seconds it proceeds to the bank's webpage where I can get my bank data. The only difference is that the final URL has several more "GET" parameters appended to the URL of the temporary page, which only had a sessionID parameter.

I was able to get WWW::Mechanize to log in from the login page, but it gets stuck on the temporary page. There is a <meta http-equiv="Refresh"... tag in the page's head, so I tried $mech->follow_meta_redirect, but that didn't get me past the temporary page either.

Any help to get past this would be appreciated. Thanks in advance.

Here is the barebones code that gets me stuck at the temporary page:

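#!/usr/bin/perl -w
use strict;
use WWW::Mechanize;

# Credentials were not defined in the original snippet; reading them from
# the command line is an assumption so the script compiles under strict.
my ( $userid, $password ) = @ARGV;

my $mech = WWW::Mechanize->new();
$mech->agent_alias( 'Linux Mozilla' );

# Load the home page and submit the sign-on form (the second form on the page).
$mech->get( "https://www.wellsfargo.com" );
$mech->submit_form(
    form_number => 2,
    fields => {
        userid   => $userid,
        password => $password,
    },
    button => "btnSignon"
);
# At this point $mech->content holds the temporary "please wait" page.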
+2  A: 

You'll need to reverse-engineer what's happening on that intermediary page. Does it use JavaScript to set some cookies, for example? Mech won't parse or execute JavaScript on a page, so it may be following the meta-refresh but missing some crucial information that the final request needs.

Try using a tool like Firebug to watch the request that's sent when the browser follows the meta-refresh. Examine all the request headers, including cookies, that are sent to request the final page. Then use Mech to duplicate that.
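
One way to check this from the Perl side is to pull the meta-refresh target out of the intermediary page and compare it with the URL the browser ends up requesting. The following is only a sketch: the sample HTML and the example.com URL are made up for illustration and do not come from the real bank site. It uses HTML::HeadParser from the HTML-Parser distribution, which LWP normally pulls in.

```perl
use strict;
use warnings;
use HTML::HeadParser;

# Hypothetical stand-in for the "please wait" page; with Mech you would
# use $mech->content instead of this string.
my $html = <<'HTML';
<html><head>
<title>Please wait</title>
<meta http-equiv="Refresh" content="3; URL=https://www.example.com/next?sessionID=abc123">
</head><body onload="getPrefs();addFlash()"></body></html>
HTML

# HTML::HeadParser exposes <meta http-equiv="..."> values as headers.
my $parser = HTML::HeadParser->new;
$parser->parse($html);
my $refresh = $parser->header('Refresh');

# Split "delay; URL=target" into its two parts.
my ( $delay, $target ) = $refresh =~ /^\s*(\d+)\s*;\s*url\s*=\s*(\S+)/i;

print "delay:  $delay\n";
print "target: $target\n";
```

If the target printed here has fewer parameters than the URL Firebug shows for the final page, the missing ones are probably being added by the page's JavaScript rather than by the meta-refresh.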

friedo
Thanks for the pointer to Firebug. I installed it and noted there is a `<body id="wf_wellsfargo_com" onload="getPrefs();addFlash()">` line in the intermediary page that calls JavaScript functions. I'm guessing I'm out of luck since Mech can't deal with JavaScript at this point.
J Miller
You may not be out of luck; you just need to find out what those JavaScript functions are doing and make Mech do the same thing. Use Firebug to watch the HTTP transaction -- are there any POST fields or cookies that you didn't see before? The JS probably added them, so add the same things with Mech.
friedo
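
As a rough illustration of "add the same things", the sketch below sets a cookie by hand and builds a POST request with extra fields, then prints what would go over the wire. Everything specific in it is hypothetical: the cookie name `prefs`, the field `flash`, and the example.com URL are placeholders for whatever Firebug actually shows. It builds the request offline with HTTP::Cookies and HTTP::Request::Common so it can be run without touching the bank's site.

```perl
use strict;
use warnings;
use HTTP::Cookies;
use HTTP::Request::Common qw(POST);

# With Mech you would use its own jar: my $jar = $mech->cookie_jar;
my $jar = HTTP::Cookies->new;

# set_cookie(version, key, value, path, domain, port,
#            path_spec, secure, maxage, discard)
# 'prefs' / 'none' are hypothetical; copy the real name and value from Firebug.
$jar->set_cookie( 0, 'prefs', 'none', '/', 'www.example.com',
                  undef, 1, 1, 3600, 0 );

# Hypothetical extra form fields that the page's JavaScript might add.
my $req = POST 'https://www.example.com/next',
    [ sessionID => 'abc123', flash => 'no' ];

# Attach any cookies in the jar that match the request's host and path.
$jar->add_cookie_header($req);

print "Cookie: ", $req->header('Cookie'), "\n";
print "Body:   ", $req->content, "\n";
```

With a live Mech object the equivalent would be to call `set_cookie` on `$mech->cookie_jar` and then send the fields with `$mech->post( $url, \%fields )`; the cookie jar is applied automatically on each request.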
+1  A: 

If you know the location of the next page, you can try getting it after attaching the extra GET parameters using

$mech->add_header($name => $value);
Narthring
That will add a request header, but not add fields to the request URI or POST content.
brian d foy
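
To put extra fields into the request URI itself, one option is to build the URL with the URI module and pass the result to `$mech->get`. A minimal sketch, assuming a placeholder example.com URL and a made-up `extra` parameter (the real parameter names would have to be read off the browser's final URL):

```perl
use strict;
use warnings;
use URI;

# Placeholder for the temporary page's URL, which only carries sessionID.
my $uri = URI->new('https://www.example.com/wait?sessionID=abc123');

# Keep the existing query parameters and append the extra ones.
# 'extra' => 'value' is hypothetical.
$uri->query_form( $uri->query_form, extra => 'value' );

print "$uri\n";

# With a live Mech object: $mech->get($uri);
```

`query_form` takes care of URL-encoding the keys and values, which is easier to get right than concatenating strings by hand.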