I'm writing a Perl script that loads a page and extracts information from it.
Problem: I don't know how to get the right link. When I run the script, it picks up a URL like the one below, which is not what I want:
<a href="http://valeptr.com/scripts/runner.php?BA=6672&amp;hash=08c5c66839a468a11b7574e6ce02e0&amp;url=http%3A%2F%2Fdizzydollarsgpt.com%2Fmembers%2Fregister.php%3Fref%3Dthomasd24" target="_blank"><img alt="DizzyDollarsGPT" border="0" src="enter.php_files/runner.jpeg" /></a>
And I want to get this instead:
<a href="http://valeptr.com/scripts/runner.php?PA=33425"
target="_ptc" onclick="javascript:reloadpage(11)">
<img src="1appsearch.php_files/runner_007.gif"
alt="Xray-cash" border="0">
The full code is here:
#!/usr/bin/perl
#=======================================================================
#
# FILE: ValePTR.pl
#
# USAGE: ./ValePTR.pl user password
#
# DESCRIPTION:
#
# OPTIONS: ---
# REQUIREMENTS: libgetopt-declare-perl
# BUGS: ---
# NOTES: ---
# AUTHOR: Alejandro
# VERSION: 1.0
# CREATED: Monday, July 5, 2010
# REVISION: 1
#=======================================================================
use warnings;
use strict;
use HTML::TreeBuilder;
use WWW::Mechanize;
use Getopt::Long;
my($content, $search_result, @search_results);
# Build the browser object with a fake User-Agent.
my $Explorador = WWW::Mechanize->new( agent => 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030624' );
$Explorador->get("file:///home/alejandro/enter.php.html"); # Fetch the page (a local copy for now) that will receive the HTTP POST
#$Explorador->field('username','miuser'); # Find the username field and fill in the user
#$Explorador->field('password','mipass'); # Find the password field and fill in the password
#$Explorador->submit(); # Perform the HTTP POST
#print $Explorador->content();
# Parse the page content with HTML::TreeBuilder
my $page = HTML::TreeBuilder->new();
$page->parse($Explorador->content());
$page->eof();
# Collect every <a> element that has an href attribute.
@search_results = $page->look_down(
    sub { $_[0]->tag() eq 'a' and ($_[0]->attr('href')) }
);
foreach $search_result (@search_results) {
    my ($url, $title, $summary);
    # look_down in scalar context returns the first matching <a> element in the page.
    $title = $page->look_down(
        sub { $_[0]->tag() eq 'a' and ($_[0]->attr('href')) }
    );
    if ($title) {
        print 'title: ' . $title->as_HTML, "\n";
    }
}
$page->delete;
The full HTML is here: http://gist.github.com/465568
P.S. Please help, I've been stuck on this for about 3 hours without success.
To sum up, what the script picks up is the link that starts with
http://valeptr.com/scripts/runner.php?BA=
and what I want to pick up is the link that starts with:
http://valeptr.com/scripts/runner.php?PA=
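To make the goal concrete, here is a minimal sketch of one way the PA= links could be selected with HTML::TreeBuilder, by passing `look_down` an attribute filter with a regular expression. The sample markup is a shortened stand-in for the real page (the hash value and image paths are made up), and the helper name `pa_links` is hypothetical; this is not tested against the full HTML from the gist.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical sample markup standing in for the real page.
my $html = <<'HTML';
<html><body>
<a href="http://valeptr.com/scripts/runner.php?BA=6672&amp;hash=abc" target="_blank"><img alt="DizzyDollarsGPT" border="0" src="runner.jpeg" /></a>
<a href="http://valeptr.com/scripts/runner.php?PA=33425" target="_ptc"><img src="runner_007.gif" alt="Xray-cash" border="0"></a>
</body></html>
HTML

# Return the href of every <a> whose href points at runner.php?PA=...
sub pa_links {
    my ($markup) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($markup);
    # In list context look_down returns all matching elements;
    # the qr// criterion is matched against the href attribute value.
    my @hrefs = map { $_->attr('href') }
        $tree->look_down( _tag => 'a', href => qr/runner\.php\?PA=/ );
    $tree->delete;    # free the parse tree
    return @hrefs;
}

print "$_\n" for pa_links($html);
```

In the real script, the same filter could presumably be applied to `$Explorador->content()` instead of the sample string.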