Hello again. My whole application is written in PHP, bar one Perl script, which creates an MD5 hash that PHP scripts use later. The problem is that the two hashes don't match up.

PERL:

#$linkTrue = 'http://www.themobilemakeover.co.uk/mobile-makeover-appointment-booking-signup.php'
md5_hex($linkTrue);

And for testing purposes I did this in PHP:

echo md5("http://www.themobilemakeover.co.uk/mobile-makeover-appointment-booking-signup.php");

The two return different values. Does anyone know why this is?

EDIT: WHOLE PHP SCRIPT

<?php

echo md5("http://www.themobilemakeover.co.uk/mobile-makeover-appointment-booking-signup.php");

?>

WHOLE PERL SCRIPT (sorry, it's long)

#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;
require LWP::UserAgent;
sub trim($);
use DBI;
use Net::FTP;
use Digest::MD5 qw(md5 md5_hex md5_base64);

print "Content-type: text/html\n\n";
print "<html>\n<head>\n</head><body>\n";

my $ua = LWP::UserAgent->new;
$ua->timeout(10);
$ua->env_proxy;
$ua->max_redirect(0);

#my %get = ();
#for (split /\&/, $ENV{'QUERY_STRING'}) { my ($key, $val) = split /=/; $val =~ s/\+/ /g; $val =~ s/%([0-9a-fA-F]{2})/chr(hex($1))/ge; $get{$key} = $val; }
#my %post = ();
#for (split /\&/, <STDIN>) { my ($key, $val) = split /=/; $val =~ s/\+/ /g; $val =~ s/%([0-9a-fA-F]{2})/chr(hex($1))/ge; $post{$key} = $val; }
my %get = ('findAllPages' => 'true' );
my %post = ('ki' => '############################' );


sub trim($){
   my $string = shift;
   $string =~ s/^\s+//;
   $string =~ s/\s+$//;
   return $string;
}
sub extention {
   my($data) = @_;
   if( substr( trim($data), -1) eq "/" ){
      my @extArray = ('.html', '.php', '.htm', '.asp', '.shtml', '.aspx');
      foreach(@extArray){
         my $ext = $_;
         my $testResponse = $ua->get('http://' . trim($data . "index" . $ext));
         my $testResponseCode = $testResponse->code;
         if( $testResponseCode == 200 || $testResponseCode == 301 || $testResponseCode == 302 ){
            return trim($data . "index" . $ext);
            last;
         }
      }
   }else{
      return $data;
   }
}
if( defined( $get{findAllPages} ) && defined( $post{ki} ) ){
   my ($database, $hostname, $port, $password, $user );
   $database = "##########";
   $hostname = "############";
   $password = "##########";
   $user = "#########";
   my $KI = $post{ki};
   # connect to the database
   my $dsn = "DBI:mysql:database=$database;host=$hostname;";
   my $dbh = DBI->connect($dsn, $user, $password);
   my $sth = $dbh->prepare("SELECT * FROM accounts WHERE KI = '$KI' ") or die "Could not select from table" . $DBI::errstr;
   $sth->execute(); 
   if( $sth->rows != 0 ) {
      my $ref = $sth->fetchrow_hashref();
      my $domain = $ref->{website};
      my $DB_username = $ref->{db_name};
      my $DB_password = $ref->{db_pass};
      my $DB_ftpuser = $ref->{ftpuser};
      my $DB_ftppass = $ref->{ftppass};
      my $DB_ftpserver = $ref->{ftpserver};
      $sth->finish();
      $dbh->disconnect();

      chomp(my $url = trim($domain));
      # try and find  full path
      sub findFullPath {

         my($link, $landingPage) = @_;

         # strip ./ and / from beginning of string
         $link =~ s/^(?:(?:\/)|(?:\.\/))//g;

         # find out whether link is backtracking to a previous folder
         if( $link =~ m/^\.\.\// ) { # link destination is backtracking

            if( $landingPage =~ m/(?:(?:\.html)|(?:\.php)|(?:\.htm)|(?:\.asp)|(?:\.shtml)|(?:\.aspx))$/g ) {
               # find destination folder from landing page
               my @folders = split( "/", $landingPage );    
               #find size of array
               my $foldersSize = scalar @folders;
               delete $folders[$foldersSize - 1];
               $foldersSize = scalar @folders;
               my @backFolders = ( $link =~ m/\.\.\//g ); # get rid of ../
               my $amountOfBackFolders = scalar @backFolders; # find how many folders back
               for( my $x=0; $x < $amountOfBackFolders; $x++ ) {
                  my $numberToDelete = ($foldersSize - 1) - $x;
                  delete $folders[$numberToDelete];
               }
               $landingPage = join( "/", @folders );
               $link =~ s/\.\.\///g;
               return $landingPage . "/" . $link . "\n";
            } elsif( $landingPage =~ m/(?:\/)$/g ) {
               my @folders = split( "/", $landingPage );    
               #find size of array
               my $foldersSize = scalar @folders;
               delete $folders[$foldersSize - 1];
               $foldersSize = scalar @folders;
               my @backFolders = ( $link =~ m/\.\.\//g ); # get rid of ../
               my $amountOfBackFolders = scalar @backFolders; # find how many folders back
               for( my $x=0; $x < $amountOfBackFolders; $x++ ) {
                  my $numberToDelete = ($foldersSize) - $x;
                  delete $folders[$numberToDelete];
               }
               $landingPage = join( "/", @folders );
               $link =~ s/\.\.\///g;
               return $landingPage . "/" . $link . "\n";
            } else {

            }

         }else{
            if( substr( $landingPage, -1) eq "/" ){
               return $landingPage . $link;
            }else{
               my @splitLandingPage = split( "/", $landingPage );
               my $amountSplit = scalar @splitLandingPage;
               my $toDelete = $amountSplit - 1;
               my $lastEntry = $splitLandingPage[$toDelete];
               if( $lastEntry =~ m/(?:(?:com)|(?:co\.uk)|(?:net)|(?:org)|(?:cc)|(?:tv)|(?:info)|(?:org\.uk)|(?:me\.uk)|(?:biz)|(?:name)|(?:eu)|(?:uk\.com)|(?:eu\.com)|(?:gb\.com)|(?:gb\.net)|(?:uk\.net)|(?:me)|(?:mobi))$/g ) {
                  return join( "/", @splitLandingPage ) . "/" . $link . "\n";
               }else{
                  delete $splitLandingPage[$toDelete];
                  return join( "/", @splitLandingPage ) . "/" . $link . "\n";
               }
            }
         }
      }

      # get HTTP details
      my $response = $ua->get('http://' . trim($url));
      my $responseCode = $response->code;
      my $responseLocation = $response->header( 'Location' );

      # continue only if status code is 200, 301 or 302
      if( $responseCode != 200 && $responseCode != 301 && $responseCode != 302 ){
          print "<span class=\"red\"> error: http://" . trim($url) . "Domain name invalid, please use differnet domain name: http status - " . $responseCode . "</span><br />\n";
          die;
      }

      # change url if domain status is 301 or 302
      if( $responseCode == 301 || $responseCode == 302 ){
         if($response->header( 'Location' ) =~ m/^http:\/\/www\./g ) {
            $url = substr( $response->header( 'Location' ), 11 );
         }elsif($response->header( 'Location' ) =~ m/^http:\/\//g ) {
            $url = substr( $response->header( 'Location' ), 7 );
         }else{
            $url = findFullPath($response->header( 'Location' ), $url);
         }
      }

      my @pagesArray = ($url);
      my @pagesScannedArray;
      my @mainPagesArray;
      my @pagesNotScanned;
      my $z = 0;

      #print "\nGethering all valid links from " . $domain . "...\n\n";

      while ( @pagesArray && $z < 100 ) {
         # get the next in queue for proccessing
         my $page = trim(shift @pagesArray);
         if( ! grep {$_ eq trim($page)} @pagesNotScanned ) {
            # check page http status
            $response = $ua->get("http://" . trim($page));
            $responseCode = $response->code;
            if( $responseCode == 200 || $responseCode == 301 || $responseCode == 302 ){
               # change page url if 301 or 302 redirect
               if( $responseCode == 301 || $responseCode == 302 ){
                  if($response->header( 'Location' ) =~ m/^http:\/\/www\./g ) {
                     $page = substr( $response->header( 'Location' ), 11 );
                  }elsif($response->header( 'Location' ) =~ m/^http:\/\//g ) {
                     $page = substr( $response->header( 'Location' ), 7 );
                  }else{
                     $page = findFullPath($response->header( 'Location' ), $url);
                  }
               }
               # connect to page and get contents
               if( my $pageData = get "http://" . trim($page) ) {
                  # get all links on page
                  my @pageLinksArray = ( $pageData =~ m/href=["']([^"']*)["']/g );
                  # foreach link on the page
                  foreach( @pageLinksArray ) {
                      my $link = trim($_);
                     # remove url if located on same domain
                     $link =~ s/(?:http:\/\/)?(?:www\.)?$url//g;
                     # if link is format we are looking for
                     if( $link =~ m/(?:(?:\.html)|(?:\.php)|(?:\.htm)|(?:\.asp)|(?:\.shtml)|(?:\.aspx)|(?:\/))$/ ) {
                        # if link is outbound
                        if( $link =~ m/^http:\/\//g ) {
                           if( ! grep {$_ eq trim($link)} @pagesNotScanned ) {
                              if( ! grep {$_ eq trim($page)} @mainPagesArray ) {
                                 push ( @pagesNotScanned, trim($link) );
                              }
                           }
                        }else{
                           # find full path for link
                           my $newUrl = &findFullPath(trim($link), trim($page));
                           # if link has not already been claimed to be a main page
                           if( ! grep {$_ eq trim($newUrl)} @mainPagesArray ) {
                              # if link is not already in queue
                              if( ! grep {$_ eq trim($newUrl)} @pagesArray ) {
                                 push ( @pagesArray, trim($newUrl) );
                              }
                           }
                        }
                     }
                  }
                  if( ! grep {$_ eq trim($page)} @mainPagesArray ) {
                     push ( @mainPagesArray, trim($page) );
                  }
               }
            }else{
               if( ! grep {$_ eq trim($page)} @pagesNotScanned ) {
                  if( ! grep {$_ eq trim($page)} @mainPagesArray ) {
                     push ( @pagesNotScanned, trim($page) );
                  }
               }
            }
         }
         $z++;
      }

      if( scalar @mainPagesArray != 0 ) {
         my ($database, $hostname, $port, $password, $user );
         $database = $DB_username;
         $hostname = "###########";
         $password = $DB_password;
         $user = $DB_username;

         # connect to the database
         my $dsn = "DBI:mysql:database=$database;host=$hostname;";
         my $dbh = DBI->connect($dsn, $user, $password) or die " error: Couldn't connect to database: " . DBI->errstr;

        print "\nTesting links' extentions from " . $domain . "...\n\n";

        my $root;
        my $ftp = Net::FTP->new($DB_ftpserver, Debug => 0) or die "Cannot connect to some.host.name: $@";
        $ftp->login($DB_ftpuser, $DB_ftppass) or die "Cannot login ", $ftp->message;
        my @list = $ftp->dir;
        if( scalar @list != 0 ) {
            foreach( @list ){
                if( $_ =~ m/((?:www)|(?:public_html)|(?:htdocs))$/g ){
                    $root = $1;
                    last;
                }
            }
        }
        if( $root eq "" ) {
            print "error: could not identify root directory.<br />\n";
            die;
        }

        foreach( @mainPagesArray ) {
            my $webpage = &extention(trim($_));
            if( trim($webpage) ne trim($domain) ){
                my $webpageQuote = $dbh->quote("http://www." . $webpage);
                my $sth = $dbh->prepare("SELECT * FROM page_names WHERE linkTrue = $webpageQuote ") or die "Could not select from table" . $DBI::errstr;
                $sth->execute(); 
                if( $sth->rows == 0 ) {
                    print "http://www." . $webpage . "<br />\n";
                    my $linkTrue = $dbh->quote("http://www." . $webpage);
                    my $string = ($webpage =~ s/^$domain//g);
                    my $linkFromRoot = $dbh->quote($root . $webpage);
                    my $page_name = $dbh->quote("");
                    my $table_name = $dbh->quote(md5_hex(trim($linkTrue)));
                    my $navigation = $dbh->quote("");
                    my $location = $dbh->quote("");
                    $dbh->do("INSERT INTO page_names (linkFromRoot, linkTrue, page_name, table_name, navigation, location) VALUES ( $linkFromRoot, $linkTrue, $page_name, $table_name, $navigation, $location )") or die " error: Couldn't connect to database: " . DBI->errstr;
                }
            }
         }
      }else{
         print "<span class=\"red\"> error: No pages where found. This CMS is designed for pre-existing sites. Please contact support for more information.</span><br />\n";
      }
   }else{
      print "<span class=\"red\"> error: input key incorrerct.</span><br />\n";
   }
}else{
   print "<span class=\"red\"> error: This area is forbidden please locate back to www.plugnplaycms.co.uk</span><br />\n";
}

print "</body>\n</html>";

I believe it's on line 274. The code might be messy, but it's my first Perl script; I've only been at it a week.

I think I've got it: $dbh->quote() adds single quotes around the value, so the quotes get hashed along with the URL.

http://www.themobilemakeover.co.uk/index.php
HEX: 58030da397e8a071bc192e67744faeb3 VALUE: ['http://www.themobilemakeover.co.uk/index.php']
http://www.themobilemakeover.co.uk/about-us-the-mobile-makeover.php
HEX: 569c081a2974da39758a3cbf3c3407d2 VALUE: ['http://www.themobilemakeover.co.uk/about-us-the-mobile-makeover.php']
http://www.themobilemakeover.co.uk/beauty-products-used.php
HEX: ac94f84cf6b27bca0c23cd6b0e0f1fc9 VALUE: ['http://www.themobilemakeover.co.uk/beauty-products-used.php']
http://www.themobilemakeover.co.uk/beauty-treatments.php
HEX: e88d7e8e16ffc0a72b56a884d4c6c06b VALUE: ['http://www.themobilemakeover.co.uk/beauty-treatments.php']
http://www.themobilemakeover.co.uk/contact.php
HEX: 8924fa24bdde1c4e072f99826d957b77 VALUE: ['http://www.themobilemakeover.co.uk/contact.php']
http://www.themobilemakeover.co.uk/pamper-parties.php
HEX: 1f2fae70048359734a9d1b3ca29cce55 VALUE: ['http://www.themobilemakeover.co.uk/pamper-parties.php']
http://www.themobilemakeover.co.uk/mobile-makeover-appointment-booking.php
HEX: 9961f75109590c3924e4018768ecd44e VALUE: ['http://www.themobilemakeover.co.uk/mobile-makeover-appointment-booking.php']
http://www.themobilemakeover.co.uk/sitemap/index.php
HEX: fbca4996156b038f4635467ee13e1615 VALUE: ['http://www.themobilemakeover.co.uk/sitemap/index.php']
http://www.themobilemakeover.co.uk/accessibility/index.php
HEX: 6f03046cbe90c490e4993c5325a44aa7 VALUE: ['http://www.themobilemakeover.co.uk/accessibility/index.php']
http://www.themobilemakeover.co.uk/terms/index.php
HEX: 5304b5e9bd933fb920a4f8749c27094b VALUE: ['http://www.themobilemakeover.co.uk/terms/index.php']
http://www.themobilemakeover.co.uk/beauty-treatments2.php
HEX: 96225fa657ef60b4969d277d01d8b577 VALUE: ['http://www.themobilemakeover.co.uk/beauty-treatments2.php']
http://www.themobilemakeover.co.uk/beauty-treatments3.php
HEX: 327c1bc37354aad202c90efe0dfa756b VALUE: ['http://www.themobilemakeover.co.uk/beauty-treatments3.php']
http://www.themobilemakeover.co.uk/wedding-and-special-occasions.php
HEX: 54c074a1881a0c958c7c2b8ff88f63d6 VALUE: ['http://www.themobilemakeover.co.uk/wedding-and-special-occasions.php']
http://www.themobilemakeover.co.uk/mobile-makeover-appointment-booking-signup.php
HEX: 486c944b10ef539aa7ba4bfe607861f2 VALUE: ['http://www.themobilemakeover.co.uk/mobile-makeover-appointment-booking-signup.php']

+10  A: 

When I try it, both programs return a4cbeef10b3c6d44ca30d96370619eef

I have a feeling you're not giving us the whole picture. Show us the code leading up to this. In particular, check for trailing newlines: have you used chomp in the Perl script?
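To illustrate the newline point, here is a minimal sketch. The URL is a made-up example and the trailing "\n" just simulates a value read from a file or STDIN; it is not taken from your script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical input: a value that arrived with a trailing newline.
my $raw = "http://www.example.com/page.php\n";

# Hash the string exactly as it arrived, newline included.
my $with_newline = md5_hex($raw);

# chomp removes the trailing newline before hashing.
chomp(my $clean = $raw);
my $without_newline = md5_hex($clean);

print "with newline:    $with_newline\n";
print "without newline: $without_newline\n";
print "digests differ\n" if $with_newline ne $without_newline;
```

A single invisible character is enough to give a completely different digest, which is why it is worth ruling out first.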

Try for yourself. Here is the complete PHP script I used:

<?php
echo md5("http://www.themobilemakeover.co.uk/mobile-makeover-appointment-booking-signup.php");
?>

And here is the complete Perl script I used:

#!/usr/bin/perl
use Digest::Perl::MD5 'md5_hex';

$linkTrue = 'http://www.themobilemakeover.co.uk/mobile-makeover-appointment-booking-signup.php';
print md5_hex($linkTrue);

edit:

Which of the two scripts is not returning that value for md5? That's the one that has a bug. Log the value that you're passing to md5 (with '[' before and ']' after, to detect extra whitespace). Does that value match what you expect?
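Something along these lines would do for the logging. It is only a sketch: debug_md5 is a name I made up, and it assumes you are using Digest::MD5 as in your script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# debug_md5 is a hypothetical helper: it prints the exact value being
# hashed inside [ ] so stray quotes, spaces or newlines become visible,
# then returns the digest unchanged.
sub debug_md5 {
    my ($value) = @_;
    my $hex = md5_hex($value);
    print "HEX: $hex VALUE: [$value] LENGTH: " . length($value) . "\n";
    return $hex;
}

my $url = 'http://www.themobilemakeover.co.uk/mobile-makeover-appointment-booking-signup.php';

debug_md5($url);       # the plain URL
debug_md5("'$url'");   # the same URL wrapped in single quotes
```

If anything shows up between the brackets that you did not expect, that is what is changing the hash.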

edit 2:

It looks like you found it, right? It's the single quotes. This:

print md5_hex("'http://www.themobilemakeover.co.uk/mobile-makeover-appointment-booking-signup.php'");

Notice the extra quotes. The above line gives me: 486c944b10ef539aa7ba4bfe607861f2
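One way to avoid it, as a rough sketch: take the hash of the raw URL first and only quote afterwards for the SQL. So that this runs without a database, fake_quote below is a hypothetical stand-in for $dbh->quote(); it only wraps the value in single quotes and doubles any embedded ones, which is a simplification of what a real driver does. The variable names $rawLink and $tableHash are mine as well.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical stand-in for $dbh->quote(), so no database is needed here.
# Real drivers escape more than this; it is only for illustration.
sub fake_quote {
    my ($s) = @_;
    $s =~ s/'/''/g;
    return "'$s'";
}

my $webpage = 'themobilemakeover.co.uk/mobile-makeover-appointment-booking-signup.php';

# Build the raw URL and hash it BEFORE any quoting is applied.
my $rawLink   = "http://www." . $webpage;
my $tableHash = md5_hex($rawLink);

# Quote only the values that go into the SQL statement.
my $linkTrue   = fake_quote($rawLink);
my $table_name = fake_quote($tableHash);

print "linkTrue:   $linkTrue\n";
print "table_name: $table_name\n";
```

Hashed this way, the Perl digest is computed over the same string that PHP's md5() sees, so the two should agree.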

amarillion
Ditto here. Make sure you don't have any extra whitespace on the ends of either string.
jamieb
I'm not sure what you're saying (I'm sure it's the Perl output which is incorrect).
Phil Jackson
Look at the EDIT in the question again.
Phil Jackson
And look at my 2nd edit...
amarillion