Hi,

I've been having trouble downloading images from a website that requires a login. The images are only visible when you are logged in, and even then you cannot view one directly by pasting its URL into a new tab or window (it redirects to an error page, so I guess the containing folder has been protected with .htaccess).

Anyway, the code I have below allows me to log in and grab the HTML content, which works well - but I cannot grab the images ... this is where I need help!

<?php
// curl.php

class Curl { 

    public $curl;           // cURL handle, created in get() and postForm()
    public $cookieJar = "";

    public function __construct($cookieJarFile = 'cookies.txt') { 
        $this->cookieJar = $cookieJarFile; 
    } 

    function setup() { 
        $header = array(); 
        $header[0]  = "Accept: text/xml,application/xml,application/xhtml+xml,"; 
        $header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/gif;q=0.8,image/x-bitmap;q=0.8,image/jpeg;q=0.8,image/png,*/*;q=0.5"; 
        $header[]   = "Cache-Control: max-age=0"; 
        $header[]   = "Connection: keep-alive"; 
        $header[]   = "Keep-Alive: 300"; 
        $header[]   = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7"; 
        $header[]   = "Accept-Language: en-us,en;q=0.5"; 
        $header[]   = "Pragma: "; // browsers keep this blank. 

        curl_setopt($this->curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7'); 
        curl_setopt($this->curl, CURLOPT_HTTPHEADER, $header); 
        curl_setopt($this->curl, CURLOPT_COOKIEJAR, $this->cookieJar); 
        curl_setopt($this->curl, CURLOPT_COOKIEFILE, $this->cookieJar); 
        curl_setopt($this->curl, CURLOPT_AUTOREFERER, true); 
        curl_setopt($this->curl, CURLOPT_FOLLOWLOCATION, true); 
        curl_setopt($this->curl, CURLOPT_RETURNTRANSFER, true); 
    } 

    function get($url) { 
        $this->curl = curl_init($url); 
        $this->setup(); 

        return $this->request(); 
    }

    function getAll($reg, $str) { 
        preg_match_all($reg, $str, $matches); 
        return $matches[1]; 
    } 

    function postForm($url, $fields, $referer = '') { 
        $this->curl = curl_init($url); 
        $this->setup(); 
        curl_setopt($this->curl, CURLOPT_URL, $url); 
        curl_setopt($this->curl, CURLOPT_POST, 1); 
        curl_setopt($this->curl, CURLOPT_REFERER, $referer); 
        curl_setopt($this->curl, CURLOPT_POSTFIELDS, $fields); 
        return $this->request(); 
    } 

    function getInfo($info) { 
        $info = ($info == 'lasturl') ? curl_getinfo($this->curl, CURLINFO_EFFECTIVE_URL) : curl_getinfo($this->curl, $info); 
        return $info; 
    } 

    function request() { 
        return curl_exec($this->curl); 
    } 
} 

?>

And below is the page that uses it.

<?php
// data.php

include('curl.php'); 
$curl = new Curl(); 

$url = "http://domain.com/login.php"; 
$newURL = "http://domain.com/go_here.php"; 

$username = "user";
$password = "pass";

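// URL-encode the values so characters such as & or = in the password survive
$fields = http_build_query(array('user' => $username, 'pass' => $password));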

// Calling URL 
$referer = "http://domain.com/refering_page.php"; 

$html = $curl->postForm($url, $fields, $referer); 

$html = $curl->get($newURL);
echo $html;

?>

I've tried putting the direct URL for the image into $newURL, but that doesn't get the image; it simply returns an error saying that the folder is not available to view directly. I've tried varying the above in different ways, but I haven't managed to get an image. The closest I have come is a page reporting error 405 and/or 406, not the image I want.

Any help would be great!

A: 

Wow,

Seems like a convoluted issue.

What I would do is compare a browser session with your PHP code at the HTTP layer and see what's different.

Grab Wireshark and connect successfully using your browser. You will need to filter out all other traffic and capture only what's on port 80. If you right-click a packet and choose "Follow TCP Stream", it will show you the HTTP headers and the body of the page.

Then do the same but this time with the PHP script.

Then compare the headers and see what's different. Maybe you're missing one or two headers, maybe you need to go to a page first, maybe your PHP script isn't sending the right cookies.
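
If it helps, here is a rough PHP sketch of that comparison step. It assumes you have pasted the browser's request headers (copied from Wireshark) into a string, and that you have captured the script's own request headers by setting CURLINFO_HEADER_OUT to true on the handle and reading it back with curl_getinfo($ch, CURLINFO_HEADER_OUT) after curl_exec(). The function names parseHeaders and diffHeaders are made up for illustration.

```php
<?php
// Parse a raw HTTP request header block into [lowercased-name => value].
// The request line ("GET /x HTTP/1.1") and blank lines do not match and are skipped.
function parseHeaders(string $raw): array {
    $headers = [];
    foreach (preg_split('/\r?\n/', trim($raw)) as $line) {
        if (preg_match('/^([A-Za-z0-9-]+):\s*(.*)$/', $line, $m)) {
            $headers[strtolower($m[1])] = trim($m[2]);
        }
    }
    return $headers;
}

// List the headers the browser sent that the script either did not send
// or sent with a different value.
function diffHeaders(string $browserRaw, string $scriptRaw): array {
    $browser = parseHeaders($browserRaw);
    $script  = parseHeaders($scriptRaw);
    $diff = [];
    foreach ($browser as $name => $value) {
        if (!array_key_exists($name, $script)) {
            $diff[$name] = "missing (browser sent: $value)";
        } elseif ($script[$name] !== $value) {
            $diff[$name] = "differs (browser: $value, script: {$script[$name]})";
        }
    }
    return $diff;
}
```

Calling print_r(diffHeaders($browserHeaders, $scriptHeaders)) should then point you at whichever header (Referer, Cookie, Accept and so on) the script is getting wrong.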

Cetra

A: 

From the site's behavior it seems to me that this is not a session (cookie) problem; if it were, opening the image in another tab would let you download it.

Check the HTTP Referer header; it is the first suspect on my list.
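
To test that guess, something along these lines might work. It is only a sketch: I am assuming the site wants the Referer to be the logged-in page that embeds the image, and that the cookie file written at login is reused. The helper names imageRequestOptions and fetchImage are hypothetical and not part of your Curl class.

```php
<?php
// Hypothetical helper: build the cURL options for fetching a protected image.
// $pageUrl is assumed to be the logged-in page that embeds the image.
function imageRequestOptions(string $pageUrl, string $cookieJar): array {
    return [
        CURLOPT_REFERER        => $pageUrl,   // send the embedding page as the Referer
        CURLOPT_COOKIEFILE     => $cookieJar, // reuse the cookies saved at login
        CURLOPT_COOKIEJAR      => $cookieJar,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_RETURNTRANSFER => true,       // return the raw image bytes
    ];
}

// Fetch the image and save it to disk.
// Returns the number of bytes written, or false if the request or the write fails.
function fetchImage(string $imageUrl, string $pageUrl, string $cookieJar, string $saveAs) {
    $ch = curl_init($imageUrl);
    curl_setopt_array($ch, imageRequestOptions($pageUrl, $cookieJar));
    $bytes = curl_exec($ch);
    if ($bytes === false) {
        return false;
    }
    return file_put_contents($saveAs, $bytes);
}
```

After your postForm() login you would call fetchImage() with the image URL, $newURL as the page, 'cookies.txt' as the cookie jar and a local file name. If the image still does not come back, the Referer was probably not the culprit.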

Iacopo