I have to parse a page in PHP. The page's URL returns a 302 Moved Temporarily header and redirects to a "not found" page. Its data can be retrieved manually through the console of the Firebug add-on for Mozilla Firefox, but when I try to fetch it with PHP, I get the not-found page in return. How can I parse that page? Please suggest an approach.

Edit: I am doing something like this to get the page's content:

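$file_results = @fopen("http://www.the url to be parses", "rb");
$parsed_results = '';
if ($file_results)
{
    // Read in chunks of up to 125000 bytes until nothing more is returned.
    // The length must be an integer, not a string.
    while (($data3 = fread($file_results, 125000)) !== false && $data3 !== '')
    {
        $parsed_results .= $data3;
    }
    fclose($file_results);
}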
A: 

You need to read the response headers, see where the server is redirecting you, and make another request to get the actual resource. It's kind of a pain, but that's how the protocol works. Most browsers do this transparently.
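The "read the header, see where it redirects" step could be sketched roughly as follows. This is only a minimal illustration: find_redirect_target() is a hypothetical helper name, not something from PHP itself, and it assumes you already have the raw header lines of one response as an array (for example, from get_headers($url) without the associative flag).

```php
<?php
// Hypothetical helper (not a built-in): given the raw header lines of a
// response, return the redirect target, or null if it is not a redirect.
function find_redirect_target(array $headerLines): ?string
{
    // The first line is the status line, e.g. "HTTP/1.1 302 Moved Temporarily".
    if (!isset($headerLines[0])
        || !preg_match('#^HTTP/\S+\s+(\d{3})#', $headerLines[0], $m)) {
        return null;
    }

    $status = (int) $m[1];
    if ($status < 300 || $status >= 400) {
        return null; // not a 3xx response, so nothing to follow
    }

    foreach ($headerLines as $line) {
        // Header names are case-insensitive, so compare without regard to case.
        if (stripos($line, 'Location:') === 0) {
            return trim(substr($line, strlen('Location:')));
        }
    }

    return null; // a 3xx response without a Location header
}

// Possible usage (needs network access, so left commented out):
// $target = find_redirect_target(get_headers('http://google.com'));
// if ($target !== null) { print file_get_contents($target); }
```

The returned value may be a relative URL on some servers; resolving it against the original URL is left out of this sketch.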

CaptnCraig
I would really appreciate it if you could give an example.
developer
What method did you use to make the request in the first place?
CaptnCraig
I have edited my question; please take a look.
developer
I'm not sure of the best way to do this in PHP, but that is the general idea of how you can get the actual resource.
CaptnCraig
Okay :-( Thanks anyway.
developer
+1  A: 

You can use get_headers() to find all the headers while you're being redirected.

$url = 'http://google.com';
$headers = get_headers($url, 1);

print 'First step gave: ' . $headers[0] . '<br />';

// uncomment below to see the different redirection URLs
// print_r($headers['Location']);

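// $headers['Location'] will contain either the redirect URL (a string), or an
// array of redirection URLs. isset($string[0]) is also true for a string,
// so test with is_array() instead.
$location = isset($headers['Location']) ? $headers['Location'] : null;
$first_redirect_url = is_array($location) ? $location[0] : $location;

if ($first_redirect_url === null) {
    print 'No redirect was reported.<br />';
} else {
    print "First redirection is to: {$first_redirect_url}<br />";

    // assuming you have fopen wrappers enabled...
    print file_get_contents($first_redirect_url);
}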

Then just keep following the redirects until you get the resource you want.
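If what you need is the body of the first (302) response itself rather than the redirect target, one option is to tell PHP's HTTP stream wrapper not to follow redirects. The sketch below is an assumption about what might help here, not a confirmed fix: no_redirect_context_options() and fetch_first_response() are hypothetical helper names, the 10-second timeout is arbitrary, and whether the server sends any useful body along with the 302 depends entirely on the server.

```php
<?php
// Hypothetical helper: HTTP context options that stop PHP from
// automatically following Location headers.
function no_redirect_context_options(): array
{
    return [
        'http' => [
            'method'          => 'GET',
            'follow_location' => 0,    // do not follow redirects
            'ignore_errors'   => true, // still return the body on failure statuses
            'timeout'         => 10,   // seconds; arbitrary value for this sketch
        ],
    ];
}

// Hypothetical helper: fetch only the first response for $url.
// Returns ['headers' => [...], 'body' => '...'], or null if the request failed.
function fetch_first_response(string $url): ?array
{
    $context = stream_context_create(no_redirect_context_options());
    $handle  = @fopen($url, 'rb', false, $context);
    if ($handle === false) {
        return null;
    }

    // For http:// streams, wrapper_data holds the raw response header lines.
    $meta = stream_get_meta_data($handle);
    $body = stream_get_contents($handle);
    fclose($handle);

    return [
        'headers' => isset($meta['wrapper_data']) ? $meta['wrapper_data'] : [],
        'body'    => $body,
    ];
}

// Possible usage (needs network access, so left commented out):
// $response = fetch_first_response('http://google.com');
// if ($response !== null) { print $response['headers'][0]; }
```

If the body of the 302 response turns out to be empty, the data seen in the browser probably depends on something else the browser sends, such as cookies or other request headers.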

Owen
Can you please elaborate on your answer a bit?
developer
Did you try the code?
Owen
Yes, but this is not what I want. I want the content of the very first page, not of the redirected page.
developer