Hey,

How can I get the real (final) URL behind a shortened URL in PHP? e.g.

bit.ly/f00b4r ==> http://www.google.com/search?q=cute+kittens

In C#, the solution is this:

You should issue a HEAD request to the URL using an HttpWebRequest instance. In the returned HttpWebResponse, check the ResponseUri.

Just make sure AllowAutoRedirect is set to true on the HttpWebRequest instance (it is true by default). (Thx, casperOne)

And the code is

// Requires: using System.Net;
private static string GetRealUrl(string url)
{
    WebRequest request = WebRequest.Create(url);
    request.Method = WebRequestMethods.Http.Head;
    // Dispose the response so the underlying connection is released.
    using (WebResponse response = request.GetResponse())
    {
        return response.ResponseUri.ToString();
    }
}

(Thx, Fredrik Mork)

But I want to do it in PHP. HOWTO? :)

A: 

CREDIT GOES TO http://forums.devshed.com/php-development-5/curl-get-final-url-after-inital-url-redirects-544144.html

function get_web_page( $url )
{
    $options = array(
        CURLOPT_RETURNTRANSFER => true,     // return the response instead of printing it
        CURLOPT_HEADER         => true,     // include headers in the returned content
        CURLOPT_FOLLOWLOCATION => true,     // follow redirects
        CURLOPT_ENCODING       => "",       // handle all encodings
        CURLOPT_USERAGENT      => "spider", // who am i
        CURLOPT_AUTOREFERER    => true,     // set referer on redirect
        CURLOPT_CONNECTTIMEOUT => 120,      // timeout on connect
        CURLOPT_TIMEOUT        => 120,      // timeout on response
        CURLOPT_MAXREDIRS      => 10,       // stop after 10 redirects
    );

    $ch      = curl_init( $url );
    curl_setopt_array( $ch, $options );
    $content = curl_exec( $ch );
    $err     = curl_errno( $ch );
    $errmsg  = curl_error( $ch );
    $header  = curl_getinfo( $ch );         // associative array; 'url' is the last effective URL
    curl_close( $ch );

    // Uncomment to return the error details and content with the transfer info:
    //$header['errno']   = $err;
    //$header['errmsg']  = $errmsg;
    //$header['content'] = $content;
    return $header;
}

$thisurl = "http://www.example.com/redirectfrom";
$myUrlInfo = get_web_page( $thisurl );
echo $myUrlInfo["url"];
Zachary Burt
+1  A: 
<?php
$url = 'http://www.example.com';

print_r(get_headers($url));

print_r(get_headers($url, 1));
?>
Robert French
Parsing the Location header would probably work, but what if there are two (or more) levels of redirection? (Not what you generally see, but what if the destination site sets up some redirects the day it releases a new version of the site?)
Pascal MARTIN
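
To illustrate the multi-level case raised in the comment above, here is a minimal sketch, not a definitive implementation. It assumes the headers come from get_headers($url, 1), which, as far as I know, follows redirects by default and collects a repeated header such as Location into an array with one entry per hop. The helper name last_location and the sample array are hypothetical, made up for this example. Relative Location values are not resolved here.

```php
<?php
/**
 * Hypothetical helper (not part of PHP): pick the last Location value from
 * an array shaped like the result of get_headers($url, 1).
 * Returns $fallback when no redirect header is present.
 */
function last_location(array $headers, string $fallback): string
{
    foreach ($headers as $name => $value) {
        // Numeric keys hold status lines; header names are compared
        // case-insensitively because servers differ in capitalization.
        if (is_string($name) && strcasecmp($name, 'Location') === 0) {
            // Several redirects: array with one entry per hop, last one wins.
            $last = is_array($value) ? end($value) : $value;
            if (is_string($last) && $last !== '') {
                return $last;
            }
        }
    }
    return $fallback; // no redirect happened
}

// Made-up sample mimicking two redirects followed by a 200 response.
$sample = [
    0          => 'HTTP/1.1 301 Moved Permanently',
    'Location' => ['http://example.com/a', 'http://example.com/b'],
    1          => 'HTTP/1.1 302 Found',
    2          => 'HTTP/1.1 200 OK',
];

echo last_location($sample, 'http://example.com/short'), "\n"; // prints http://example.com/b
```

With a real short URL you would pass get_headers($shortUrl, 1) as the first argument and the short URL itself as the fallback.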
+1  A: 

Did you read the bit.ly API documentation, specifically here?

I can't see the issue. Are you talking about possible redirects?

Gabriel Sosa
If you use a specific API, you will have to write new, specific code for each distinct URL-shortening service; considering there are quite a few of those, you will never stop coding and testing... A "generic" solution that works with any service would probably be easier, at least in the long term.
Pascal MARTIN
Right! That is why I was asking :P
Gabriel Sosa
Seriously... I hate down votes without any comment.
Gabriel Sosa
+1  A: 

By the time I had tried it out, you had already found the answer.

Still, I would have gone with something like this:

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://bit.ly/tqdUj");
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_exec($ch);

$url = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);

curl_close($ch);

var_dump($url);

Some explanations:

  • the requested URL is the short one (CURLOPT_URL)
  • you don't want the headers in the output (CURLOPT_HEADER set to false)
  • you want to make sure nothing is printed (CURLOPT_RETURNTRANSFER) -- probably useless here, since no body is requested
  • you do not want the body; i.e., you want a HEAD request, not a GET (CURLOPT_NOBODY)
  • you want redirects to be followed, of course (CURLOPT_FOLLOWLOCATION)
  • once the request has been executed, you want to get the "real" URL that has been fetched (CURLINFO_EFFECTIVE_URL)

And, here, you get :

string 'http://wordpress.org/extend/plugins/wp-pubsubhubbub/' (length=52)

(Comes from one of the last tweets I saw that contained a short URL)


This should work with any URL-shortening service, independently of their specific API.

You might also want to tweak some other options, like timeouts; see curl_setopt for more information.
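
As a rough sketch of the timeout tweak just mentioned, the same HEAD-request approach can be wrapped in a function that also caps the number of redirects and reports failure. The function name resolve_short_url, the 10-second default, and the choice to return null on error are assumptions made for this example, not something from the answer above.

```php
<?php
/**
 * Hypothetical helper: resolve a short URL to its final destination with a
 * HEAD request. Returns null if the request fails or times out.
 */
function resolve_short_url(string $url, int $timeoutSeconds = 10): ?string
{
    $ch = curl_init($url);
    if ($ch === false) {
        return null;
    }

    curl_setopt_array($ch, [
        CURLOPT_NOBODY         => true,            // HEAD request, no body
        CURLOPT_RETURNTRANSFER => true,            // never echo anything
        CURLOPT_FOLLOWLOCATION => true,            // follow redirects
        CURLOPT_MAXREDIRS      => 10,              // guard against redirect loops
        CURLOPT_CONNECTTIMEOUT => $timeoutSeconds, // limit on connecting
        CURLOPT_TIMEOUT        => $timeoutSeconds, // limit on the whole request
    ]);

    // curl_exec() returns false on failure (timeout, DNS error, too many redirects...).
    if (curl_exec($ch) === false) {
        return null;
    }

    $final = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);

    // The handle is released when $ch goes out of scope.
    return (is_string($final) && $final !== '') ? $final : null;
}
```

Calling var_dump(resolve_short_url('http://bit.ly/tqdUj')); should then dump either the final URL or NULL. Note that some servers reject HEAD requests; in that case a GET request would be needed instead.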

Pascal MARTIN