Given a URL (e.g. http://www.foo.com/alink.pl?page=2), I want to determine whether I am being redirected to another URL. I'd also like to know the final URL (e.g. http://www.foo.com/other_link.pl). Finally, I want to be able to do this in both Perl and Groovy.

+9  A: 

Well, I know nothing about either Perl or Groovy, so I'll give you an answer from an HTTP point of view, and you'll have to adapt it.

Normally, you make an HTTP request and get back a response code along with some HTML text. The response code for success is 200. Most response codes in the 300 range are some form of redirect (304 Not Modified is a notable exception), and the new address is sent in the Location response header.
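
As a rough illustration of that rule, here is a minimal Perl sketch that classifies a status code. The helper name is_redirect_status and the exact set of codes are my own assumptions, not part of any library; the set is limited to the 3xx codes that normally carry a Location header.

```perl
use strict;
use warnings;

# Assumption: treat only the 3xx codes that normally carry a Location
# header as redirects. 304 Not Modified, for example, is deliberately left out.
my %REDIRECT_STATUS = map { $_ => 1 } (301, 302, 303, 307, 308);

# Hypothetical helper: returns 1 if the status code is a redirect, else 0.
sub is_redirect_status {
    my ($code) = @_;
    return exists $REDIRECT_STATUS{$code} ? 1 : 0;
}

# Print a verdict for a few sample codes.
for my $code (200, 301, 302, 304, 404) {
    my $verdict = is_redirect_status($code) ? 'redirect' : 'not a redirect';
    print "$code: $verdict\n";
}
```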

James Curran
+1  A: 

In Perl you can use LWP::UserAgent for that. I guess the easiest way is to add a response_redirect handler using add_handler.

Leon Timmermans
+5  A: 

Referring to James's answer - sample HTTP session:

$ telnet www.google.com 80
HEAD / HTTP/1.1
HOST: www.google.com


HTTP/1.1 302 Found
Location: http://www.google.it/
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Set-Cookie: ##############################
Date: Thu, 30 Oct 2008 20:03:36 GMT
Server: ####
Content-Length: 218

Using HEAD instead of GET you get only the header. "302" means a temporary redirection, "Location:" is where you are redirected to.
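
To make the reading of that response concrete, here is a small Perl sketch that pulls the status code and the Location header out of a raw response head like the one above. The helper name parse_redirect is made up for illustration, and the sample text is a shortened copy of the session. In real code an HTTP library would do this parsing for you.

```perl
use strict;
use warnings;

# Hypothetical helper: given the raw head of an HTTP response (status line
# plus header lines), return the status code and the Location header, if any.
sub parse_redirect {
    my ($raw) = @_;
    my ($status_line, @header_lines) = split /\r?\n/, $raw;
    return unless defined $status_line;

    my ($code) = $status_line =~ m{^HTTP/\d(?:\.\d)?\s+(\d{3})};
    return unless defined $code;

    my $location;
    for my $line (@header_lines) {
        last if $line eq '';    # a blank line ends the headers
        # Header names are case-insensitive.
        if ($line =~ /^Location:\s*(.*?)\s*$/i) {
            $location = $1;
            last;
        }
    }
    return ($code, $location);
}

# Shortened copy of the response from the telnet session.
my $raw = join "\n",
    'HTTP/1.1 302 Found',
    'Location: http://www.google.it/',
    'Cache-Control: private',
    'Content-Length: 218',
    '';

my ($code, $location) = parse_redirect($raw);
print "$code -> $location\n";
```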

Federico Ramponi
+10  A: 

In Perl:

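use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;

# LWP follows redirects for GET by default; ->previous is set if it followed any.
my $request  = HTTP::Request->new( GET => 'http://google.com/' );
my $response = $ua->request($request);
if ( $response->is_success and $response->previous ) {
    # $response->request is the request that produced the final response.
    print $request->uri, " redirected to ", $response->request->uri, "\n";
}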
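
If you also want to see each hop rather than just the final URL, HTTP::Response has a redirects method that walks the previous chain. Below is a minimal sketch; describe_chain is a made-up helper name, and the responses are built by hand so it runs without network access. With a real request you would pass in the $response returned by $ua->request. Note that the is_success check above only passes when the final response is a 2xx, so a redirect that ends on an error page would not be reported by it.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper: one "code -> Location" line per redirect that
# led to the given (final) response, oldest hop first.
sub describe_chain {
    my ($response) = @_;
    return map { $_->code . ' -> ' . ( $_->header('Location') // '?' ) }
        $response->redirects;
}

# Hand-built stand-ins for what LWP would return after following a 301.
my $hop   = HTTP::Response->new( 301, 'Moved Permanently',
    [ Location => 'http://www.google.com/' ] );
my $final = HTTP::Response->new( 200, 'OK' );
$final->previous($hop);

print "$_\n" for describe_chain($final);
```
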
Anirvan
s/GET/HEAD/. With google.com it doesn't seem any faster, but try it with microsoft.com...
Federico Ramponi
+3  A: 

A quick & dirty Groovy script to show the concepts -- note that this uses java.net.HttpURLConnection.

In order to detect the redirect, you have to turn off automatic redirect following, here with setInstanceFollowRedirects(false). Otherwise, you end up on the redirected page anyway with a responseCode of 200. The downside is that you then have to navigate the redirect yourself.

URL url = new URL('http://google.com')
HttpURLConnection conn = (HttpURLConnection) url.openConnection()
// Disable redirect following for this connection only
// (the static setFollowRedirects changes the default for all connections).
conn.instanceFollowRedirects = false
conn.requestMethod = 'HEAD'
int code = conn.responseCode
println code
// A redirect is a 3xx status that carries a Location header.
if (code in 300..399 && conn.headerFields.'Location') {
  println conn.headerFields.'Location'
}

301
["http://www.google.com/"]
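
For the "navigate the redirect yourself" part, the loop looks roughly like the sketch below, written in Perl. The names resolve_final_url and $fetch are hypothetical: $fetch stands for whatever makes one request without following redirects and returns the status code and Location header. Canned responses using the question's example URLs stand in for real requests here. Location may be relative, so it is resolved against the current URL with the URI module.

```perl
use strict;
use warnings;
use URI;

# Hypothetical helper: follow redirects by hand and return the final URL.
# $fetch is a code ref taking a URL and returning ($status, $location).
sub resolve_final_url {
    my ( $url, $fetch, $max_hops ) = @_;
    $max_hops = 10 unless defined $max_hops;

    for ( 1 .. $max_hops ) {
        my ( $status, $location ) = $fetch->($url);
        return $url
            unless defined $status
            && $status =~ /^30[12378]$/
            && defined $location;

        # Resolve a possibly relative Location against the current URL.
        $url = URI->new_abs( $location, $url )->as_string;
    }
    die "gave up after $max_hops redirects\n";
}

# Canned responses instead of real HTTP requests (an assumption for the demo).
my %canned = (
    'http://www.foo.com/alink.pl?page=2' => [ 302, '/other_link.pl' ],
    'http://www.foo.com/other_link.pl'   => [ 200, undef ],
);
my $fetch = sub { @{ $canned{ $_[0] } } };

print resolve_final_url( 'http://www.foo.com/alink.pl?page=2', $fetch ), "\n";
```

The hop limit is there so a redirect loop cannot run forever; 10 is an arbitrary choice.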
Ken Gentle
A: 

I noticed Anirvan's response works very well, except with 301 redirects. Is there any way to detect these in Perl? For example, Anirvan's script will not detect www.clearapp.com redirecting to oracle.com.

Bobby