If I have a URL (e.g. http://www.foo.com/alink.pl?page=2), I want to determine whether I am being redirected to another URL. I'd also like to know the final URL (e.g. http://www.foo.com/other_link.pl). Finally, I want to be able to do this in Perl and Groovy.
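Well, I know nothing about either Perl or Groovy, so I'll give you an answer from an HTTP point of view, and you'll have to adapt it.
Normally, you make an HTTP request, and you get back a response code along with some headers and (usually) HTML text. The response code for success is 200. Response codes in the 300 range are mostly redirects (301, 302, 303, 307 and 308); the notable exception is 304 Not Modified, which is a caching response rather than a redirect. A redirect response carries the target URL in its Location header.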
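To make the status-code rule concrete, here is a minimal sketch in Java (chosen because it maps closely onto Groovy). The class name `RedirectStatus` and its methods are hypothetical names used only for illustration. It treats the common redirect codes as redirects and deliberately leaves out 304.

```java
final class RedirectStatus {
    // Status codes that normally come with a Location header pointing at a new URL.
    // 304 Not Modified is in the 3xx range but is not a redirect, so it is excluded.
    static boolean isRedirect(int status) {
        switch (status) {
            case 301: // Moved Permanently
            case 302: // Found (temporary)
            case 303: // See Other
            case 307: // Temporary Redirect
            case 308: // Permanent Redirect
                return true;
            default:
                return false;
        }
    }

    // Permanent redirects are the ones clients may remember for later requests.
    static boolean isPermanent(int status) {
        return status == 301 || status == 308;
    }

    public static void main(String[] args) {
        for (int status : new int[] {200, 301, 302, 304, 404}) {
            String kind = isRedirect(status) ? "redirect" : "not a redirect";
            System.out.println(status + " -> " + kind);
        }
    }
}
```

A check like this only tells you that a redirect happened; the new address still has to be read from the Location header of the same response.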
In Perl you can use LWP::UserAgent for that. I guess the easiest way is to add a response_redirect handler using add_handler.
Referring to James's answer - sample HTTP session:
$ telnet www.google.com 80
HEAD / HTTP/1.1
HOST: www.google.com

HTTP/1.1 302 Found
Location: http://www.google.it/
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Set-Cookie: ##############################
Date: Thu, 30 Oct 2008 20:03:36 GMT
Server: ####
Content-Length: 218
Using HEAD instead of GET, you get only the headers, not the body. "302" means a temporary redirect, and "Location:" gives the URL you are being redirected to.
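One detail worth handling when reading the Location header: servers sometimes send a relative value (for example `/other_link.pl`) rather than a full URL, so it is safer to resolve it against the URL that was requested. A small Java sketch of that step follows; `LocationResolver` and `resolveLocation` are hypothetical names, and the URLs are the examples from the question and the session above.

```java
import java.net.URI;

final class LocationResolver {
    // Resolves a Location header value against the URL that was requested.
    // An absolute Location is returned unchanged; a relative one is
    // interpreted relative to the request URL.
    static URI resolveLocation(String requestUrl, String location) {
        return URI.create(requestUrl).resolve(location);
    }

    public static void main(String[] args) {
        String requested = "http://www.foo.com/alink.pl?page=2";
        System.out.println(resolveLocation(requested, "/other_link.pl"));
        System.out.println(resolveLocation(requested, "other_link.pl"));
        System.out.println(resolveLocation("http://www.google.com/", "http://www.google.it/"));
    }
}
```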
In Perl:
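use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;
my $ua = LWP::UserAgent->new;
my $request = HTTP::Request->new( GET => 'http://google.com/' );
# LWP follows redirects for GET automatically; the final response
# links back to the intermediate responses through ->previous.
my $response = $ua->request($request);
if ( $response->is_success and $response->previous ) {
    print $request->uri, " redirected to ", $response->request->uri, "\n";
}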
A quick and dirty Groovy script to show the concepts. Note that it uses java.net.HttpURLConnection.
To detect the redirect, you have to turn off automatic redirect following, either with setFollowRedirects(false), which applies to all connections, or with setInstanceFollowRedirects(false), which applies to a single connection. Otherwise, you end up on the redirected page anyway with a responseCode of 200. The downside is that you then have to navigate the redirect yourself.
URL url = new URL('http://google.com')
HttpURLConnection conn = url.openConnection()
// Disable automatic redirect handling for this connection only
conn.instanceFollowRedirects = false
conn.requestMethod = 'HEAD'
int status = conn.responseCode
println status
// A redirect is a 3xx status that also carries a Location header
if (status in 300..399 && conn.headerFields.'Location') {
    println conn.headerFields.'Location'
}
301
["http://www.google.com/"]
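The "navigate the redirect yourself" part can be sketched as a loop that keeps requesting the Location target until a non-redirect response arrives. The Java sketch below is one possible way to do it, not a definitive implementation: `RedirectTracer`, `Hop`, `trace` and `headRequest` are hypothetical names, the request limit is an arbitrary safeguard against redirect loops, and the fetch step is passed in as a function so the loop can be exercised without network access.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class RedirectTracer {
    // Result of one request: the status code and the Location header (null if absent).
    static final class Hop {
        final int status;
        final String location;

        Hop(int status, String location) {
            this.status = status;
            this.location = location;
        }
    }

    // Returns every URL visited: the starting URL first, the final URL last.
    // Gives up after maxRequests requests so a redirect loop cannot run forever.
    static List<URI> trace(URI start, Function<URI, Hop> fetch, int maxRequests) {
        List<URI> chain = new ArrayList<>();
        URI current = start;
        chain.add(current);
        for (int i = 0; i < maxRequests; i++) {
            Hop hop = fetch.apply(current);
            boolean redirect = hop.status >= 300 && hop.status < 400 && hop.location != null;
            if (!redirect) {
                return chain;
            }
            // Location may be relative, so resolve it against the current URL.
            current = current.resolve(hop.location);
            chain.add(current);
        }
        throw new IllegalStateException("Still redirecting after " + maxRequests + " requests");
    }

    // Real fetch: one HEAD request with automatic redirect following switched off.
    static Hop headRequest(URI uri) {
        try {
            HttpURLConnection conn = (HttpURLConnection) uri.toURL().openConnection();
            conn.setInstanceFollowRedirects(false);
            conn.setRequestMethod("HEAD");
            try {
                return new Hop(conn.getResponseCode(), conn.getHeaderField("Location"));
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        URI start = URI.create(args.length > 0 ? args[0] : "http://www.google.com/");
        List<URI> chain = trace(start, RedirectTracer::headRequest, 10);
        System.out.println("redirected: " + (chain.size() > 1));
        System.out.println("final URL:  " + chain.get(chain.size() - 1));
    }
}
```

If the returned list has more than one entry, the original URL was redirected, and the last entry is the final URL. Because the same loop handles every 3xx status that carries a Location header, 301 and 302 are treated alike.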
I noticed Anivran's response works very well, except with 301 redirects. Is there any way to detect these in Perl? For example, Anivran's script will not detect www.clearapp.com redirecting to oracle.com.