I am using Google App Engine to fetch feed URLs, but a few of the URLs return a 301 redirect. I want to get the final URL, the one that actually returns the result.

I am using the Universal Feed Parser to parse the URL. Is there any way, or any function, that can give me the final URL?

+2  A: 

It is not possible to get the 'final' URL by parsing alone. To resolve it, you need to perform at least an HTTP HEAD request.

Ofir
+1 for mentioning use of `HEAD`
Matt Joiner
A: 

You can do this by handling redirects manually. When calling fetch, pass in follow_redirects=False. If your response object's HTTP status is a redirect code, either 301 or 302, grab the Location response header and fetch again until the HTTP status is something else. Add a sanity check (perhaps 5 redirects max) to avoid redirect loops.
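The loop described above could be sketched roughly as follows. This is a minimal illustration, not a tested App Engine recipe: `resolve_final_url` and its `fetch` parameter are hypothetical names, and `fetch` is assumed to be any callable that takes a URL and returns an object with `status_code` and `headers`, for example `lambda u: urlfetch.fetch(u, follow_redirects=False)`. Treating 303, 307 and 308 as redirects, and resolving relative Location values with `urljoin`, are additions beyond what the answer states.

```python
try:
    from urllib.parse import urljoin  # Python 3
except ImportError:
    from urlparse import urljoin  # Python 2

# 301 and 302 come from the answer; the other codes are an assumption.
REDIRECT_CODES = (301, 302, 303, 307, 308)


def resolve_final_url(url, fetch, max_redirects=5):
    """Follow redirects by hand and return the last URL reached.

    `fetch` is a hypothetical callable: it takes a URL and returns an
    object exposing `status_code` and `headers`, without following
    redirects itself.
    """
    for _ in range(max_redirects + 1):
        resp = fetch(url)
        if resp.status_code not in REDIRECT_CODES:
            return url
        location = resp.headers.get('Location')
        if not location:
            # Redirect status without a target: nothing more to follow.
            return url
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    # Sanity check against redirect loops.
    raise ValueError('more than %d redirects' % max_redirects)
```

On App Engine, the call would presumably look like `resolve_final_url(feed_url, lambda u: urlfetch.fetch(u, follow_redirects=False))`, where `feed_url` is whatever URL you started with.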

Drew Sears
+1  A: 

If you're using the urlfetch API, you can just access the final_url attribute of the response object you get from urlfetch.fetch(), assuming you set follow_redirects to True:

>>> from google.appengine.api import urlfetch
>>> url_that_redirects = 'http://www.example.com/redirect/'
>>> resp = urlfetch.fetch(url=url_that_redirects, follow_redirects=False)
>>> resp.status_code
302 # or 301 or whatever
>>> resp = urlfetch.fetch(url=url_that_redirects, follow_redirects=True)
>>> resp.status_code
200
>>> resp.final_url
'http://www.example.com/final_url/'

Note that the follow_redirects keyword argument defaults to True, so you don't have to set it explicitly.

Will McCutchen