I am currently trying to log into a site using Python; however, the site seems to be sending a cookie and a redirect in the same response. Python seems to be following that redirect, thus preventing me from reading the cookie sent by the login page. How do I prevent Python's urllib (or urllib2) urlopen from following the redirect?
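urllib2.urlopen calls build_opener(), which uses this list of handler classes:

    handlers = [ProxyHandler, UnknownHandler, HTTPHandler,
                HTTPDefaultErrorHandler, HTTPRedirectHandler,
                FTPHandler, FileHandler, HTTPErrorProcessor]

You could try building the opener yourself without HTTPRedirectHandler, then call the open() method on the result to open your URL. Note that urllib2.build_opener(*handlers) takes the handlers as separate arguments and re-adds every default handler unless you pass an instance or subclass of it, so simply omitting HTTPRedirectHandler from the arguments is not enough: either pass a subclass that overrides the redirect behaviour, or assemble a urllib2.OpenerDirector by hand with add_handler(). If you really dislike redirects, you could even call urllib2.install_opener(opener) to install your own non-redirecting opener globally.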
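A minimal sketch of a non-redirecting opener, written for Python 3, where urllib2 became urllib.request (under Python 2 the same class and method names live in urllib2). The names NoRedirectHandler and fetch_without_redirect are made up for this illustration. It relies on redirect_request() returning None, which makes the 3xx response surface as an HTTPError instead of being followed:

```python
import urllib.error
import urllib.request


class NoRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Redirect handler that refuses to follow any redirect."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None tells urllib not to build a follow-up request,
        # so the 3xx response falls through to the default error handler.
        return None


def fetch_without_redirect(url):
    """Return (status, headers) of the first response, redirect or not."""
    # Passing a subclass makes build_opener() leave out the stock
    # HTTPRedirectHandler.
    opener = urllib.request.build_opener(NoRedirectHandler)
    try:
        response = opener.open(url)
    except urllib.error.HTTPError as err:
        # An unfollowed 3xx is raised as HTTPError; it still carries
        # the status code and the headers (including any Set-Cookie).
        return err.code, err.headers
    return response.status, response.headers
```

With this, something like status, headers = fetch_without_redirect(login_url) should give you the 302 and its Set-Cookie header directly (read it with headers.get_all("Set-Cookie")), where login_url stands for whatever page you are logging into.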
It sounds like your real problem is that urllib2 isn't doing cookies the way you'd like. See also: How to use Python to login to a webpage and retrieve cookies for later usage?
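If the goal is only to read the cookie, a cookie jar may be enough without disabling redirects at all. The sketch below is for Python 3 (urllib.request and http.cookiejar; the Python 2 equivalents are urllib2 and cookielib). The function names, login_url and form_data are placeholders invented for this example. The cookie processor sees each response before the redirect is acted on, so a cookie set on the redirecting login response should end up in the jar, provided the jar's cookie policy accepts it:

```python
import http.cookiejar
import urllib.request


def make_cookie_opener():
    """Return (opener, jar); the jar collects cookies from every response."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def cookies_after_login(login_url, form_data=None):
    """Open login_url (POST if form_data bytes are given) and return cookies seen."""
    opener, jar = make_cookie_opener()
    # Redirects are followed as usual; we only care about the jar afterwards.
    opener.open(login_url, data=form_data).close()
    # Map cookie names to values for every cookie the jar accepted.
    return {cookie.name: cookie.value for cookie in jar}
```

Keep the opener around and reuse it for later requests if the site expects the session cookie to be sent back.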
This question was asked before here.
EDIT: If you have to deal with quirky web applications you should probably try out mechanize. It's a great library that simulates a web browser. You can control redirecting, cookies, page refreshes... If the website doesn't rely [heavily] on JavaScript, you'll get along very nicely with mechanize.
You could do a couple of things:
- Build your own HTTPRedirectHandler that intercepts each redirect
- Create an instance of HTTPCookieProcessor and install that opener so that you have access to the cookiejar.
This is a quick little thing that shows both:

    import urllib2

    class MyHTTPRedirectHandler(urllib2.HTTPRedirectHandler):
        # Called for each 302 response before the redirect is followed.
        def http_error_302(self, req, fp, code, msg, headers):
            print "Cookie Manip Right Here"
            # Delegate to the base class, which follows the redirect.
            return urllib2.HTTPRedirectHandler.http_error_302(self, req, fp, code, msg, headers)

        # Handle the other redirect codes the same way.
        http_error_301 = http_error_303 = http_error_307 = http_error_302

    # HTTPCookieProcessor keeps the cookies it sees in its cookiejar attribute.
    cookieprocessor = urllib2.HTTPCookieProcessor()

    opener = urllib2.build_opener(MyHTTPRedirectHandler, cookieprocessor)
    urllib2.install_opener(opener)

    response = urllib2.urlopen("WHEREEVER")
    print response.read()
    print cookieprocessor.cookiejar