I am using urllib2 and HTTPCookieProcessor to log in to a website. I want to log in to multiple accounts concurrently and store the cookies so they can be reused later.
Can you recommend an approach or library to achieve this?
How to achieve this really depends on your needs: what kind of login is it? Digest authentication? A web form? Is JavaScript involved (you're pretty much out of luck if it is, since none of these libraries execute JavaScript)? A library like mechanize can help you a lot with such stuff: it handles forms, redirection, authentication and cookies. However, you'd have to take care of concurrency yourself by spawning threads or processes.
Another approach that works beautifully for concurrency is using Twisted. With that solution, however, you'd have to handle redirection, cookies etc. yourself, although you might be able to reuse parts of e.g. mechanize.
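To make the thread-based variant a bit more concrete, here is a minimal sketch using only the standard library. It gives every account its own cookie jar and opener, so concurrent sessions never share cookies. The names `make_session` and `login_all` are made up for illustration, and `login` is a hypothetical callable you would write yourself to perform the site-specific login request. The imports fall back to the Python 2 module names (`cookielib`, `urllib2`) used in the question.

```python
import threading

try:  # Python 3 module names
    from http.cookiejar import CookieJar
    from urllib.request import build_opener, HTTPCookieProcessor
except ImportError:  # Python 2 module names, as in the question
    from cookielib import CookieJar
    from urllib2 import build_opener, HTTPCookieProcessor


def make_session(account):
    # One private jar and opener per account, so sessions never mix cookies.
    jar = CookieJar()
    opener = build_opener(HTTPCookieProcessor(jar))
    return {'account': account, 'jar': jar, 'opener': opener}


def login_all(accounts, login):
    # `login` is a hypothetical callable(session) that sends the
    # site-specific login request through session['opener'].
    sessions = [make_session(a) for a in accounts]
    threads = [threading.Thread(target=login, args=(s,)) for s in sessions]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait until every login attempt has finished
    return sessions
```

After `login_all` returns, each session's `jar` holds whatever cookies its own login produced, and its `opener` can be used for further requests on that account.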
The OP clarified that this is not a concurrency issue. With sequential processing in mind, this is much simpler. I once used something like the following to update a bunch of SIP phone base stations (they had a web front-end which you could use to upload VCard files for the phone book). Note that I just cut away some crap and renamed this and that in this hacky script; I did not test it at all. Its sole purpose is to give the OP an idea of how to deal with this.
#!/usr/bin/python
# -*- coding:utf-8 -*-
# Python 2 script (print statements); needs the mechanize package.
from optparse import OptionParser
import sys

from mechanize import Browser, CookieJar

accounts = [
    {'ipaddr': '127.0.0.1', 'user': 'joe', 'pass': 'foobar'},
]


class WebsiteAccount(object):

    def __init__(self, ipaddr, username, password, browser):
        self.ipaddr = ipaddr
        self.username = username
        self.password = password
        self.browser = browser
        # One cookie jar per account keeps the sessions separate.
        self.cookiejar = CookieJar()
        self.browser.set_cookiejar(self.cookiejar)

    def login(self):
        self.browser.open('http://' + self.ipaddr + '/login.html')
        self.browser.select_form(name='loginform')
        self.browser.form.set_value(self.username, name='username')
        self.browser.form.set_value(self.password, name='password')
        resp = self.browser.submit()
        print 'Logging into account %s@%s ...' % (self.username, self.ipaddr),
        # Landing on the login page again means the login was rejected.
        if resp.geturl().endswith('/login.html'):
            print 'FAILED!'
            sys.exit(1)
        print ' OK'

    def logout(self):
        print 'Logging out from account %s@%s ...' % (self.username, self.ipaddr),
        self.browser.open('http://' + self.ipaddr + '/logout.html')
        self.browser.close()
        print 'OK'


def main():
    parser = OptionParser()
    parser.add_option('-d', '--debug', action='store_true', dest='debug', default=False)
    parser.add_option('-v', '--verbose', action='store_true', dest='verbose', default=False)
    (opts, args) = parser.parse_args()

    for account in accounts:
        browser = Browser()
        browser.set_handle_referer(True)
        browser.set_handle_redirect(True)
        browser.set_handle_robots(False)

        bs = WebsiteAccount(account['ipaddr'],
                            account['user'],
                            account['pass'],
                            browser)

        # DEBUG
        if opts.debug:
            browser.set_debug_redirects(True)
            browser.set_debug_responses(True)
            browser.set_debug_http(True)

        bs.login()
        try:
            # ... do some stuff
            # save cookies here?
            pass
        finally:
            # you shouldn't use this if you are interested in the login cookies
            bs.logout()


if __name__ == '__main__':
    main()
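
For the "save cookies here?" spot, one option is a file-backed cookie jar. The following is a rough sketch using the standard library's `LWPCookieJar` rather than mechanize itself (mechanize seems to offer a jar class of the same name, but check that against your version). The helper names and the one-file-per-account naming scheme are assumptions made up for this example. As above, the imports fall back to the Python 2 module names.

```python
import os

try:  # Python 3 module names
    from http.cookiejar import LWPCookieJar
    from urllib.request import build_opener, HTTPCookieProcessor
except ImportError:  # Python 2 module names, as in the question
    from cookielib import LWPCookieJar
    from urllib2 import build_opener, HTTPCookieProcessor


def cookie_path(user, ipaddr, directory='.'):
    # Hypothetical naming scheme: one cookie file per account.
    return os.path.join(directory, 'cookies-%s@%s.txt' % (user, ipaddr))


def save_cookies(jar, path):
    # ignore_discard=True also writes session cookies, which is what
    # login cookies usually are; without it they would be skipped.
    jar.save(path, ignore_discard=True)


def load_opener(path):
    # Build an opener whose jar is pre-filled from an earlier run,
    # or empty if no cookie file exists yet.
    jar = LWPCookieJar()
    if os.path.exists(path):
        jar.load(path, ignore_discard=True)
    return build_opener(HTTPCookieProcessor(jar)), jar
```

On a later run, `load_opener(cookie_path('joe', '127.0.0.1'))` would give you an opener that sends the stored cookies again, provided you skipped the logout and the server has not expired the session in the meantime.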