As far as I've been able to tell, cookielib isn't thread-safe; then again, the post stating so is five years old, so it might be wrong.

Nevertheless, I've been wondering: if I create an instance of a class like this

import cookielib
import urllib2

class Acc:
    def __init__(self, login, password):
        self.user = login
        self.password = password
        self.headers = {}
        # Built per instance: as class attributes, the jar and opener
        # would be shared by every Acc, and so by every thread.
        self.jar = cookielib.CookieJar()
        self.cookie = urllib2.HTTPCookieProcessor(self.jar)
        self.opener = urllib2.build_opener(self.cookie)

    def login(self):
        return False  # Some magic, irrelevant

    def fetch(self, url):
        req = urllib2.Request(url, None, self.headers)
        res = self.opener.open(req)
        return res.read()

for each worker thread, would it work? (Or is there a better approach?) Each thread would use its own account, so the fact that workers wouldn't share their cookies is not a problem.
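
To make the "one instance per worker thread" idea concrete, here is a minimal sketch. It assumes Python 3, where cookielib and urllib2 were renamed to http.cookiejar and urllib.request; the names Account, worker and accounts are made up for illustration. The point it shows is that the jar and opener are created in __init__ (attributes written in the class body are shared by all instances), and that each thread builds its own object. Nothing is fetched here; the threads only construct their accounts.

```python
import threading
from http.cookiejar import CookieJar  # Python 3 name for cookielib
from urllib.request import HTTPCookieProcessor, build_opener  # from urllib2


class Account:
    """Hypothetical per-thread account; each instance owns its cookie jar."""

    def __init__(self, login, password):
        self.user = login
        self.password = password
        # Created per instance, so no two accounts share cookies.
        self.jar = CookieJar()
        self.opener = build_opener(HTTPCookieProcessor(self.jar))

    def fetch(self, url):
        with self.opener.open(url) as res:
            return res.read()


def worker(login, password, out):
    # The account is built inside the thread that uses it; it is stored
    # in `out` only so it can be inspected after the threads finish.
    out[login] = Account(login, password)


accounts = {}
threads = [
    threading.Thread(target=worker, args=("user%d" % i, "secret", accounts))
    for i in range(3)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# One distinct jar per account (the objects are still alive, so ids differ).
jar_ids = {id(acc.jar) for acc in accounts.values()}
```

As long as no Account is handed to a second thread, the question of whether the jar itself is thread-safe never comes up.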

+1  A: 

You want to use pycurl (the Python interface to libcurl). It's thread-safe and supports cookies, HTTPS, etc. The interface is a bit strange, but it just takes a bit of getting used to.

I've only used pycurl with HTTP basic auth + SSL, but I did find an example using pycurl and cookies here. I believe you'll need to change the pycurl.COOKIEFILE (line 74) and pycurl.COOKIEJAR (line 82) options to point at a file name that is unique per worker (maybe keying off of id(self.crl)).

As I remember, to stay thread-safe you must not share a pycurl.Curl() object between threads; creating a new one for each request is the simple way to guarantee that.
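
Roughly, the unique-cookie-file idea could look like the sketch below. This is not the linked example: cookie_path_for, fetch and the "cookies-<name>.txt" naming scheme are made up for illustration, and only pycurl.Curl, setopt, perform, close and the URL, COOKIEFILE, COOKIEJAR and WRITEFUNCTION options are actual pycurl names. pycurl is imported inside fetch so the path helper can be tried without it installed.

```python
import os
import tempfile
from io import BytesIO


def cookie_path_for(worker_name, directory=None):
    """Return a cookie-file path unique to one worker (hypothetical scheme)."""
    directory = directory or tempfile.gettempdir()
    return os.path.join(directory, "cookies-%s.txt" % worker_name)


def fetch(url, cookie_path):
    """Fetch url with a fresh Curl handle that reads and writes cookie_path."""
    import pycurl  # lazy import: only needed when a request is made

    body = BytesIO()
    curl = pycurl.Curl()  # new handle per request, never shared
    try:
        curl.setopt(pycurl.URL, url)
        curl.setopt(pycurl.COOKIEFILE, cookie_path)  # cookies are loaded from here
        curl.setopt(pycurl.COOKIEJAR, cookie_path)   # and saved here on close()
        curl.setopt(pycurl.WRITEFUNCTION, body.write)
        curl.perform()
    finally:
        curl.close()
    return body.getvalue()
```

Each worker thread would then call something like fetch(url, cookie_path_for("worker-1")) with its own name, so no two threads touch the same cookie file.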

sdolan