Because the Twisted getPage function doesn't give me access to headers, I had to write my own getPageWithHeaders function.
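    import traceback
    from twisted.web.client import HTTPClientFactory, _makeGetterFactory

    def getPageWithHeaders(url, contextFactory=None, *args, **kwargs):
        try:
            # Return the factory itself (not factory.deferred) so the
            # response headers stay reachable.
            return _makeGetterFactory(url, HTTPClientFactory,
                                      contextFactory=contextFactory,
                                      *args, **kwargs)
        except:
            traceback.print_exc()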
This is exactly the same as the normal getPage function, except that I added the try/except block and return the factory object instead of returning factory.deferred.
For some reason, I sometimes get a maximum recursion depth exceeded error here. It happens consistently a few times out of 700, usually on different sites each time. Can anyone shed any light on this? I'm not clear why or how this could be happening, and the Twisted codebase is large enough that I don't even know where to look.
EDIT: Here's the traceback I get, which seems bizarrely incomplete:
    Traceback (most recent call last):
      File "C:\keep-alive\utility\background.py", line 70, in getPageWithHeaders
        factory = _makeGetterFactory(url, HTTPClientFactory, timeout=60 , contextFactory=context, *args, **kwargs)
      File "c:\Python26\lib\site-packages\twisted\web\client.py", line 449, in _makeGetterFactory
        factory = factoryFactory(url, *args, **kwargs)
      File "c:\Python26\lib\site-packages\twisted\web\client.py", line 248, in __init__
        self.headers = InsensitiveDict(headers)
    RuntimeError: maximum recursion depth exceeded
This is the entire traceback, which clearly isn't long enough to have exceeded our max recursion depth. Is there something else I need to do in order to get the full stack? I've never had this problem before; typically when I do something like
    import traceback

    def f(): return f()
    try: f()
    except: traceback.print_exc()

then I get the kind of "maximum recursion depth exceeded" stack that you'd expect, with a ton of references to f().
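One possible reason the traceback looks short, offered as a sketch and not as a confirmed diagnosis of the Twisted case: traceback.print_exc() only reports the frames between the try block that caught the exception and the point where it was raised. Any frames above the handler are left out, so if the call stack were already deep by the time getPageWithHeaders ran, the printed traceback would still be only a few lines long. The names fail, guarded, and nest below are hypothetical stand-ins, and an ordinary RuntimeError is raised in place of a real recursion overflow so the example stays deterministic.

```python
import traceback


def fail():
    # Hypothetical stand-in for the call that raises deep inside the library.
    raise RuntimeError("simulated failure")


def guarded():
    # Hypothetical stand-in for getPageWithHeaders: the exception is caught
    # close to where it is raised.
    try:
        fail()
    except RuntimeError:
        # Covers only the frames from guarded() down to fail().
        caught = traceback.format_exc()
        # Covers every frame from the program entry point down to guarded().
        outer_names = [name for (_file, _lineno, name, _src)
                       in traceback.extract_stack()]
        return caught, outer_names


def nest(depth):
    # Builds up a deep call stack *above* the try/except handler.
    if depth == 0:
        return guarded()
    return nest(depth - 1)


caught, outer_names = nest(50)
print(caught)                     # short: mentions guarded and fail only
print(outer_names.count("nest"))  # number of nest frames above the handler
```

Under that assumption, calling traceback.print_stack() (or logging traceback.extract_stack()) inside the except block would show what was on the stack above getPageWithHeaders when the error occurred.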