I am using libcurl to download a webpage, then scanning it for data and doing something with one of the links. However, once in a while the page is different than I expect, so I extract bad data and pycurl throws an exception. I tried to find the exception name for pycurl but had no luck.

Is there a way I can get the traceback to execute a function, so I can dump the file, look at the input, and see where my code went wrong?

+1  A: 

You can catch all exceptions somewhere in the main block, use sys.exc_info() to get the traceback information, and log that to your file. exc_info() returns not just the exception type but also the traceback, so there should be information about what went wrong.
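
A minimal sketch of this approach. It assumes a hypothetical `process_page` function standing in for the scanning code and an illustrative log path; neither comes from the question. `sys.exc_info()` returns a `(type, value, traceback)` tuple, and `traceback.format_exception` turns that tuple into the text normally printed on a crash.

```python
import sys
import traceback


def process_page(html):
    # Hypothetical stand-in for the scanning code; fails on unexpected input.
    start = html.index("href=")  # raises ValueError if no link is present
    return html[start:]


def run(html, log_path="scrape_errors.log"):
    """Process the page; on failure, append the formatted traceback to log_path."""
    try:
        return process_page(html)
    except Exception:
        exc_type, exc_value, exc_tb = sys.exc_info()
        report = "".join(traceback.format_exception(exc_type, exc_value, exc_tb))
        with open(log_path, "a") as log:
            log.write(report)
        return None
```

Catching `Exception` rather than using a bare `except:` leaves `KeyboardInterrupt` and `SystemExit` alone, which is usually what you want in a main block.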

Jiri
+2  A: 

sys.excepthook may help you here: it lets you set a global handler for uncaught exceptions. I am not sure how pycurl exceptions are handled, since it is a binding library, but reassigning the hook to a generic function will probably work. Something like:

>>> import sys
>>>
>>> def my_global_exception_handler(exc_type, value, tb):
...     # Called with the exception type, value and traceback object.
...     print(tb)
...     sys.exit()
...
>>> sys.excepthook = my_global_exception_handler
>>> raise
<traceback object at 0xb7cfcaa4>

This exception hook function could easily be an instance method that has access to the file that needs dumping.
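
As a rough illustration of the instance-method idea, the sketch below keeps the most recent page on an object and writes it to disk from the hook. The class name `PageDumper`, the helper `install`, and the default dump path are all hypothetical names chosen for this example.

```python
import sys


class PageDumper:
    """Keeps the most recent page and dumps it when an exception goes uncaught."""

    def __init__(self, dump_path="bad_page.html"):  # illustrative path
        self.dump_path = dump_path
        self.page = ""

    def excepthook(self, exc_type, value, tb):
        # Save the input that was being processed when the error occurred.
        with open(self.dump_path, "w") as f:
            f.write(self.page)
        # Then print the usual traceback via the interpreter's default hook.
        sys.__excepthook__(exc_type, value, tb)


def install(dump_path="bad_page.html"):
    """Create a PageDumper and register its bound method as the global hook."""
    dumper = PageDumper(dump_path)
    sys.excepthook = dumper.excepthook
    return dumper
```

You would call `dumper = install()` once at startup and assign `dumper.page = html` after each download. Keep in mind that the hook only runs for exceptions nothing else catches, just before the interpreter exits.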

Ali A
A: 

You can use a generic exception handler.

import logging
import pycurl

logging.basicConfig(filename="someFile.log", level=logging.DEBUG)
logger = logging.getLogger(__name__)
try:
    curl = pycurl.Curl()
    curl.setopt(pycurl.URL, url)
    # etc.
    curl.perform()
    curl.close()
    logger.info("Read %s", url)
except Exception as e:
    # logger.exception records the message together with the full traceback.
    logger.exception(e)
    print(e, repr(e), e.args)
    raise
logging.shutdown()

This will write a nice log that has the exception information you're looking for.
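
To also keep the input that caused the failure, the same handler can save the page body next to the log entry. This is only a sketch: it assumes the page has already been downloaded into a string, `extract_link` is a hypothetical placeholder for the scanning code, and the dump file name is illustrative.

```python
import logging

logger = logging.getLogger(__name__)


def extract_link(html):
    # Hypothetical parser: expects at least one double-quoted href.
    return html.split('href="')[1].split('"')[0]


def scan(html, dump_path="failed_page.html"):
    """Return the extracted link, or log the error and dump the page."""
    try:
        return extract_link(html)
    except Exception:
        # logger.exception logs at ERROR level and appends the traceback.
        logger.exception("Could not parse page; saving it to %s", dump_path)
        with open(dump_path, "w") as f:
            f.write(html)
        raise
```

Re-raising after the dump preserves the original behaviour for callers while leaving both the traceback and the offending page on disk.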

S.Lott