I am interested in getting the intermediate URLs in a redirect chain using PycURL. Say I have a website, Site A, which redirects to Site B, which then redirects to Site C. Normally I would only be able to see Site A (the starting URL) and Site C (the ending URL), but I am also interested in any sites that sit between the starting and ending sites (in this case Site B). How would I go about doing this?

A: 

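Have a look at the PycURL callbacks. With FOLLOWLOCATION enabled, the header callback is invoked for every response in the chain, so the Location header of each redirect shows you the intermediate URLs:

import sys
import pycurl

## Callback function invoked for each header line of every response
def header(buf):
    # PycURL passes bytes on Python 3, so decode before writing
    sys.stdout.write(buf.decode("iso-8859-1"))
    # Returning None implies that all bytes were written

c = pycurl.Curl()
c.setopt(pycurl.URL, "http://www.siteA.com/")
# Follow redirects so the headers of each hop reach the callback
c.setopt(pycurl.FOLLOWLOCATION, True)
c.setopt(pycurl.HEADERFUNCTION, header)
c.perform()
c.close()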
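
If you want the chain as a list rather than printed headers, one way is to pick out the Location lines in the callback. The following is only a sketch under a few assumptions: `RedirectRecorder` and `redirect_chain` are hypothetical names made up for this example, the headers are decoded as ISO-8859-1, and every Location header is treated as a redirect hop (so a Location on a non-redirect response would be recorded too).

```python
from urllib.parse import urljoin


class RedirectRecorder:
    """Hypothetical helper: collects the URLs of a redirect chain from header lines."""

    def __init__(self, start_url):
        self.urls = [start_url]

    def header(self, buf):
        # PycURL hands each header line to the callback as bytes.
        line = buf.decode("iso-8859-1").strip()
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "location":
            # Location may be relative, so resolve it against the previous hop.
            self.urls.append(urljoin(self.urls[-1], value.strip()))


def redirect_chain(start_url, max_redirects=10):
    # Imported here so RedirectRecorder can be used without PycURL installed.
    import pycurl

    recorder = RedirectRecorder(start_url)
    c = pycurl.Curl()
    c.setopt(pycurl.URL, start_url)
    c.setopt(pycurl.FOLLOWLOCATION, True)
    c.setopt(pycurl.MAXREDIRS, max_redirects)
    c.setopt(pycurl.HEADERFUNCTION, recorder.header)
    # Discard response bodies; returning None tells PycURL all bytes were handled.
    c.setopt(pycurl.WRITEFUNCTION, lambda data: None)
    try:
        c.perform()
    finally:
        c.close()
    return recorder.urls
```

Calling `redirect_chain("http://www.siteA.com/")` should then return the starting URL followed by each redirect target in order, assuming the servers send standard Location headers.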
systempuntoout