I am interested in getting the intermediate URLs in a redirect chain using PycURL. Say I have a website, Site A, which redirects to Site B, which in turn redirects to Site C. Normally I can only see Site A (the starting URL) and Site C (the final URL), but I am also interested in any sites that sit between the two (in this case, Site B). How would I go about doing this?
A:
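Have a look at the PycURL callbacks. With redirects being followed, the header callback is invoked for the response headers of every hop in the chain, so each intermediate Location header passes through it: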
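import sys
import pycurl

## Callback function invoked once per received header line
def header(buf):
    # PycURL passes bytes under Python 3, so decode before writing
    sys.stdout.write(buf.decode("iso-8859-1"))
    # Returning None implies that all bytes were written

c = pycurl.Curl()
c.setopt(pycurl.URL, "http://www.siteA.com/")
# Follow redirects so that the headers of every hop reach the callback
c.setopt(pycurl.FOLLOWLOCATION, 1)
c.setopt(pycurl.HEADERFUNCTION, header)
c.perform()
c.close()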
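If you want the chain as a list rather than raw header output, the same callback can pick out the Location headers. The following is only a sketch: `RedirectRecorder` and `redirect_chain` are hypothetical names of my own, not part of PycURL, and it assumes Python 3 and that every Location header seen belongs to a redirect that is actually followed (which may not hold if the redirect limit is reached or a non-redirect response carries a Location header).

```python
from urllib.parse import urljoin


class RedirectRecorder:
    """Hypothetical helper: collects the URLs of a redirect chain from header lines."""

    def __init__(self, start_url):
        self.urls = [start_url]

    def header_callback(self, buf):
        # libcurl hands over one header line per call, as bytes
        line = buf.decode("iso-8859-1")
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "location":
            # Location may be relative, so resolve it against the previous hop
            self.urls.append(urljoin(self.urls[-1], value.strip()))


def redirect_chain(start_url):
    """Hypothetical wrapper: returns [start, intermediate..., final] URLs."""
    import pycurl  # imported here so the recorder is usable without PycURL

    recorder = RedirectRecorder(start_url)
    c = pycurl.Curl()
    c.setopt(pycurl.URL, start_url)
    c.setopt(pycurl.FOLLOWLOCATION, 1)  # follow each redirect
    c.setopt(pycurl.MAXREDIRS, 10)      # arbitrary cap to avoid redirect loops
    c.setopt(pycurl.HEADERFUNCTION, recorder.header_callback)
    c.setopt(pycurl.WRITEFUNCTION, lambda data: None)  # discard the body
    try:
        c.perform()
    finally:
        c.close()
    return recorder.urls
```

Calling `redirect_chain("http://www.siteA.com/")` should then return Site A, Site B and Site C in order, assuming the servers redirect as described in the question.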
systempuntoout
2010-08-29 23:21:00