import random
import socket
import urllib2

def download(source_url):
    # Apply a 10-second timeout to every socket created from here on.
    socket.setdefaulttimeout(10)
    # Pick one of these User-Agent strings at random for each request.
    agents = ['Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)','Mozilla/4.0 (compatible; MSIE 7.0b; Windows NT 5.1)','Microsoft Internet Explorer/4.0b1 (Windows 95)','Opera/8.00 (Windows NT 5.1; U; en)']
    ree = urllib2.Request(source_url)
    ree.add_header('User-Agent', random.choice(agents))
    resp = urllib2.urlopen(ree)
    htmlSource = resp.read()
    return htmlSource


url = "http://myIP/details/?id=4"
result_html = download(url)

It shouldn't time out, even with the 10-second timeout.

A: 

When you say "your own domain", are you hitting it from inside a NAT firewall?

Something like this?

123.1.2.3 (public myIP) <- NAT gateway -> 192.168.1.5 (private IP of server) <--> 192.168.1.10 (you)

Many firewalls won't let an internal host reach the gateway's external interface unless DNS rewrite is enabled. With DNS rewrite, the firewall intercepts your DNS lookup and replaces the public IP in the response with the server's private IP.
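
One rough way to check whether this applies is to try a plain TCP connection to both addresses from the machine that makes the request. The sketch below is written for Python 3; `can_connect` is a hypothetical helper name, and the addresses in the usage note are the placeholder ones from the diagram above, not real hosts.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts alike.
        return False
    conn.close()
    return True
```

For example, compare `can_connect("123.1.2.3", 80)` with `can_connect("192.168.1.5", 80)`. If only the private address connects, the gateway is probably not routing internal traffic back in through its public interface.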

mrjoltcola
+1  A: 

This will fail if you're running the development server, since it's single-threaded and it's busy serving the original request. Use mod_wsgi or strap on something like CherryPy if you want it to work.
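
The effect can be reproduced without Django. The following is only a sketch of the general problem, written for Python 3 with the standard library's `http.server` standing in for the development server. The names `Handler`, `fetch_outer` and `OPENER` and the `/outer` and `/inner` paths are made up for illustration. The `/outer` handler requests `/inner` from the same server, much like a view calling `download()` on its own site.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer, ThreadingHTTPServer

# Opener that ignores proxy environment variables, so 127.0.0.1 is hit directly.
OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/inner":
            body = b"inner ok"
        else:
            # Call back into the same server while this request is still open.
            port = self.server.server_address[1]
            try:
                inner = OPENER.open("http://127.0.0.1:%d/inner" % port, timeout=2)
                body = b"outer got: " + inner.read()
            except OSError:  # URLError and socket timeouts are both OSError
                body = b"outer timed out"
        try:
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        except OSError:
            pass  # the caller gave up waiting and closed its socket

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_outer(server_class):
    """Start server_class on a free local port, GET /outer, return the body."""
    server = server_class(("127.0.0.1", 0), Handler)
    thread = threading.Thread(target=server.serve_forever)
    thread.daemon = True
    thread.start()
    try:
        port = server.server_address[1]
        return OPENER.open("http://127.0.0.1:%d/outer" % port, timeout=10).read()
    finally:
        server.shutdown()
        server.server_close()
```

With the single-threaded `HTTPServer`, `fetch_outer(HTTPServer)` returns the timed-out body, because the only thread is still busy with `/outer` when `/inner` arrives. `fetch_outer(ThreadingHTTPServer)` returns the inner response, since each request gets its own thread.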

Ignacio Vazquez-Abrams
Thanks, this is correct.
TIMEX