Here is a related question, but I could not figure out how to apply its answer to mechanize/urllib2: http://stackoverflow.com/questions/1540749/how-to-force-python-httplib-library-to-use-only-a-requests
Basically, given this simple code:
#!/usr/bin/python
import urllib2
print urllib2.urlopen('http://python.org/').read(100)
Running it produces the following Wireshark capture:
0.000000 10.102.0.79 -> 8.8.8.8 DNS Standard query A python.org
0.000023 10.102.0.79 -> 8.8.8.8 DNS Standard query AAAA python.org
0.005369 8.8.8.8 -> 10.102.0.79 DNS Standard query response A 82.94.164.162
5.004494 10.102.0.79 -> 8.8.8.8 DNS Standard query A python.org
5.010540 8.8.8.8 -> 10.102.0.79 DNS Standard query response A 82.94.164.162
5.010599 10.102.0.79 -> 8.8.8.8 DNS Standard query AAAA python.org
5.015832 8.8.8.8 -> 10.102.0.79 DNS Standard query response AAAA 2001:888:2000:d::a2
That's a 5 second delay!
I don't have IPv6 enabled anywhere on my system (Gentoo compiled with USE=-ipv6), so I don't think Python has any reason to even try an IPv6 lookup.
The question referenced above suggests explicitly setting the socket address family to AF_INET, which sounds great. However, I have no idea how to force urllib2 or mechanize to use sockets that I create.
EDIT: I know that the AAAA queries are the issue because other apps had the same delay, and as soon as I recompiled with IPv6 disabled the problem went away... except in Python, which still performs the AAAA requests.
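For reference, one way the AF_INET idea might be applied without creating sockets by hand is to wrap socket.getaddrinfo before any request is made, so that every lookup asks for IPv4 only. This is a rough sketch, not a confirmed fix: the helper name getaddrinfo_ipv4_only is made up for illustration, and it assumes urllib2 and mechanize resolve hostnames through socket.getaddrinfo and that the system resolver skips AAAA queries when only AF_INET is requested.

```python
import socket

# Keep a reference to the real resolver so the wrapper can delegate to it.
_original_getaddrinfo = socket.getaddrinfo


def getaddrinfo_ipv4_only(host, port, family=0, *args, **kwargs):
    """Hypothetical wrapper: ignore the requested family and ask for AF_INET."""
    return _original_getaddrinfo(host, port, socket.AF_INET, *args, **kwargs)


# Replace the module-level function; this affects every lookup made through
# socket.getaddrinfo in this process, not just HTTP requests.
socket.getaddrinfo = getaddrinfo_ipv4_only
```

With this in place at the top of the script, the urllib2.urlopen('http://python.org/') call above would stay unchanged. The patch is process-wide, so it is only appropriate when nothing in the program needs IPv6.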