I have a program that uses urllib to periodically fetch a URL, and I see intermittent errors like:

I/O error(socket error): [Errno 111] Connection refused.

It works 90% of the time, but the other 10% it fails. If I retry the fetch immediately after it fails, it succeeds. I'm unable to figure out why this is so. I checked whether any ports are available, and they are. Any debugging ideas?

For additional info, the stack trace is:

  File "/usr/lib/python2.6/urllib.py", line 235, in retrieve
    fp = self.open(url, data)
  File "/usr/lib/python2.6/urllib.py", line 203, in open
    return getattr(self, name)(url)
  File "/usr/lib/python2.6/urllib.py", line 342, in open_http
    h.endheaders()
  File "/usr/lib/python2.6/httplib.py", line 868, in endheaders
    self._send_output()
  File "/usr/lib/python2.6/httplib.py", line 740, in _send_output
    self.send(msg)
  File "/usr/lib/python2.6/httplib.py", line 699, in send
    self.connect()
  File "/usr/lib/python2.6/httplib.py", line 683, in connect
    self.timeout)
  File "/usr/lib/python2.6/socket.py", line 512, in create_connection
    raise error, msg

Edit - A Google search wasn't very helpful. What I got out of it is that the server I'm fetching from sometimes refuses connections. How can I verify that this is indeed the case and that it's not a bug in my code?

+1  A: 

I'm not exactly sure what's causing this. You can try looking in your socket.py (mine is a different version, so line numbers from the trace don't match, and I'm afraid some other details might not match as well).

Anyway, it seems like good practice to put your URL-fetching code in a try: ... except: ... block, and handle this with a short pause and a retry. The URL you're trying to fetch may be down or overloaded, and that's something you can only handle with a retry anyway.

Ofri Raviv
+1  A: 

Getting an ECONNREFUSED errno means that your kernel was refused a connection at the other end, so if it's a bug, it's either in your kernel or in the other end. What you can do is to trap the error in a very specific way and try again in a little while, since this seems to work:

# This is Python > 2.5 code
import errno, time

for attempt in range(MAXIMUM_NUMBER_OF_ATTEMPTS):
    try:
        pass  # replace this line with your urllib call
    except EnvironmentError as exc: # replace " as " with ", " for Python<2.6
        if exc.errno == errno.ECONNREFUSED:
            time.sleep(A_COUPLE_OF_SECONDS)
        else:
            raise # re-raise otherwise
    else: # we tried, and we had no failure, so
        break
else: # we never broke out of the for loop
    raise RuntimeError("maximum number of unsuccessful attempts reached")

Replace the two all-caps constants with your favourite numbers.
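
For readers on Python 3, here is a rough sketch of the same retry idea. It is not the code from the question: it assumes urllib.request instead of the old urllib module, and the names is_refused and fetch_with_retry, the attempts/delay defaults and the opener parameter are all made up for illustration. In Python 3 the socket error usually arrives wrapped in a URLError whose reason attribute holds the underlying OSError, so the sketch looks there for the errno.

```python
import errno
import time
import urllib.error
import urllib.request


def is_refused(exc):
    """Return True if exc is, or wraps, an ECONNREFUSED error."""
    # URLError keeps the underlying OSError in .reason; a plain
    # OSError has no .reason, so fall back to the exception itself.
    reason = getattr(exc, "reason", exc)
    return getattr(reason, "errno", None) == errno.ECONNREFUSED


def fetch_with_retry(url, attempts=3, delay=2.0,
                     opener=urllib.request.urlopen):
    """Fetch url, retrying only when the connection was refused.

    attempts, delay and opener are illustrative parameters; opener
    exists only so the function can be exercised without a network.
    """
    for attempt in range(attempts):
        try:
            return opener(url).read()
        except OSError as exc:  # URLError is a subclass of OSError
            # Re-raise anything that is not "connection refused",
            # and also the last refusal once attempts are used up.
            if not is_refused(exc) or attempt == attempts - 1:
                raise
            time.sleep(delay)
```

Unlike the loop above, this version re-raises the last "connection refused" error instead of a RuntimeError, which keeps the original errno visible to the caller.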

ΤΖΩΤΖΙΟΥ
+3  A: 

Use a packet sniffer like Wireshark to look at what happens. You need to see an outgoing SYN-flagged packet, an incoming SYN+ACK-flagged packet and then an outgoing ACK-flagged packet. After that, the connection is considered open on the local side.

If you only see the first packet and the error message comes after several seconds of waiting, the other side is not answering at all (for example: unplugged cable, overloaded server, a misrouted packet that was discarded) and your local network stack aborts the connection attempt. If you see RST packets, the host is actively refusing the connection. If you see "ICMP Port unreachable" or host unreachable packets, a firewall or the target host is telling you that the port is actually closed.
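
If running a sniffer is inconvenient, a rough approximation of the cases above can be had from the errno of a bare TCP connect. This is only a sketch: probe is a hypothetical helper, the 5-second default timeout is arbitrary, and how packets map to errno values depends on the operating system, so treat the result as a hint rather than a substitute for the capture.

```python
import errno
import socket


def probe(host, port, timeout=5.0):
    """Try one TCP connect and report how it ended (hypothetical helper)."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except socket.timeout:
        # No answer within the timeout: usually nothing came back at all.
        return "timeout"
    except socket.error as exc:
        if exc.errno == errno.ECONNREFUSED:
            # Typically the peer answered the SYN with an RST.
            return "refused"
        if exc.errno in (errno.EHOSTUNREACH, errno.ENETUNREACH):
            return "unreachable"
        return "error: %s" % exc
    sock.close()
    return "open"
```

Calling probe in a loop against the same host and port that urllib uses, and logging the result with a timestamp, should show whether the refusals also happen outside your own fetching code.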

Of course you cannot expect the service to be available at all times (consider all the points of failure in between you and the data), so you should try again later.

Paul
Now, *that's* a really helpful answer in relation to what the OP asked. Nice start, Paul.
ΤΖΩΤΖΙΟΥ
A: 

I seem to be experiencing a similar issue, but running in Java using URL.openStream() with 4 threads. Error rate (connection refused) is about 5%.

My setup is Windows 7 x64, 4-core CPU, Realtek RTL 8168D

I tried debugging using Wireshark, and got some confusing results. Basically, when all goes well I see 4 connections behaving as expected. When I get the "connection refused" error, I only see three normal connections, and there isn't even a fourth SYN packet to initiate the transfer.

So my best guess would be that something in the kernel/NAT driver is buggy.

OP, did you make any progress on your problem?

Kyromancer