views:

47

answers:

1

I have in my application a failure that arose which does not seem to be reproducible. I have a TCP socket connection which failed and the application tried to reconnect it. In the second call to connect() attempting to reconnect, I got an error result with errno == EADDRNOTAVAIL which the man page for connect() says means: "The specified address is not available from the local machine."

Looking at the call to connect(), the second argument appears to be the address to which the error is referring to, but as I understand it, this argument is the TCP socket address of the remote host, so I am confused about the man page referring to the local machine. Is it that this address to the remote TCP socket host is not available from my local machine? If so, why would this be? It had to have succeeded calling connect() the first time before the connection failed and it attempted to reconnect and got this error. The arguments to connect() were the same both times.

Would this error be a transient one which, if I had tried calling connect again might have gone away if I waited long enough? If not, how should I try to recover from this failure?

+2  A: 

Check this link

http://www.toptip.ca/2010/02/linux-eaddrnotavail-address-not.html

EDIT: Yes I meant to add more but had to cut it there because of an emergency

Did you close the socket before attempting to reconnect? Closing will tell the system that the socketpair (ip/port) is now free.

Here are additional items too look at:

  • If the local port is already connected to the given remote IP and port (i.e., there's already an identical socketpair), you'll receive this error (see bug link below).
  • Binding a socket address which isn't the local one will produce this error. if the IP addresses of a machine are 127.0.0.1 and 1.2.3.4, and you're trying to bind to 1.2.3.5 you are going to get this error.
  • EADDRNOTAVAIL: The specified address is unavailable on the remote machine or the address field of the name structure is all zeroes.

Link with a bug similar to yours (answer is close to the bottom)

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4294599

It seems that your socket is basically stuck in one of the TCP internal states and that adding a delay for reconnection might solve your problem as they seem to have done in that bug report.

David
link is helpful, but just putting in the link is not, especially when so many links become stale and useless.
Greg Domjan
There you go :) good answer, David.
Santiago Lezica