I am using a Unix domain socket to get values from another process: A requests a value from B. This worked well for months, but recently A has occasionally been failing when it sends a message to B with "sendto", getting "errno 111, connection refused".

I checked the socket file that B binds to, and it exists. I also ran some tests on another machine, where everything works. Has anyone encountered this problem before? Does anyone have a clue what might be wrong in this scenario? Thanks very much.

+1  A: 

When I've seen this error with unix domain sockets, it's usually because process B isn't running, or there is a mismatch in the connection paths. (If B dies, does it automatically restart? Could the failures be happening while B has died but not yet restarted?) Another possibility: could multiple copies of A be running at the same time? You may get the ECONNREFUSED error if B's queue of not-yet-accepted connections is full.

I would suggest running both processes A and B under strace, either:

strace -o A.log A

or, if the process is already running,

strace -o B.log -p <process-id-of-B>

Also,

netstat -na

will give you the status of all unix domain sockets present in the system.

psmears
`netstat -nap` (when run as root) will also show the processes connected to those sockets.
bstpierre
A: 

Keep in mind that sockets in the file system are not deleted automatically when the last descriptor to them is closed. Trying to connect or send at that time will cause errors. The server will need to remove the socket in the file system before it can bind again.
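
As a rough illustration of the point above, here is a minimal Python sketch (Linux assumed) of a server that removes a stale socket file before binding. The names `bind_fresh` and `demo` and the path `b.sock` are made up for this example and do not come from the question.

```python
import errno
import os
import socket
import tempfile


def bind_fresh(path):
    """Bind a datagram socket at path, removing any leftover socket file first."""
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass  # nothing left behind, so there is nothing to clean up
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    sock.bind(path)
    return sock


def demo():
    """Show that a plain bind fails on a stale path while bind_fresh succeeds."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "b.sock")  # hypothetical socket path

    first = bind_fresh(path)
    first.close()  # closing the descriptor leaves the file in place

    stale = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        stale.bind(path)  # no unlink first, so the stale file is in the way
        plain_result = None
    except OSError as exc:
        plain_result = errno.errorcode.get(exc.errno)
    finally:
        stale.close()

    second = bind_fresh(path)  # unlink + bind works
    rebound = os.path.exists(path)
    second.close()
    os.unlink(path)
    os.rmdir(workdir)
    return plain_result, rebound
```

Note that unlinking unconditionally would also remove the socket file of a server that is still running, so a real server would probably want some way to make sure only one instance starts.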

jilles
+1  A: 

Consider looking under /proc/<pid-B>/fd to see whether B is running out of file descriptors. If so, you have a resource leak and need to clean up. It shouldn't be a problem with UDP programs, but stranger things have been known to happen. lsof might be another tool to use.
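
A small Python sketch of that check, assuming Linux and enough permission to read the other process's /proc entries. The helper names `open_fd_count` and `soft_fd_limit` are invented for illustration.

```python
import os


def open_fd_count(pid):
    """Count the entries under /proc/<pid>/fd, one per open descriptor."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def soft_fd_limit(pid):
    """Return the soft 'Max open files' limit from /proc/<pid>/limits, or None."""
    with open(f"/proc/{pid}/limits") as limits:
        for line in limits:
            if line.startswith("Max open files"):
                fields = line.split()
                # Layout: Max open files <soft> <hard> files
                return int(fields[3]) if fields[3].isdigit() else None
    return None
```

Calling `open_fd_count(pid_of_b)` every few seconds and comparing it with `soft_fd_limit(pid_of_b)` should show whether the count keeps climbing toward the limit, where `pid_of_b` stands for B's process ID.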

Otherwise, you have reasonable suggestions from other people - netstat in particular should help.

Jonathan Leffler
A: 

Process B is no longer on the other side of your (presumably DGRAM) socket -- perhaps it died, or closed the filehandle, etc.

sendto(2) on Linux will return ECONNREFUSED for a SOCK_DGRAM or SOCK_SEQPACKET unix domain socket if the receiving end is dead. (SOCK_STREAM unix sockets will not do this -- they will instead return ENOTCONN.)
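
To see this behaviour in isolation, here is a minimal Python sketch, assuming Linux. It plays both roles in one process: "B" binds a datagram socket and closes it, then "A" sends to the path that was left behind. The function name and the path `b.sock` are hypothetical.

```python
import errno
import os
import socket
import tempfile


def send_to_dead_receiver():
    """Return (socket file exists, errno name) for a send to a closed receiver."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "b.sock")  # hypothetical path for B's socket

    # "B": bind a datagram socket, then close it without unlinking the path.
    receiver = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    receiver.bind(path)
    receiver.close()

    # "A": the socket file is still on disk, but nothing is receiving on it.
    sender = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    file_exists = os.path.exists(path)
    try:
        sender.sendto(b"ping", path)
        return file_exists, None
    except OSError as exc:
        return file_exists, errno.errorcode.get(exc.errno)
    finally:
        sender.close()
        os.unlink(path)
        os.rmdir(workdir)
```

This matches the situation in the question: the bind file exists, yet the send fails, because the existence of the file says nothing about whether a process still has the socket open.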

pilcrow