Hopefully someone can help us as we're reaching as far as investigation can go!

We've got a simple asynchronous socket server written in C# that accepts connections from an ASP.NET web application, receives a message, performs some processing (usually against a DB but other systems too) and then sends a response back to the client. The client is in charge of closing the connection.

We've been having issues where if the system is under heavy load over a long period of time (days usually), CLOSE_WAIT sockets build up on the server box (netstat -a) to an extent that the process will not accept any further connections. At that point we have to bounce the process and off it runs again.

We've tried running some load tests of our ASP.NET application to attempt to replicate the problem (because inferring some issue from the code wasn't possible). We think we've managed this and ended up with a WireShark packet trace of the issue manifesting itself as a SocketException in the socket server's logs:

System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host
   at System.Net.Sockets.Socket.BeginSend(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags, AsyncCallback callback, Object state)

I've tried to reproduce the issue from the packet trace as a single-threaded process talking directly to the socket server (using the same code the ASP.NET app does), but have been unable to.

Has anybody got any suggestions of next things to try, check for or obvious things we may be doing wrong?

+2  A: 

The client is in charge of closing the connection.

Both the client and the server must close and Shutdown the socket. Either the client is not finishing the close (unlikely - since it'd have its finalizer run) or the server is not shutting down the socket (likely).

using (Socket s = new Socket(/* */)) {
  /* Do stuff */
  s.Shutdown(SocketShutdown.Both);  // stop sends and receives; the peer sees a FIN
  s.Close();                        // release the socket handle
}
Mark Brackett
On the client side the socket is closed as part of a using(..) block - but we do not at this time do .Shutdown and .Close explicitly - which isn't a problem with normal testing. The server explicitly does both in all code paths we can find (it's complicated because it's async).
Kieran Benton
@Kieran - the fact that bouncing the server process clears the CLOSE_WAITs indicates that you're not closing somewhere, I think.
Mark Brackett
A: 

CLOSE_WAIT's are meant to hang around for a while after a socket is closed, to prevent re-using the same socket number and receiving packets from the old connection. This will only give you grief if you're opening and closing a huuuuge number of sockets really quickly.

EDIT - It should be TIME_WAIT, not CLOSE_WAIT above.

Chris
They can hang around for a lot longer than that if for some reason the connection gets wedged, see: http://blog.zhuzhaoyuan.com/2009/03/a-word-on-time_wait-and-close_wait/. It's not a natural thing like TIME_WAIT.
Kieran Benton
Am I getting CLOSE_WAIT and TIME_WAIT confused or something?
Chris
You're thinking of TIME_WAIT, Chris.
Len Holgate
A: 

You shouldn't be leaving the responsibility of closing the TCP sockets only up to the client. What happens if the client process/machine crashes?

Ideally you should have a timeout in place so that if no traffic is received on a connected socket after a certain amount of time then it gets closed by the server.
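
A minimal sketch of such a server-side idle timeout, assuming an asynchronous BeginReceive-style server like the one described in the question. The IdleConnection class, its field names, and the 200 ms timeout used in the demo are hypothetical and chosen only for illustration; a real server would pick a timeout that suits its traffic.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading;

// Hypothetical wrapper: closes the socket if no data arrives within idleMs.
class IdleConnection
{
    readonly Socket sock;
    readonly Timer idleTimer;
    readonly int idleMs;
    readonly byte[] buffer = new byte[4096];
    public readonly ManualResetEvent Closed = new ManualResetEvent(false);
    int closedFlag; // 0 = open, 1 = closed

    public IdleConnection(Socket sock, int idleMs)
    {
        this.sock = sock;
        this.idleMs = idleMs;
        // One-shot timer; it is re-armed every time data arrives.
        idleTimer = new Timer(_ => Close(), null, idleMs, Timeout.Infinite);
        sock.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None, OnReceive, null);
    }

    void OnReceive(IAsyncResult ar)
    {
        int n;
        try { n = sock.EndReceive(ar); }
        catch (SocketException) { n = 0; }          // treat a reset like a close
        catch (ObjectDisposedException) { return; } // already closed by the timer

        if (n == 0) { Close(); return; }            // peer closed its side

        try
        {
            idleTimer.Change(idleMs, Timeout.Infinite); // traffic seen: restart the clock
            sock.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None, OnReceive, null);
        }
        catch (ObjectDisposedException) { }         // lost a race with the timer
        catch (SocketException) { Close(); }
    }

    void Close()
    {
        if (Interlocked.Exchange(ref closedFlag, 1) != 0) return; // close only once
        idleTimer.Dispose();
        try { sock.Shutdown(SocketShutdown.Both); } catch (SocketException) { }
        sock.Close();
        Closed.Set();
    }
}

static class IdleTimeoutDemo
{
    public static bool Run()
    {
        var listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        int port = ((IPEndPoint)listener.LocalEndpoint).Port;
        using (var client = new TcpClient())
        {
            client.Connect(IPAddress.Loopback, port);
            var conn = new IdleConnection(listener.AcceptSocket(), 200);
            // The client stays connected but silent; the server should give up.
            bool closed = conn.Closed.WaitOne(5000);
            listener.Stop();
            return closed;
        }
    }

    static void Main()
    {
        Console.WriteLine(Run() ? "idle connection closed" : "timed out");
    }
}
```

The timer is used here because Socket.ReceiveTimeout only applies to synchronous Receive calls, not to BeginReceive.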

sipwiz
+2  A: 

If your server is accumulating CLOSE_WAIT sockets then it's not closing its socket when the connection is complete. If you take a look at the state diagram in the comment to Chris' post you'll see that CLOSE_WAIT transitions to LAST_ACK once the socket is closed and the FIN has been sent.

You say that it's complex to determine where to do this due to the async nature? This shouldn't be a problem: close the socket when the callback from your recv reports 0 bytes (assuming you have nothing else to do once your client closes its side of the connection). If you do still need to send, do a Shutdown(recv) at that point and make a note that your client has closed; once you're done sending, do a Shutdown(send) and a Close.

You MAY be issuing a new read in the callback from the read that returned 0 (which indicates that the client has closed), and this may be what's causing your problems.
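
As a rough sketch of the simple case (nothing left to send once the client closes), a receive callback along these lines closes the socket on a 0-byte read instead of posting another read. The ConnState class and the loopback demo around it are hypothetical scaffolding, not the questioner's actual code.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading;

// Hypothetical per-connection state; the names are illustrative only.
class ConnState
{
    public Socket Sock;
    public byte[] Buffer = new byte[4096];
    public ManualResetEvent Closed = new ManualResetEvent(false);
}

static class ZeroByteReadDemo
{
    static void OnReceive(IAsyncResult ar)
    {
        var st = (ConnState)ar.AsyncState;
        int n;
        try { n = st.Sock.EndReceive(ar); }
        catch (SocketException) { n = 0; }          // treat a reset like a close
        catch (ObjectDisposedException) { return; } // already closed elsewhere

        if (n == 0)
        {
            // The peer has closed its side: do NOT post another read.
            // Shut down and close so the socket can leave CLOSE_WAIT.
            try { st.Sock.Shutdown(SocketShutdown.Both); } catch (SocketException) { }
            st.Sock.Close();
            st.Closed.Set();
            return;
        }

        // ... process the first n bytes of st.Buffer here, then wait for more.
        st.Sock.BeginReceive(st.Buffer, 0, st.Buffer.Length, SocketFlags.None, OnReceive, st);
    }

    public static bool Run()
    {
        var listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        int port = ((IPEndPoint)listener.LocalEndpoint).Port;

        using (var client = new TcpClient())
        {
            client.Connect(IPAddress.Loopback, port);
            var st = new ConnState { Sock = listener.AcceptSocket() };
            st.Sock.BeginReceive(st.Buffer, 0, st.Buffer.Length, SocketFlags.None, OnReceive, st);

            client.GetStream().Write(new byte[] { 1, 2, 3 }, 0, 3);
            client.Close();   // the client closes first, as in the question

            bool closed = st.Closed.WaitOne(5000);
            listener.Stop();
            return closed;
        }
    }

    static void Main()
    {
        Console.WriteLine(Run() ? "server closed its socket" : "timed out");
    }
}
```

If the server still has a response in flight when the 0-byte read arrives, the close would move to the send-completion path instead, as described above.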

Len Holgate
+1  A: 

Look at the diagram

http://en.wikipedia.org/wiki/File:Tcp_state_diagram_fixed.svg

Your client closed the connection by calling close(), which sent a FIN to the server socket. The server's TCP stack ACKed the FIN and the socket's state changed to CLOSE_WAIT, where it stays until the server calls close() on that socket.

Your server program needs to detect that the client has closed the connection and then close() its own socket immediately to free it up. How? Refer to read(): on reading end-of-file (meaning a FIN has been received), it returns zero.
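
In C#, the blocking equivalent of read() is Socket.Receive, which also returns 0 at end-of-file. A small loopback sketch of the idea follows; the EofDemo and DrainAndClose names are made up for this example.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

static class EofDemo
{
    // Reads until the peer closes, then closes our side; returns total bytes read.
    public static int DrainAndClose(Socket s)
    {
        var buf = new byte[1024];
        int total = 0, n;
        while ((n = s.Receive(buf)) > 0)   // 0 means the peer's FIN has arrived
            total += n;
        s.Close();                         // our close lets the socket leave CLOSE_WAIT
        return total;
    }

    public static int Run()
    {
        var listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        int port = ((IPEndPoint)listener.LocalEndpoint).Port;
        using (var client = new TcpClient())
        {
            client.Connect(IPAddress.Loopback, port);
            Socket server = listener.AcceptSocket();
            client.GetStream().Write(new byte[] { 1, 2, 3, 4, 5 }, 0, 5);
            client.Close();                // client-side close sends the FIN
            int total = DrainAndClose(server);
            listener.Stop();
            return total;
        }
    }

    static void Main()
    {
        Console.WriteLine(Run());
    }
}
```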

yogman