MSDN states that Socket.Shutdown can throw a SocketException. This happened to me in production recently, after I introduced a load balancer between my clients and my server, but I cannot reproduce it in testing without a load balancer. Can you?
Some background: I have a server application written in C# that uses TCP sockets to communicate with clients. The application protocol is very simple for the server: accept connection, read request, send response, wait for client shutdown (read expecting 0 bytes), shutdown.
This code has been in production without issue for many years. However, after a load balancer was introduced in front of multiple server machines, one of the server processes crashed due to an unhandled SocketException that was raised when the server called Socket.Shutdown. The particular client had timed out whilst waiting for the server to respond and attempted to close the connection early. The exception message on the server was "An existing connection was forcibly closed by the remote host." It is not unusual for the client to do this, but obviously prior to the load balancer the server was raising this error at a different point in the code. Still, it's clearly a server bug and the fix is obvious: handle the exception.
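For reference, the fix could look something like the following. This is only a minimal sketch: `TryShutdown` is a hypothetical helper name, not part of my server or of the framework, and logging to the console is an assumption made for illustration.

```csharp
using System;
using System.Net.Sockets;

static class SocketExtensions
{
    // Hypothetical helper: returns false instead of throwing when the
    // connection has already been reset or the socket has been disposed.
    public static bool TryShutdown(this Socket socket, SocketShutdown how)
    {
        try
        {
            socket.Shutdown(how);
            return true;
        }
        catch (SocketException ex)
        {
            // For a reset connection this is typically SocketError.ConnectionReset.
            Console.Error.WriteLine($"Shutdown failed: {ex.SocketErrorCode}");
            return false;
        }
        catch (ObjectDisposedException)
        {
            // The socket was already closed elsewhere.
            return false;
        }
    }
}
```

With this in place the server would call `client.TryShutdown(SocketShutdown.Both)` and then close the socket regardless of the result.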
However, using a test client application (also written in C#), I cannot find a sequence of operations that will cause the server to raise an exception during Socket.Shutdown. It appears that the load balancer did something unusual to the TCP packets, but still, I dislike using that as an excuse for failing to reproduce the issue.
I can run both the server and client code in the debugger, and I have Wireshark watching the packets.
On the client side, after the connection is established, the operations are:
Socket.Send() // single call
Socket.Receive() // this one times out in our scenario
Socket.XXX() // various choices as described below
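The client sequence above can be sketched roughly as follows. The class and method names, the buffer size, and the use of `ReceiveTimeout` to produce the timeout are assumptions for illustration; the final call stands in for `Socket.XXX()` and is the part that varies between the scenarios below.

```csharp
using System;
using System.Net.Sockets;

static class ClientFlowSketch
{
    // Hypothetical test-client routine; 'socket' is already connected.
    public static void Run(Socket socket, byte[] request, int timeoutMs)
    {
        socket.ReceiveTimeout = timeoutMs; // milliseconds
        socket.Send(request);              // single call

        var buffer = new byte[4096];
        try
        {
            int n = socket.Receive(buffer); // times out in this scenario
            Console.WriteLine($"Received {n} bytes");
        }
        catch (SocketException ex) when (ex.SocketErrorCode == SocketError.TimedOut)
        {
            Console.WriteLine("Timed out waiting for the response");
        }

        // Socket.XXX(): scenario A shown here; B and C swap this for
        // Shutdown(SocketShutdown.Both) or Close().
        socket.Shutdown(SocketShutdown.Send);
    }
}
```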
On the server side, after the connection is established, the operations are:
1) Socket.Receive() //multiple calls until complete message is received
2) // Processing...
3) Socket.Send() // single call
4) Socket.Receive() // single call expecting 0 bytes
5) Socket.Shutdown()
Presume each call is wrapped with try..catch(SocketException)
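The server steps can be sketched as below. This is not my production code: the newline-terminated framing, the fixed "OK" response, and the names `ServerSketch` and `HandleClient` are all assumptions, and a single try block stands in for the per-call try..catch to keep the sketch short.

```csharp
using System;
using System.Net.Sockets;
using System.Text;

static class ServerSketch
{
    // Hypothetical handler for one accepted connection.
    public static void HandleClient(Socket client)
    {
        var buffer = new byte[4096];
        try
        {
            // 1) Receive until the complete message has arrived
            //    (assumption: a message ends with '\n').
            var request = new StringBuilder();
            while (true)
            {
                int n = client.Receive(buffer);
                if (n == 0) return; // peer closed before sending a full request
                request.Append(Encoding.ASCII.GetString(buffer, 0, n));
                if (request[request.Length - 1] == '\n') break;
            }

            // 2) Processing (placeholder response).
            byte[] response = Encoding.ASCII.GetBytes("OK\n");

            // 3) Single send.
            client.Send(response);

            // 4) Single receive, expecting 0 bytes (client shutdown).
            int remaining = client.Receive(buffer);
            if (remaining != 0)
                Console.Error.WriteLine("Unexpected data after response");

            // 5) Shutdown: the call that threw in production.
            client.Shutdown(SocketShutdown.Both);
        }
        catch (SocketException ex)
        {
            Console.Error.WriteLine($"Socket error: {ex.SocketErrorCode}");
        }
        finally
        {
            client.Close();
        }
    }
}
```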
A) If I pause the server during step 2, wait for the client to time out, and initiate a client shutdown using Socket.Shutdown(SocketShutdown.Send), a FIN packet is sent to the server. When the server resumes processing, all the remaining calls (steps 3 through 5) succeed, because that is a perfectly acceptable TCP flow.
B) If I pause the server during step 2, wait for the client to time out, and initiate a client shutdown using Socket.Shutdown(SocketShutdown.Both) or Socket.Close(), again a FIN packet is sent to the server. When the server resumes processing, step 3 succeeds, but it causes the client to send a RST packet in response, as it is no longer accepting data. If this RST arrives before step 4, then Socket.Receive throws and step 5 succeeds. If it arrives after step 4, then Socket.Receive succeeds (returns 0 bytes), and step 5 still succeeds.
C) If the client has "Don't Linger" set (Linger enabled with a 0 timeout), and I pause the server during processing, wait for the client to time out, and initiate a client shutdown using Socket.Shutdown(SocketShutdown.Both) or Socket.Close(), a RST packet is immediately sent to the server. When the server resumes processing, steps 3 and 4 fail, but step 5 still succeeds.
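The abortive close used in scenario C can be set up on the test client roughly like this. `AbortiveClose` and its containing class are hypothetical names; `LingerState` and `LingerOption` are the standard System.Net.Sockets members.

```csharp
using System.Net.Sockets;

static class ClientAbortSketch
{
    // Hypothetical helper: closes the connection abortively.
    public static void AbortiveClose(Socket socket)
    {
        // Linger enabled with a 0-second timeout: Close() discards any
        // unsent data and the stack sends RST rather than the normal FIN.
        socket.LingerState = new LingerOption(true, 0);
        socket.Close();
    }
}
```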
I think what puzzles me most is that Socket.Shutdown appears to ignore my test client's RST packets, and yet evidently my load balancer was able to send a RST packet that was not ignored. What am I missing? What else can I try?