Without seeing your code, I'll have to guess.
You get a zero window in TCP when there is no room left in the receiver's recv buffer.
There are a number of ways this can occur. One common cause is sending over a LAN or another relatively fast network connection when one computer is significantly faster than the other. As an extreme example, say you've got a 3 GHz computer sending as fast as possible over Gigabit Ethernet to a machine running a 1 GHz CPU. Since the sender can send much faster than the receiver is able to read, the receiver's recv buffer fills up, causing the TCP stack to advertise a zero window to the sender.
Now this can cause problems on both the sending and receiving sides if they're not both ready to deal with it. On the sending side, the send buffer can fill up, and calls to send will either block or, if you're using non-blocking I/O, fail. On the receiving side, you could be spending so much time on I/O that the application has no chance to process any of its data, giving the appearance of being locked up.
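To make the non-blocking failure mode concrete, here is a minimal sketch of what a sender sees when its send buffer is full. The helper name sendWhenWritable and the retry-once policy are my own assumptions for illustration, not something from your code; MSG_DONTWAIT is the Linux per-call equivalent of a non-blocking socket.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: try one non-blocking send. If the send buffer is
 * full (EAGAIN/EWOULDBLOCK), wait up to timeoutMs for space and retry once.
 * Returns bytes sent, or -1 with errno set. */
ssize_t sendWhenWritable(int s, const void *buf, size_t len, int timeoutMs) {
    ssize_t n = send(s, buf, len, MSG_DONTWAIT | MSG_NOSIGNAL);
    if (n >= 0 || (errno != EAGAIN && errno != EWOULDBLOCK))
        return n;                 /* success, or an error other than "buffer full" */

    struct pollfd pfd = { .fd = s, .events = POLLOUT };
    int ready = poll(&pfd, 1, timeoutMs);
    if (ready <= 0) {
        if (ready == 0)
            errno = EAGAIN;       /* still no room after waiting */
        return -1;
    }
    return send(s, buf, len, MSG_DONTWAIT | MSG_NOSIGNAL);
}
```

A return of -1 with errno set to EAGAIN here means the peer has not drained its side yet, which is exactly the situation a zero window produces.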
Edit
From some of your answers and code, it sounds like your app is single-threaded and you're trying to do non-blocking sends for some reason. I assume you're setting the socket to non-blocking in some other part of the code.
Generally, I would say that this is not a good idea. Ideally, if you're worried about your app hanging on a send(2), you should set a long timeout on the socket using setsockopt and use a separate thread for the actual sending.
See socket(7):
SO_RCVTIMEO and SO_SNDTIMEO
Specify the receiving or sending timeouts until reporting an error. The
parameter is a struct timeval. If an
input or output function blocks for
this period of time, and data has been
sent or received, the return value of
that function will be the amount of
data transferred; if no data has been
transferred and the timeout has been
reached then -1 is returned with errno
set to EAGAIN or EWOULDBLOCK just as
if the socket was specified to be
nonblocking. If the timeout is set to
zero (the default) then the operation
will never timeout.
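As a rough sketch of the man page text above, setting both timeouts could look like this. The helper name setIoTimeouts is hypothetical, and using the same value for both directions is just an assumption to keep the example short.

```c
#include <sys/socket.h>
#include <sys/time.h>

/* Hypothetical helper: apply the same timeout to sends and receives on s.
 * Returns 0 on success, -1 on failure with errno set by setsockopt. */
int setIoTimeouts(int s, long seconds) {
    struct timeval tv;
    tv.tv_sec = seconds;
    tv.tv_usec = 0;

    if (setsockopt(s, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof tv) < 0)
        return -1;
    if (setsockopt(s, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv) < 0)
        return -1;
    return 0;
}
```

After this call a blocking send or recv that makes no progress for the given number of seconds returns -1 with errno set to EAGAIN or EWOULDBLOCK, as described in the quoted text.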
Your main thread can push each file descriptor into a queue, using, say, a boost mutex for queue access, then start 1 to N threads to do the actual sending using blocking I/O with send timeouts.
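If you're not using boost, the same idea can be sketched in plain C with pthreads. Everything here (the FdQueue type, the function names, the fixed capacity of 64) is hypothetical and only meant to show the shape of the queue. Thread creation is left out, so each sender thread is assumed to call fdQueuePop in a loop.

```c
#include <pthread.h>
#include <stddef.h>

#define FD_QUEUE_CAP 64            /* hypothetical fixed capacity */

typedef struct {
    int fds[FD_QUEUE_CAP];         /* ring buffer of file descriptors */
    int head, count;
    pthread_mutex_t lock;
    pthread_cond_t notEmpty;
} FdQueue;

void fdQueueInit(FdQueue *q) {
    q->head = 0;
    q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->notEmpty, NULL);
}

/* Called by the main thread. Returns 0 on success, -1 if the queue is full. */
int fdQueuePush(FdQueue *q, int fd) {
    int rc = -1;
    pthread_mutex_lock(&q->lock);
    if (q->count < FD_QUEUE_CAP) {
        q->fds[(q->head + q->count) % FD_QUEUE_CAP] = fd;
        q->count++;
        rc = 0;
        pthread_cond_signal(&q->notEmpty);   /* wake one waiting sender */
    }
    pthread_mutex_unlock(&q->lock);
    return rc;
}

/* Called by each sender thread. Blocks until a descriptor is available. */
int fdQueuePop(FdQueue *q) {
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->notEmpty, &q->lock);
    int fd = q->fds[q->head];
    q->head = (q->head + 1) % FD_QUEUE_CAP;
    q->count--;
    pthread_mutex_unlock(&q->lock);
    return fd;
}
```

The condition variable keeps idle sender threads asleep instead of spinning on the mutex while the queue is empty.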
Your send function should look something like this (assuming you're setting a timeout):
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

// blocking send, timeout is handled by caller reading errno on short send
ssize_t doSend(int s, const void *buf, size_t dataLen) {
    size_t totalSent = 0;
    while (totalSent < dataLen)
    {
        ssize_t bytesSent
            = send(s, (const char *)buf + totalSent, dataLen - totalSent, MSG_NOSIGNAL);
        if (bytesSent < 0) {
            if (errno == EINTR)
                continue;   // interrupted by a signal: restart the send
            break;          // timeout or real error: stop, errno says which
        }
        totalSent += (size_t)bytesSent;
    }
    return (ssize_t)totalSent;
}
The MSG_NOSIGNAL flag suppresses SIGPIPE, so your application isn't killed by writing to a socket that's been closed or reset by the peer; send fails with EPIPE instead. Sometimes I/O operations are interrupted by signals, and checking for EINTR allows you to restart the send.
Generally, you should call doSend in a loop with chunks of data no larger than the segment size, which you can read with the TCP_MAXSEG socket option.
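A possible shape for that loop is sketched below. The names segmentSizeOrDefault and sendInChunks are hypothetical, and so is the idea of falling back to a caller-supplied size when TCP_MAXSEG can't be read (for example on a non-TCP socket). doSend is repeated so the sketch compiles on its own.

```c
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Same blocking send as above: loops until all data is sent or send fails. */
ssize_t doSend(int s, const void *buf, size_t dataLen) {
    size_t totalSent = 0;
    while (totalSent < dataLen) {
        ssize_t n = send(s, (const char *)buf + totalSent,
                         dataLen - totalSent, MSG_NOSIGNAL);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            break;
        }
        totalSent += (size_t)n;
    }
    return (ssize_t)totalSent;
}

/* Hypothetical helper: read the segment size via TCP_MAXSEG, or return
 * fallback if the option can't be read on this socket. */
size_t segmentSizeOrDefault(int s, size_t fallback) {
    int mss = 0;
    socklen_t len = sizeof mss;
    if (getsockopt(s, IPPROTO_TCP, TCP_MAXSEG, &mss, &len) < 0 || mss <= 0)
        return fallback;
    return (size_t)mss;
}

/* Hypothetical helper: send buf in pieces of at most chunk bytes.
 * Returns the number of bytes sent; a short count means doSend stopped
 * early and errno holds the reason. */
ssize_t sendInChunks(int s, const void *buf, size_t len, size_t chunk) {
    size_t done = 0;
    if (chunk == 0)
        return 0;                       /* nothing sensible to do */
    while (done < len) {
        size_t want = len - done < chunk ? len - done : chunk;
        ssize_t sent = doSend(s, (const char *)buf + done, want);
        done += (size_t)sent;
        if ((size_t)sent < want)
            break;                      /* short send: timeout or error */
    }
    return (ssize_t)done;
}
```

A caller would typically do something like sendInChunks(s, data, dataLen, segmentSizeOrDefault(s, 1460)), where 1460 is only an illustrative fallback, and treat a short count with errno equal to EAGAIN or EWOULDBLOCK as a send timeout.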
On the receive side you can write a similar blocking recv function using a timeout in a separate thread.
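A minimal sketch of such a receive function, mirroring the send side, might look like this. The name doRecv is hypothetical, and it assumes you know how many bytes you expect and have set SO_RCVTIMEO on the socket.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical blocking receive: reads until dataLen bytes have arrived,
 * the peer closes the connection, or recv fails (timeout or error).
 * Returns the number of bytes received; on a short count the caller
 * checks errno, as with the send side. */
ssize_t doRecv(int s, void *buf, size_t dataLen) {
    size_t totalRecv = 0;
    while (totalRecv < dataLen) {
        ssize_t n = recv(s, (char *)buf + totalRecv, dataLen - totalRecv, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal: retry */
            break;          /* timeout or real error */
        }
        if (n == 0)
            break;          /* peer closed the connection */
        totalRecv += (size_t)n;
    }
    return (ssize_t)totalRecv;
}
```

Reading promptly in a dedicated thread like this is also what keeps the recv buffer from filling up and advertising a zero window in the first place.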