Hi,

I'm sending a large amount of data in one go between a client and server written in C#. It works fine when I run the client and server on my local machine, but when I put the server on a remote computer on the internet it seems to drop data.

I send 20000 strings using the socket.Send() method and receive them using a loop which calls socket.Receive(). Each string is delimited by unique characters, which I use to count the number received (this is the protocol, if you like). The protocol is proven, in that even with fragmented messages each string is correctly counted. On my local machine I get all 20000; over the internet I get anything between 17000 and 20000. It seems to get worse the slower the remote computer's connection is. To add to the confusion, turning on Wireshark seems to reduce the number of dropped messages.

First of all, what is causing this? Is it a TCP/IP issue or something wrong with my code?

Secondly, how can I get round this? Receiving all of the 20000 strings is vital.

Socket receiving code:

private static readonly Encoding encoding = new ASCIIEncoding();
///...
while (socket.Connected)
{
    byte[] recvBuffer = new byte[1024];
    int bytesRead = 0;

    try
    {
        bytesRead = socket.Receive(recvBuffer);
    }
    catch (SocketException)
    {
        if (!socket.Connected)
        {
            return;
        }
    }

    if (bytesRead == 0)
    {
        // Receive returns 0 once the remote side has closed the connection.
        break;
    }

    string input = encoding.GetString(recvBuffer, 0, bytesRead);
    CountStringsIn(input);
}

Socket sending code:

private static readonly Encoding encoding = new ASCIIEncoding();
//...
socket.Send(encoding.GetBytes(message)); // message: one delimited string
+1  A: 

How long is each of the strings? If they aren't exactly 1024 bytes, the TCP/IP stack will merge them into one continuous stream, which your Receive call then reads in large blocks.

For example, using three Send calls to send "A", "B", and "C" will most likely come to your remote client as "ABC" (as either the remote stack or your own stack will buffer the bytes until they are read). If you need each string to come without it being merged with other strings, look into adding in a "protocol" with an identifier to show the start and end of each string, or alternatively configure the socket to avoid buffering and combining packets.
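As a rough illustration of the "protocol" idea above, here is a minimal sketch of a receiver-side helper that buffers incoming text and only hands back complete messages. The class name DelimitedMessageReader and the delimiter you pass in are assumptions made for this example, not anything from the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: accumulates received text and returns complete messages.
public sealed class DelimitedMessageReader
{
    private readonly string delimiter; // assumed end-of-message marker
    private readonly StringBuilder pending = new StringBuilder();

    public DelimitedMessageReader(string delimiter)
    {
        if (string.IsNullOrEmpty(delimiter))
        {
            throw new ArgumentException("Delimiter must not be empty.", nameof(delimiter));
        }
        this.delimiter = delimiter;
    }

    // Call once per Receive with the decoded chunk.
    // Returns every message that the chunk completed (possibly none).
    public List<string> Append(string chunk)
    {
        pending.Append(chunk);
        string buffered = pending.ToString();
        var messages = new List<string>();

        int start = 0;
        int end;
        while ((end = buffered.IndexOf(delimiter, start, StringComparison.Ordinal)) >= 0)
        {
            messages.Add(buffered.Substring(start, end - start));
            start = end + delimiter.Length;
        }

        // Keep the incomplete tail (which may end in half a delimiter) for the next call.
        pending.Clear();
        pending.Append(buffered, start, buffered.Length - start);
        return messages;
    }
}
```

With a "\r\n" delimiter, feeding it "A\r\nB\r" and then "\nC\r\n" yields "A" from the first call and "B", "C" from the second, so a delimiter split across two reads is still handled. Decoding chunk by chunk like this is only safe for single-byte encodings such as ASCII.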

Matthew Iselin
Good call. Socket.NoDelay = true is better than Socket.SetSocketOption with the NoDelay option, so do it this way instead of the way I suggested in my answer.
Jeff Tucker
+3  A: 

It's definitely not TCP's fault. TCP guarantees in-order, exactly-once delivery.

Which strings are "missing"? I'd wager it's the last ones; try flushing from the sending end.

Moreover, your "protocol" here (I'm talking about the application-layer protocol you're inventing) is lacking: you should consider sending the number of objects and/or their lengths so the receiver knows when it has actually finished receiving them.

DarkSquid
OK, 2 downvotes...what'd I get wrong here?
DarkSquid
There's no such thing as flushing from the sending end with Socket.Send(). You can force it to send a PSH flag, which asks the receiver nicely to stop buffering and pass everything in its buffer up to the application layer, but that's about it, and it's hard to do. If strings are "missing", the only possibility is that they're being buffered in Winsock and you're not reading them for some reason. Also, if Winsock's buffer fills up before you read it and it can't buffer new input, it will usually send a RST packet to the sender.
Jeff Tucker
And I didn't downvote you. I'm just pointing out a possible problem with your answer; everything else about it is fine.
Jeff Tucker
@Jeff Tucker: I believe NetworkStream has a flush method.
dboarman
+1  A: 

Well there's one thing wrong with your code to start with, if you're counting the number of calls to Receive which complete: you appear to be assuming that you'll see as many Receive calls finish as you made Send calls.

TCP is a stream-based protocol - you shouldn't be worrying about individual packets or reads; you should be concerned with reading the data, expecting that sometimes you won't get a whole message in one packet and sometimes you may get more than one message in a single read. (One read may not correspond to one packet, too.)

You should either prefix each message with its length before sending, or have a delimiter between messages.
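For the length-prefix option, a minimal sketch might look like the following. The class and method names are made up for illustration, and the 4-byte big-endian header and ASCII payload are assumptions; any fixed header format agreed by both sides would do.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical framing helper: each message is a 4-byte length followed by the payload.
public static class LengthPrefixedFraming
{
    // Writes a 4-byte big-endian length, then the ASCII payload.
    public static void WriteMessage(Stream stream, string message)
    {
        byte[] payload = Encoding.ASCII.GetBytes(message);
        byte[] header = BitConverter.GetBytes(IPAddress.HostToNetworkOrder(payload.Length));
        stream.Write(header, 0, header.Length);
        stream.Write(payload, 0, payload.Length);
    }

    // Reads one message, or returns null if the stream ended cleanly before a new header.
    public static string ReadMessage(Stream stream)
    {
        byte[] header = new byte[4];
        if (!ReadExactly(stream, header, true))
        {
            return null;
        }
        int length = IPAddress.NetworkToHostOrder(BitConverter.ToInt32(header, 0));
        if (length < 0)
        {
            throw new InvalidDataException("Negative message length.");
        }
        byte[] payload = new byte[length];
        ReadExactly(stream, payload, false);
        return Encoding.ASCII.GetString(payload);
    }

    // Loops until the buffer is full, because one Read may return fewer bytes than requested.
    private static bool ReadExactly(Stream stream, byte[] buffer, bool allowEndAtStart)
    {
        int offset = 0;
        while (offset < buffer.Length)
        {
            int read = stream.Read(buffer, offset, buffer.Length - offset);
            if (read == 0)
            {
                if (offset == 0 && allowEndAtStart)
                {
                    return false;
                }
                throw new EndOfStreamException("Stream ended in the middle of a message.");
            }
            offset += read;
        }
        return true;
    }
}
```

To use this over a connected socket, both ends could wrap it in a NetworkStream (new NetworkStream(socket)) and call WriteMessage and ReadMessage on that stream. The receiver then never has to guess where a message ends, however the bytes were split across reads.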

Jon Skeet
Apologies - I've over-simplified my code example - I'm not counting the completed Receive calls. The strings are delimited and I count the number received in each buffer, accounting for ones which spread over 2 calls to Receive.
Nosrama
In that case please post a short but complete program which demonstrates the problem.
Jon Skeet
... so we can answer your problem without needing to resort to our crystal balls ;)
Matthew Iselin
+2  A: 

If you're dropping packets, you'll see a delay in transmission since TCP has to retransmit them. The delay can be very significant, although there's a TCP option called selective acknowledgement which, if supported by both sides, triggers a resend of only the packets that were dropped rather than every packet since the dropped one. There's no way to control that in your code. By default, you can always assume that TCP delivers every packet in order; if for some reason it can't, the connection will drop, either by a timeout or by one end of the connection sending a RST packet.

What you're seeing is most likely the result of Nagle's algorithm. Instead of sending each bit of data as you post it, it sends the first piece and then waits for an ACK from the other side. While it's waiting, it aggregates all the other data that you want to send, combines it into one big packet, and then sends that. Since the max size for TCP is 65k, it can combine quite a bit of data into one packet, although it's extremely unlikely that this will occur, particularly since Winsock's default buffer size is about 10k or so (I forget the exact amount). Additionally, if the max window size of the receiver is less than 65k, it will only send as much as the last advertised window size of the receiver. The window size also limits how much data Nagle's algorithm can aggregate prior to sending, because it can't send more than the window size.

The reason you see this is that on the internet, unlike your local network, that first ACK takes more time to return, so Nagle's algorithm aggregates more of your data into a single packet. Locally, the return is effectively instantaneous, so it's able to send your data as quickly as you can post it to the socket. You can disable Nagle's algorithm on the client side by using setsockopt (Winsock) or Socket.SetSocketOption (.NET), but I highly recommend that you DO NOT disable it on the socket unless you are 100% sure you know what you're doing. It's there for a very good reason.
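If you do decide to turn Nagle's algorithm off despite the warning above, a minimal sketch in .NET could look like this. The class and method names are invented for the example; Socket.NoDelay and the SetSocketOption call shown in the comment are the two standard ways of setting the same option.

```csharp
using System.Net.Sockets;

// Hypothetical factory, shown only to illustrate where the option is set.
public static class NagleSettings
{
    // Creates an unconnected TCP socket with Nagle's algorithm disabled.
    public static Socket CreateNoDelaySocket()
    {
        var socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);

        // Send small writes immediately instead of coalescing them.
        socket.NoDelay = true;

        // Equivalent, more verbose form:
        // socket.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.NoDelay, true);

        return socket;
    }
}
```

This only changes how writes are grouped into packets. The receiver still sees a byte stream and must still split it into messages itself.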

Jeff Tucker
Also, counting the strings might not be the best way to determine what's going on. Count the total number of bytes received and compare it to the number of bytes sent (be sure to count the bytes sent as well). If they're different, something weird is going on that I'll have to ask the Winsock team about. If they're the same, your string parsing is broken for strings that span multiple buffers.
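A minimal sketch of the receiving half of that check, assuming the socket is wrapped in a NetworkStream and the sender closes the connection when it is done. The ByteCounter name is made up for this example.

```csharp
using System.IO;

// Hypothetical diagnostic helper: counts bytes without trying to parse them.
public static class ByteCounter
{
    // Reads until the stream ends and returns the total number of bytes seen.
    public static long CountBytes(Stream stream)
    {
        byte[] buffer = new byte[1024];
        long total = 0;
        int read;
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            total += read;
        }
        return total;
    }
}
```

On the sending side, Socket.Send returns the number of bytes it sent, so a running total such as totalSent += socket.Send(bytes) gives the figure to compare against.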
Jeff Tucker
Now counting bytes - thanks
Nosrama
This was the most informative answer and helped me rule out TCP/IP and winsocks as the problem.
Nosrama
So what *was* the problem?
GraemeF