I have one thread that is receiving data over a socket like this:

while (sock.Connected)
{
    // Receive Data (Block if no data)
    recvn = sock.Receive(recvb, 0, rlen, SocketFlags.None, out serr);

    if (recvn <= 0 || sock == null || !sock.Connected)
    {
        OnError("Error In Receive, recvn <= 0 || sock == null || !sock.Connected");
        return;
    }
    else if (serr != SocketError.Success)
    {
        OnError("Error In Receive, serr = " + serr);
        return;
    }

    // Copy Data Into Tokenizer
    tknz.Read(recvb, recvn);

    // Parse Data
    while (tknz.MoveToNext())
    {
        try
        {
            ParseMessageAndRaiseEvents(tknz.Buffer(), tknz.Length);
        }
        catch (System.Exception ex)
        {
            string BadMessage = ByteArrayToStringClean(tknz.Buffer(), tknz.Length);
            string msg = string.Format("Exception in MDWrapper Parsing Message, " +
                           "Ex = {0}, Msg = {1}", ex.Message, BadMessage);
            OnError(msg);
        }
    }
}

I kept seeing occasional errors in my parsing function indicating that the message wasn't valid. At first, I thought that my tokenizer class was broken. But after logging all the incoming bytes passed to the tokenizer, it turned out that the raw bytes in recvb weren't a valid message. I didn't think that corrupted data like this was possible with a TCP data stream.

I figured it had to be some type of buffer overflow so I set

sock.ReceiveBufferSize = 1024 * 1024 * 8;

and the parsing error never, ever occurs in testing (it happens often enough to replicate if I don't change the ReceiveBufferSize).

But my question is: why wasn't I seeing an exception or an error state or something if the socket's internal buffer was overflowing before I changed this buffer size?

+5  A: 

I assume your tokenizer expects text (UTF-8?), but socket streams work with byte data. A multibyte character can be split in transport, and a small buffer increases the likelihood of that happening.

If you are using ASCII you're safe; otherwise the solution would lie in using a TextReader as an intermediary.

Henk Holterman
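To illustrate the split-character point in the answer above, here is a minimal sketch. The class and method names are hypothetical and not from the thread; it assumes UTF-8 text and uses System.Text.Decoder, which keeps a trailing partial byte sequence between calls, instead of decoding each received chunk on its own.

```csharp
using System;
using System.Text;

// Hypothetical demo class; the names here are illustrative only.
public static class SplitCharacterDemo
{
    // Decodes every chunk independently. A character whose bytes straddle
    // two chunks is replaced by U+FFFD replacement characters.
    public static string DecodeNaively(byte[][] chunks)
    {
        var sb = new StringBuilder();
        foreach (byte[] chunk in chunks)
        {
            sb.Append(Encoding.UTF8.GetString(chunk));
        }
        return sb.ToString();
    }

    // Uses a single stateful Decoder, which holds back an incomplete
    // trailing sequence until the next chunk supplies the remaining bytes.
    public static string DecodeWithState(byte[][] chunks)
    {
        Decoder decoder = Encoding.UTF8.GetDecoder();
        var sb = new StringBuilder();
        foreach (byte[] chunk in chunks)
        {
            char[] chars = new char[decoder.GetCharCount(chunk, 0, chunk.Length)];
            int n = decoder.GetChars(chunk, 0, chunk.Length, chars, 0);
            sb.Append(chars, 0, n);
        }
        return sb.ToString();
    }
}
```

Feeding the bytes of "café" split between the two bytes of "é" should give the intact string from DecodeWithState and a damaged one from DecodeNaively. With pure ASCII data, as in this question, both behave the same.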
Thanks for the reply. I hadn't thought about that. But luckily, this is using ASCII. The packet structure for the data that I'm receiving has ASCII character 31 (unit separator) as the delimiter between messages. And just capturing the raw data from recvb and writing it to a file shows that the messages are invalid (at at least one point in the array, two messages clearly overlap without a character 31 between them).
Michael Covelli
@Michael, OK, but still check your tokenizer logic. From your code, the tokenizer is responsible for reassembling messages, i.e. it has to remember any incomplete message and append new data to it before attempting to find the next message.
Henk Holterman
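The reassembly requirement described in the comment above can be sketched roughly as follows. This is not the asker's tokenizer class: DelimitedReassembler and Feed are hypothetical names, and the only detail taken from the thread is the ASCII 31 delimiter between messages.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical reassembler; a sketch, not the tokenizer from the question.
public sealed class DelimitedReassembler
{
    private const byte Delimiter = 31; // ASCII unit separator, per the thread

    // Bytes of the current, not yet terminated message. They survive
    // between Feed calls so a message may span several receives.
    private readonly List<byte> pending = new List<byte>();

    // Consumes the first `count` bytes of `buffer` and returns every
    // message that was completed by this call (possibly none).
    public List<byte[]> Feed(byte[] buffer, int count)
    {
        var complete = new List<byte[]>();
        for (int i = 0; i < count; i++)
        {
            if (buffer[i] == Delimiter)
            {
                complete.Add(pending.ToArray());
                pending.Clear();
            }
            else
            {
                pending.Add(buffer[i]);
            }
        }
        return complete;
    }
}
```

Each call would take the values passed to the tokenizer in the question, i.e. something like Feed(recvb, recvn). A partial message at the end of one receive stays in the pending list until a later receive delivers its delimiter.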
The tokenizer does reassemble messages. There certainly might be a bug in it, but I don't think that's causing the issue that I'm seeing here. I put in logging of the recvb contents before they ever reach the tokenizer. I just write these bytes to a file as they come in, and the messages in there are not correct (i.e. two messages are in there overlapping, not separated by ASCII character 31).
Michael Covelli
+3  A: 

I would also suggest confirming that the sender of the data is checking the number of bytes successfully written, and not assuming that all bytes were written successfully.

This is a common mistake when using Socket.Send, and may explain why you see the problem going away when you up the buffer size.

It is the sender's responsibility to retry until all bytes have been successfully written.

Leon Breedt
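As a rough sketch of the retry loop the answer above describes, written in C# to match the question (the thread does not say what the actual sender is written in): SendAllSketch and both SendAll overloads are hypothetical names. The send operation is passed in as a delegate so the loop can be exercised without a real connection. Socket.Send returns the number of bytes it accepted, which can be less than requested, particularly on a non-blocking socket.

```csharp
using System;
using System.IO;
using System.Net.Sockets;

// Hypothetical helper; a sketch of "retry until everything is written".
public static class SendAllSketch
{
    // `send` takes (buffer, offset, count) and returns how many bytes
    // it actually accepted, which may be fewer than `count`.
    public static void SendAll(Func<byte[], int, int, int> send, byte[] data)
    {
        int sent = 0;
        while (sent < data.Length)
        {
            int n = send(data, sent, data.Length - sent);
            if (n <= 0)
            {
                // Assumption: treat "no progress" as fatal rather than spin.
                throw new IOException("Send made no progress");
            }
            sent += n; // Resume from the first unsent byte.
        }
    }

    // Convenience overload for a connected Socket.
    public static void SendAll(Socket sock, byte[] data)
    {
        SendAll((buf, off, len) => sock.Send(buf, off, len, SocketFlags.None), data);
    }
}
```

This sketch does not handle a non-blocking socket whose send buffer is full, where Send throws a SocketException with SocketError.WouldBlock. A real sender would need to wait for the socket to become writable and then continue from the same offset.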
Thanks for replying. That's a good point, it could be on the sender's side. That would explain why the socket is not reporting any errors. If the bytes I receive are being delivered correctly, but the sender is not sending everything to me because the buffer is filling up, that might be it. I guess that I should capture the packets sent by the server with Wireshark to confirm.
Michael Covelli
I think you're right. I managed to get the person with the server to log the output today, and from his log file: "send(): short send 98, should be 106." So it looks like he's not sending everything to me because my receive buffer is full. Thanks for your help!
Michael Covelli