
I've seen several uses of sockets where programmers send a command or some information over a TCP/IP socket, and expect it to be received in one call on the receiving side.

For example, transmitting:

mySocket.Send(Encoding.ASCII.GetBytes("SomeSpecificCommand"))

They assume the receive side will receive all the data in one call. For example:

Dim data(255) As Byte   
Dim nReceived As Integer = s.Receive(data, 0, data.Length, SocketFlags.None)
Dim str As String = Encoding.ASCII.GetString(data, 0, nReceived)
If str = "SomeSpecificCommand" Then
    DoStuff()
    ...

The example above doesn't use any terminator, so the programmer is relying on the assumption that the sockets implementation is not allowed, for example, to return "SomeSpecif" in a first call to Receive() and "cCommand" in a later call to Receive(). (Note: in the example, the buffer is sized to be larger than the expected string.)

I've never given this much thought before; I'd just assumed that this type of coding is unsafe and have always used delimiters. Have I been wasting my time (and processor cycles)?

+4  A: 

There is no guarantee that it will all arrive at the same time. The code (the app's protocol) needs to deal with the possibility that data from one send may arrive in multiple pieces or the possibility that data from more than one send could arrive in one receive.
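
To make that concrete, here's a minimal sketch (my illustration, not part of the answer above) of a receive loop that keeps calling Receive() until an expected number of bytes has arrived; the name ReceiveExact is invented for the example.

Imports System.Net.Sockets

Module ReceiveHelpers
    ' Keeps calling Receive() until exactly "count" bytes have arrived,
    ' because a single Receive() may return only part of what was sent.
    Sub ReceiveExact(s As Socket, buffer As Byte(), count As Integer)
        Dim got As Integer = 0
        While got < count
            Dim n As Integer = s.Receive(buffer, got, count - got, SocketFlags.None)
            If n = 0 Then Throw New SocketException() ' peer closed the connection early
            got += n
        End While
    End Sub
End Module

The complementary case (two sends coalescing into one receive) requires the application to keep any leftover bytes and parse messages out of a buffer, as the framing answers further down describe.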

dgnorton
Thanks - good to know I've not been wasting time. The thing that made me wonder is that in USB device drivers this assumption is correct and should be used since it is a lot more efficient.
ttt
It would be great to have a reference. Do you know of anywhere in the socket documentation where this is stated?
ttt
Even if it seems a lot more efficient, some environments require multiple responses. Think of a procedure where you request a login, the server responds saying you've logged in, then it responds saying that the user has new messages. To the server it is better that login and new messages each be their own procedure with their own response. Also, most server socket applications are lazy and work in queues, sending one part of the response at a time instead of preparing the full response and sending it out.
Justin
http://stackoverflow.com/questions/3824698/guarantee-tcp-packet-size or http://en.wikipedia.org/wiki/Transmission_Control_Protocol
dgnorton
A: 

No, it's not a good idea to assume that the server (assuming you're the client) is going to send you only one socket response. The server could be running through a list of procedures that return multiple results. I would continue to read from the socket until there is nothing left to pick up, then wait a few milliseconds and test again. If nothing shows up, chances are good that the server has finished sending responses.
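
For what it's worth, a rough sketch of that drain-pause-recheck approach might look like the following; the 50 ms pause, the buffer size and the name ReadUntilQuiet are arbitrary choices for illustration, and the approach trades latency (and some reliability) for simplicity.

Imports System.Text
Imports System.Net.Sockets
Imports System.Threading

Module DrainSketch
    ' Blocks for the first chunk, then keeps reading until the socket
    ' stays quiet for a short interval, as described above.
    Function ReadUntilQuiet(s As Socket) As String
        Dim sb As New StringBuilder()
        Dim buffer(1023) As Byte
        Dim n As Integer = s.Receive(buffer, 0, buffer.Length, SocketFlags.None)
        While n > 0
            sb.Append(Encoding.ASCII.GetString(buffer, 0, n))
            If s.Available = 0 Then
                Thread.Sleep(50) ' wait a few milliseconds...
                If s.Available = 0 Then Exit While ' ...still nothing, assume the server is done
            End If
            n = s.Receive(buffer, 0, buffer.Length, SocketFlags.None)
        End While
        Return sb.ToString()
    End Function
End Module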

Justin
I had the same assumption until I was working with telnet servers. I had to rewrite the majority of my telnet class.
Justin
And what if the other side sent two separate messages back-to-back? I've seen all of message-1 and part of message-2 arrive on the first receive and the rest of message-2 arrive on the next receive.
dgnorton
I've always received the separate messages one at a time. Ensure that you are only reading the bytes that are available for each response; otherwise you may be reading the next item in the socket's queue-like structure.
Justin
You might also have an End Of Stream or similar marker to help separate the responses.
Justin
My guess is that you can get away with it fine in many cases, but that it depends on the socket implementations, OS, network infrastructure and probably other factors. Does anyone know of anywhere in the documentation where this is clearly stated one way or the other?
ttt
Sockets are more a piece used by a server; it depends on how the server utilizes them. Some servers can use a push/pull-like approach where the client requests information and the server always returns one response per request, but most socket servers will want to interact with the client. If it's an interactive process then in most cases your best bet is to assume multiple responses for each request. Lastly, it is always a good idea to check for a response before you send out a request, just in case the server has returned an 'I'm busy for now' or 'no more requests please' message.
Justin
You should never rely on TCP preserving message boundaries. There are lots of things that could go wrong, e.g.: some proxy between your endpoints may rearrange messages; if you send a message larger than the MSS, your TCP/IP stack will send it in parts; ...
ninjalj
A: 

There are several types of sockets. TCP uses SOCK_STREAM sockets, which don't preserve message boundaries. SOCK_SEQPACKET sockets do preserve message boundaries.

EDIT: SCTP supports both SOCK_STREAM and SOCK_SEQPACKET.

ninjalj
+1  A: 

Short snippets of data sent in one short call to send() will usually arrive in one call to recv(), which is why code like that will work most of the time. However, it's not guaranteed and therefore bad practice to rely on it.

TCP buffers the data and may split it up as it sees fit. TCP tries to send as few packets as possible to conserve bandwidth, so it won't split up the data for no good reason. However, if it's been queueing up some data and the data from one call to send() happens to straddle a packet boundary, that data will be split up.

Alternatively, TCP could try to send it in one packet, but then a router anywhere along the path to the destination could come back and say "this packet is too big!". Then TCP will split it into smaller packets.

Daniel Stutzbach
+1  A: 

When sending data across a network, you should expect your data to be fragmented across multiple packets and structure your code and data to deal with this. In the example case where you are sending a handful of bytes, everything will work fine... until you start sending larger packets.

If you are expecting to receive one message at a time then you can just loop reading bytes for an interval after the first bytes arrive. This is simple but inefficient.

A delimiter could be used as suggested but then you have to guard against accidentally including the delimiter within the regular data. If you are only sending text then you can use null or some non-printable character. If you are sending binary data then this becomes more difficult as any occurrence of the delimiter within the data needs to be escaped by the sender and un-escaped by the receiver.
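
For the text-only case, a sketch of delimiter framing using a null byte as the separator could look like this; the class name, buffer size and ASCII assumption are mine, and a binary protocol would additionally need the escaping described above.

Imports System.Text
Imports System.Net.Sockets
Imports System.Collections.Generic

Public Class DelimitedReader
    Private ReadOnly _socket As Socket
    Private ReadOnly _pending As New List(Of Byte)()

    Public Sub New(s As Socket)
        _socket = s
    End Sub

    ' Returns the next null-delimited message, or Nothing if the peer closed.
    ' Bytes left over after the delimiter are kept for the next call, since one
    ' receive may contain the end of one message and the start of the next.
    Public Function ReadMessage() As String
        Dim buffer(1023) As Byte
        Dim pos As Integer = _pending.IndexOf(CByte(0))
        While pos < 0
            Dim n As Integer = _socket.Receive(buffer, 0, buffer.Length, SocketFlags.None)
            If n = 0 Then Return Nothing ' connection closed
            For i As Integer = 0 To n - 1
                _pending.Add(buffer(i))
            Next
            pos = _pending.IndexOf(CByte(0))
        End While
        Dim msg As String = Encoding.ASCII.GetString(_pending.GetRange(0, pos).ToArray())
        _pending.RemoveRange(0, pos + 1) ' drop the message and its delimiter
        Return msg
    End Function
End Class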

An alternative to delimiters is to add a field to the front of the data containing a message length. This is better than using a delimiter as it removes the need for escaping data and better than simply looping until a timer expires as it will be more responsive.
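
A sketch of that length-prefix idea, under assumptions of my own (a 4-byte length header in the machine's native byte order, invented helper names):

Imports System.Net.Sockets

Module LengthPrefixed
    Sub SendMessage(s As Socket, payload As Byte())
        s.Send(BitConverter.GetBytes(payload.Length)) ' 4-byte length field in front of the data
        s.Send(payload)
    End Sub

    Function ReceiveMessage(s As Socket) As Byte()
        Dim header(3) As Byte
        ReceiveExact(s, header)
        Dim length As Integer = BitConverter.ToInt32(header, 0)
        Dim payload(length - 1) As Byte
        ReceiveExact(s, payload)
        Return payload
    End Function

    ' Loops until the buffer is full, since one Receive() may return
    ' only part of what the sender passed to Send().
    Private Sub ReceiveExact(s As Socket, buffer As Byte())
        Dim got As Integer = 0
        While got < buffer.Length
            Dim n As Integer = s.Receive(buffer, got, buffer.Length - got, SocketFlags.None)
            If n = 0 Then Throw New SocketException() ' connection closed early
            got += n
        End While
    End Sub
End Module

A real protocol would also want to pin the byte order (e.g. with IPAddress.HostToNetworkOrder) and sanity-check the length before allocating a buffer for it.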

Martin Thomas
My current solution sounds pretty close to this - I respond to receive events and append received data to a linked list. My program then pulls out CRLF-separated plain-text commands from the list. The protocol allows these plain-text commands to be followed by binary data, as long as the byte count is specified in the plain-text command. My concern had been that this was totally OTT, but it sounds OK. I wonder how many global man-hours have gone into implementing this kind of scheme...!
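
A rough sketch of a reader for that kind of scheme (CRLF-terminated text commands, optionally followed by a binary payload whose byte count the command announces); all names here are illustrative, not the actual protocol:

Imports System.Text
Imports System.Net.Sockets
Imports System.Collections.Generic

Public Class CommandReader
    Private ReadOnly _socket As Socket
    Private ReadOnly _pending As New List(Of Byte)()

    Public Sub New(s As Socket)
        _socket = s
    End Sub

    ' Returns the next CRLF-terminated command line (without the CRLF).
    Public Function ReadCommand() As String
        Dim pos As Integer = FindCrLf()
        While pos < 0
            Fill()
            pos = FindCrLf()
        End While
        Dim line As String = Encoding.ASCII.GetString(_pending.GetRange(0, pos).ToArray())
        _pending.RemoveRange(0, pos + 2) ' drop the line and its CRLF
        Return line
    End Function

    ' Returns exactly "count" bytes of binary payload following a command.
    Public Function ReadPayload(count As Integer) As Byte()
        While _pending.Count < count
            Fill()
        End While
        Dim payload As Byte() = _pending.GetRange(0, count).ToArray()
        _pending.RemoveRange(0, count)
        Return payload
    End Function

    Private Sub Fill()
        Dim buffer(1023) As Byte
        Dim n As Integer = _socket.Receive(buffer, 0, buffer.Length, SocketFlags.None)
        If n = 0 Then Throw New SocketException() ' connection closed
        For i As Integer = 0 To n - 1
            _pending.Add(buffer(i))
        Next
    End Sub

    Private Function FindCrLf() As Integer
        For i As Integer = 0 To _pending.Count - 2
            If _pending(i) = 13 AndAlso _pending(i + 1) = 10 Then Return i
        Next
        Return -1
    End Function
End Class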
ttt