My application needs to send and receive XML data over a TCP socket. There is no way to include any kind of fixed-length header containing the message length. As far as I understand, data transmitted over TCP can arrive at the recipient like this:

  1. <messa

  2. ge><content

  3. >hi</content>

  4. </message>

But somehow this never happens: data sent with one Send() call (assuming it is no larger than the socket buffer) is always read completely by one Receive() call. Is the above scenario possible, given that the socket buffers of both endpoints are large enough and never exceeded?

+3  A: 

Yes, it is possible.

You really cannot assume that the buffer boundaries of a send() call on one side will match those seen by the corresponding recv() on the other end, even if that appears to be the case most of the time.

For example, if you're sending a lot of data, it's possible that the receiving OS will invoke TCP flow control and the sending OS will only be able to send part of a buffer. Or maybe the underlying network has a packet size limitation that requires things to be split up, or ...
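To make the point concrete, here is a minimal Python sketch (not from the original answer). It assumes a local `socket.socketpair()` stands in for the real connection, and the helper names `read_until_closed` and `send_two_read_small` are made up for illustration. Two messages are written with two separate sends, and the reader deliberately uses a 16-byte buffer, so the chunks it gets back cannot line up with the send boundaries.

```python
import socket


def read_until_closed(sock, bufsize=4096):
    """Collect every chunk recv() returns until the peer closes the stream."""
    chunks = []
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:  # b"" means the peer closed its sending side
            break
        chunks.append(chunk)
    return chunks


def send_two_read_small():
    """Send two messages with two sends, then read them back in small pieces."""
    writer, reader = socket.socketpair()  # local stand-in for a TCP connection
    sent = [
        b"<message><content>hi</content></message>",
        b"<message><content>bye</content></message>",
    ]
    for message in sent:
        writer.sendall(message)
    writer.close()
    # A 16-byte buffer forces chunk boundaries that ignore message boundaries.
    chunks = read_until_closed(reader, bufsize=16)
    reader.close()
    return sent, chunks


if __name__ == "__main__":
    sent, chunks = send_two_read_small()
    print(len(sent), "sends,", len(chunks), "receives")
    print(b"".join(chunks) == b"".join(sent))
```

The only thing the stream guarantees is that the bytes arrive in order: joining the received chunks gives back the joined sent messages, but the number and size of the chunks is up to the stack and the reader.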

David Gelhar
The data is really small. Can these things happen even if the data is 10 KB or less?
Yes, it can. Moreover, if you send more than one message relatively close in time, your Receive might read them both, or it might read the first and half of the next one. And if there's a spurious network problem at the right time, "close in time" might be several minutes.
nos
+1  A: 

This can easily happen if there is a proxy in between. If we assume there is no proxy, the client will receive the same packets as the server sends. If you send data in pieces smaller than the TCP MSS of your link, the client will probably receive each one in a single piece.

However, I would not rely on this. It is easy to tell the end of an XML message by seeing the close tag (</message>), so it's easy to parse XML from a stream.
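A minimal sketch of that idea in Python (not from the original answer): buffer whatever each receive returns and cut a message every time the close tag shows up. The class name `CloseTagFramer` is hypothetical, and the sketch assumes the root element is always `message`, that `message` elements are never nested, and that the text `</message>` never occurs inside CDATA or comments. A real implementation would need an incremental XML parser to drop those assumptions.

```python
class CloseTagFramer:
    """Reassemble XML messages from arbitrary stream chunks.

    Assumption: every message ends with the same close tag and that tag
    never appears anywhere else in the message.
    """

    def __init__(self, close_tag=b"</message>"):
        self._close_tag = close_tag
        self._buf = bytearray()

    def feed(self, chunk):
        """Add bytes from one receive; return the messages completed by it."""
        self._buf.extend(chunk)
        messages = []
        while True:
            end = self._buf.find(self._close_tag)
            if end < 0:
                break  # no complete message yet, keep buffering
            end += len(self._close_tag)
            # strip() drops whitespace that may separate consecutive messages
            messages.append(bytes(self._buf[:end]).strip())
            del self._buf[:end]
        return messages


if __name__ == "__main__":
    framer = CloseTagFramer()
    for piece in (b"<messa", b"ge><content", b">hi</content>", b"</message>"):
        print(framer.feed(piece))
```

Fed the four fragments from the question, `feed` returns an empty list three times and then `[b'<message><content>hi</content></message>']` on the last fragment. It also copes with the opposite case, where one receive carries one and a half messages: the complete one is returned and the remainder stays buffered.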

unbeli
Could a proxy split the data into chunks any way it wants? The data is really small, 10 KB or less.
This is not just a proxy issue; it can happen at any time. TCP is just a byte stream, not packet-oriented. It can happen, for example, if the network is congested, if you send data really fast, or if the OS network buffers are full. If you happen to get all the data with one Receive, you're lucky; it should not be relied on.
nos
@user375487 Yes, a proxy can split the data any way it wants. Also, 10 KB is over the TCP MSS, so the receiver may get smaller pieces if the network is slow enough.
unbeli
@nos No congested network can pass half of a TCP packet. But you are right that one should not rely on this; that's the main point of the answer.
unbeli
I'm just saying that if the network is congested and you write a message of 1000 bytes, the TCP stack might decide to send 100 of those bytes in one segment, another 100 bytes later, and so on. Or, if the TCP window is closed and you write 10 messages of 100 bytes, those might be buffered and sent as a single 1000-byte segment when the window opens again.
nos
A: 

Hi,

You can include the message length in your messages. When you send an XML message, prepend it with the message length in the first 4 bytes, followed by the XML itself. When you receive, take the first 4 bytes of the stream as the message length and then read exactly that many bytes to get the XML message.
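A rough Python sketch of this scheme, for readers who are free to change the wire format (not from the original answer). The answer only says "4 bytes"; the byte order is an assumption here (unsigned, network byte order via `struct`'s `"!I"`), and the function names are made up for illustration.

```python
import socket
import struct


def send_message(sock, payload):
    """Send a 4-byte big-endian length followed by the payload bytes."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, n):
    """Keep calling recv() until exactly n bytes have arrived."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:  # peer closed the stream in the middle of a message
            raise ConnectionError("stream closed before the full message arrived")
        buf.extend(chunk)
    return bytes(buf)


def recv_message(sock):
    """Read the 4-byte length header, then exactly that many payload bytes."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


if __name__ == "__main__":
    a, b = socket.socketpair()  # local stand-in for a TCP connection
    send_message(a, b"<message><content>hi</content></message>")
    print(recv_message(b))
    a.close()
    b.close()
```

Note that even the 4-byte header is read in a loop: a single receive is not guaranteed to return all four bytes at once.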

I assume, because of the question.
My bad then. Because I thought you said "There is no way to include any kind of a fixed-length header containing message length"
Yes, that's what I said. Only XML data must be transmitted; this is a requirement I can't change. You can't prepend HTTP requests or emails with any data you want unless you don't want them parsed, right? This is a similar case.
Depends. In HTTP, the length of the message is sent in the Content-Length header, or the response is sent using chunked transfer encoding. TCP is a stream-oriented protocol, and you cannot know whether you have read the last byte of the stream or whether more is on the way. In your case, if you send only the XML message over TCP, you cannot know whether you have read the whole reply, especially if you plan to receive big XML documents.
HTTP Content-Length, like all other HTTP headers, is part of the specification. In my case the specification is different and allows only XML messages of variable length. Now that I have an answer to my question, it's evident that parsing such messages will not be trivial, but since each request and response will be wrapped in a single XML element, it is not impossible.
Content-Length is required because it is not possible to detect the end of a message in a stream-oriented protocol like TCP. In HTTP 1.0, for instance, this was solved by closing the TCP connection, so that the reader sees end of stream (the -1 that some read APIs return). Anyway, I am just mentioning this information in case you need it.
Thanks, man. Anyway, it was not me who downvoted you. Even if I wanted to I would lack reputation. :)