Hello everybody,

I am writing an application in C, using libpcap. My program listens for new packets and parses them according to a grammar. The payload is actually XML.

Sometimes one packet is not enough to hold an XML document, so the XML is split across several packets. I want to add logic to handle these cases. However, I don't know in advance whether a packet contains the whole document. How do I know that more data will be sent after a given packet? How do I recognize that a new packet contains the rest of the data?

Do I have to use the TH_FIN flag? Could you please explain it to me?

Thanks, cateof

A: 

If you're using TCP, use a TCP library that gives you the data as a stream instead of trying to handle the packets yourself.

Anon.
A: 

Stream is good. Another option is to store the incoming data in a buffer (e.g. a char*) and search for the application's message-framing characters or, in the case of XML, the root end tag. Once you've found a complete XML message at the front of the buffer, pull it out and process it.
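
A minimal sketch of that buffering idea, under a few assumptions: the payloads are appended in stream order, the XML contains no NUL bytes, and the root element name does not also appear nested inside the document. The names xml_buf, xml_buf_append and xml_buf_take are made up for illustration and are not part of libpcap.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical accumulation buffer for one direction of one connection. */
struct xml_buf {
    char  *data;  /* accumulated payload, always NUL-terminated */
    size_t len;   /* bytes in use, not counting the terminator */
    size_t cap;   /* bytes allocated */
};

/* Append one packet payload. Returns 0 on success, -1 if out of memory. */
static int xml_buf_append(struct xml_buf *b, const char *payload, size_t n)
{
    assert(b != NULL);
    if (b->len + n + 1 > b->cap) {
        size_t cap = b->cap ? b->cap : 256;
        while (cap < b->len + n + 1)
            cap *= 2;                       /* grow geometrically */
        char *p = realloc(b->data, cap);
        if (p == NULL)
            return -1;
        b->data = p;
        b->cap = cap;
    }
    memcpy(b->data + b->len, payload, n);
    b->len += n;
    b->data[b->len] = '\0';
    return 0;
}

/* If the buffer holds everything up to and including end_tag, return a
 * malloc'd copy of that message and drop it from the buffer. Returns NULL
 * when the message is still incomplete (or when malloc fails). */
static char *xml_buf_take(struct xml_buf *b, const char *end_tag)
{
    assert(b != NULL);
    if (b->data == NULL)
        return NULL;
    char *hit = strstr(b->data, end_tag);
    if (hit == NULL)
        return NULL;                        /* wait for more packets */
    size_t msg_len = (size_t)(hit - b->data) + strlen(end_tag);
    char *msg = malloc(msg_len + 1);
    if (msg == NULL)
        return NULL;
    memcpy(msg, b->data, msg_len);
    msg[msg_len] = '\0';
    /* Shift the leftover bytes (plus the terminator) to the front. */
    memmove(b->data, b->data + msg_len, b->len - msg_len + 1);
    b->len -= msg_len;
    return msg;
}
```

You would call xml_buf_append for every payload you capture and then call xml_buf_take in a loop until it returns NULL, since one packet can complete more than one message.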

rob_g
+1  A: 

As Anon says, use a higher-level stream library.

But even then you need to know the chunk size before you start handling it, as you will read from the stream in blocks of n bytes.

Thus you want to first send, in binary, the number of bytes that will follow, then send that many bytes, and repeat. That way, when you are receiving the chunks via select/read, you know when you have all of chunk one and can pass it to the processor.
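
A rough sketch of the receiving side of such a scheme. The wire format here is my assumption, not something from the question: a 4-byte big-endian length followed by that many payload bytes. frame_peek is a made-up helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed wire format: 4-byte big-endian length, then the payload. */
#define FRAME_HDR 4

/* Look at the bytes received so far. If the first frame is complete, set
 * *payload and *payload_len and return the total number of bytes it
 * occupies (header + payload). Return 0 if more data is needed. */
static size_t frame_peek(const unsigned char *buf, size_t have,
                         const unsigned char **payload, uint32_t *payload_len)
{
    assert(buf != NULL && payload != NULL && payload_len != NULL);
    if (have < FRAME_HDR)
        return 0;                           /* length prefix incomplete */
    uint32_t n = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
                 ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
    if (have - FRAME_HDR < n)
        return 0;                           /* payload incomplete */
    *payload = buf + FRAME_HDR;
    *payload_len = n;
    return FRAME_HDR + (size_t)n;
}
```

After each read you would call frame_peek on the accumulated bytes, hand any complete payload to the processor, and discard the returned number of bytes from the front of the buffer.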

Simeon Pilgrim
That'll only work if you're the terminating end, not if you're sniffing existing connections.
leeeroy
Agreed, I missed the sniffing/snooping element, and just assumed he was doing sockets the hard way.
Simeon Pilgrim
+2  A: 

There's nothing in TCP that defines message boundaries; that's up to the higher layers to define if they need them. TCP is just a byte stream.

If this is raw XML over a TCP stream, you actually need to parse the XML: you'll know you have a whole XML document when you've received the end tag of the document element. If it's XML packaged over HTTP, you might be able to parse out the Content-Length: header, which should contain the length of the body.
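
For the HTTP case, a small sketch of pulling the Content-Length value out of a header block that you have already accumulated into a NUL-terminated string. http_content_length is a made-up name, and the sketch assumes the message actually carries a Content-Length header; it does not handle other ways of delimiting the body.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Return the Content-Length value from an HTTP header block, or -1 if the
 * header block is not complete yet or the header is missing or malformed. */
static long http_content_length(const char *hdr)
{
    static const char name[] = "content-length:";
    const size_t name_len = sizeof name - 1;

    assert(hdr != NULL);
    const char *end = strstr(hdr, "\r\n\r\n");
    if (end == NULL)
        return -1;                          /* headers still incomplete */

    /* Start at the end of the request/status line, then walk the
     * header lines one by one up to the blank line. */
    const char *line = strstr(hdr, "\r\n");
    while (line != NULL && line < end) {
        line += 2;                          /* step over the CRLF */
        size_t i = 0;
        while (i < name_len && tolower((unsigned char)line[i]) == name[i])
            i++;                            /* header names are case-insensitive */
        if (i == name_len) {
            const char *v = line + name_len;
            while (*v == ' ' || *v == '\t')
                v++;
            if (!isdigit((unsigned char)*v))
                return -1;
            return strtol(v, NULL, 10);
        }
        line = strstr(line, "\r\n");
    }
    return -1;
}
```

Once you have the value, the body is complete when that many bytes have arrived after the blank line that ends the headers.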

Note that reassembling a TCP stream from captured packets is a very hard problem. There are a lot of corner cases, e.g. you'd need to handle retransmissions, out-of-sequence TCP segments and many more. http://libnids.sourceforge.net/ might help you.

nos
A: 

The XMPP instant messaging protocol, used by Jabber, has a means of moving XML chunks over a TCP stream. I don't know exactly how it is done myself, but RFC 3920 is the protocol definition. You should be able to work it out from that.

Andrew McGregor