Hello,

I'm looking for concrete examples of how to handle an incoming stream of data on a TCP/IP socket: accumulating the data in a buffer of some sort, finding the messages in it (variable length, with header and delimiters), and extracting them to reconstruct the messages for the receiving application.

Any good pointers, links, or examples of an efficient way to do this would be appreciated. I couldn't find good examples online, and I'm sure others have already solved this problem efficiently. In particular:

  • Efficient memory allocation of aggregation buffer
  • Quickly finding the message boundaries of a message to extract it from the buffer

Thanks

David

+2  A: 

I've found that the simple method works pretty well.

  • Allocate one fixed-size buffer, double the size of your biggest message. Keep a pointer to the end of the data in the buffer.
  • Allocation happens once. The next part is the message loop:
  • If not using blocking sockets, then poll or select here.
  • Read data into the buffer at the end-data pointer. Only read what will fit into the buffer.
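  • Scan the new data for your delimiters with strchr (or memchr if the data is binary or the buffer is not NUL-terminated, since strchr needs a terminator and stops at the first zero byte). If you found a message: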
    • memcpy the message into its own buffer. (Note: I do this because I was using threading and you probably should too.)
    • memmove the remaining buffer data to the beginning of the buffer and update the end of data pointer.
    • Call the processing function for the message. (Send it to the thread pool.)
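
The steps above can be sketched roughly as follows. This is only an illustration, not the exact code I used: the names (`struct framer`, `framer_space`, `framer_commit`, `framer_next`), the 4096-byte maximum message size, and the single-byte `'\n'` delimiter are all assumptions. It uses memchr rather than strchr because the buffer is not NUL-terminated, and it tracks a `scanned` offset so only new data is searched.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_MSG  4096            /* assumed largest message, delimiter included */
#define BUF_SIZE (2 * MAX_MSG)   /* double the size of the biggest message */
#define DELIM    '\n'            /* assumed single-byte delimiter */

struct framer {
    char   buf[BUF_SIZE];
    size_t end;      /* valid bytes; buf + end is where the next read lands */
    size_t scanned;  /* bytes already searched without finding a delimiter */
};

/* Room left for the next read; never ask the socket for more than this.
 * If this reaches 0 with no message found, the peer sent an oversized message. */
static size_t framer_space(const struct framer *f)
{
    return BUF_SIZE - f->end;
}

/* Record that n new bytes were written at buf + end. */
static void framer_commit(struct framer *f, size_t n)
{
    assert(n <= framer_space(f));
    f->end += n;
}

/* Copy one complete message (delimiter stripped) into out.
 * Returns its length, -1 if no complete message is buffered yet,
 * or -2 if out is too small (the message stays in the buffer). */
static long framer_next(struct framer *f, char *out, size_t out_cap)
{
    char *d = memchr(f->buf + f->scanned, DELIM, f->end - f->scanned);
    if (d == NULL) {
        f->scanned = f->end;          /* do not rescan these bytes next time */
        return -1;
    }
    size_t len = (size_t)(d - f->buf);
    if (len > out_cap)
        return -2;
    memcpy(out, f->buf, len);         /* message gets its own buffer */
    size_t used = len + 1;            /* message plus delimiter */
    memmove(f->buf, f->buf + used, f->end - used);  /* slide the remainder down */
    f->end -= used;
    f->scanned = 0;
    return (long)len;
}
```

With a real socket, the read step would look something like `n = read(fd, f.buf + f.end, framer_space(&f))`, followed by `framer_commit(&f, n)` when `n > 0`, and then calling `framer_next` in a loop until it returns -1, since one read can deliver several messages.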

There are more complicated methods. I haven't found them worth the bother in the end but you might depending on circumstances.

You could use a circular buffer with beginning and end of data pointers. Lots of hassle keeping track and computing remaining space, etc.
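
To give an idea of the bookkeeping involved, here is a minimal circular buffer sketch. The names (`struct ring`, `ring_free`, `ring_contig_space`, `ring_put`, `ring_get`) and the tiny 8-byte capacity are made up for illustration. Note that a single read can only fill the contiguous run up to the wrap point, and a message can straddle the wrap, so a plain memchr over the data no longer works either.

```c
#include <assert.h>
#include <stddef.h>

#define RING_CAP 8u   /* tiny capacity, for illustration only */

struct ring {
    unsigned char data[RING_CAP];
    size_t head;    /* index of the oldest stored byte */
    size_t count;   /* number of bytes currently stored */
};

/* Total free bytes, possibly split across the wrap point. */
static size_t ring_free(const struct ring *r)
{
    assert(r->count <= RING_CAP);
    return RING_CAP - r->count;
}

/* Largest contiguous run a single read could fill at the write position. */
static size_t ring_contig_space(const struct ring *r)
{
    size_t tail = (r->head + r->count) % RING_CAP;
    size_t to_wrap = RING_CAP - tail;
    size_t free_bytes = ring_free(r);
    return to_wrap < free_bytes ? to_wrap : free_bytes;
}

/* Append up to n bytes; returns how many were actually stored. */
static size_t ring_put(struct ring *r, const unsigned char *src, size_t n)
{
    size_t i;
    if (n > ring_free(r))
        n = ring_free(r);
    for (i = 0; i < n; i++)
        r->data[(r->head + r->count + i) % RING_CAP] = src[i];
    r->count += n;
    return n;
}

/* Remove up to n of the oldest bytes into dst; returns how many were removed. */
static size_t ring_get(struct ring *r, unsigned char *dst, size_t n)
{
    size_t i;
    if (n > r->count)
        n = r->count;
    for (i = 0; i < n; i++)
        dst[i] = r->data[(r->head + i) % RING_CAP];
    r->head = (r->head + n) % RING_CAP;
    r->count -= n;
    return n;
}
```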

You could allocate a new buffer after finding each message. You wouldn't have to copy so much data around. You do still have to move the excess data into a new message buffer after finding the delimiter.
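
A rough sketch of that variant, under the same assumption of a single-byte `'\n'` delimiter: the filled buffer itself is handed off as the message, and only the leftover bytes are copied into a freshly allocated buffer. The names (`struct rxbuf`, `rxbuf_init`, `rxbuf_take`) and the 8192-byte size are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BUF_SIZE 8192   /* assumed buffer size */
#define DELIM    '\n'   /* assumed single-byte delimiter */

struct rxbuf {
    char  *data;   /* heap buffer of BUF_SIZE bytes */
    size_t end;    /* number of valid bytes in data */
};

/* Allocate the first receive buffer; returns 0 on success, -1 on failure. */
static int rxbuf_init(struct rxbuf *b)
{
    b->data = malloc(BUF_SIZE);
    b->end = 0;
    return b->data ? 0 : -1;
}

/* If a complete message is buffered, return the current buffer as the
 * message (the caller owns and frees it), store the message length without
 * the delimiter in *len, and install a fresh buffer holding the leftover.
 * Returns NULL if no message is complete or the allocation fails. */
static char *rxbuf_take(struct rxbuf *b, size_t *len)
{
    assert(b->end <= BUF_SIZE);
    char *d = memchr(b->data, DELIM, b->end);
    if (d == NULL)
        return NULL;
    size_t used = (size_t)(d - b->data) + 1;   /* message plus delimiter */
    size_t left = b->end - used;
    char *fresh = malloc(BUF_SIZE);
    if (fresh == NULL)
        return NULL;
    memcpy(fresh, b->data + used, left);       /* only the excess is copied */
    char *msg = b->data;
    *len = used - 1;
    b->data = fresh;
    b->end = left;
    return msg;
}
```

The trade-off is one malloc per message in exchange for a smaller copy, so whether it wins depends on message sizes and the allocator.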

Do not think that dumb tricks like reading one byte at a time out of the socket will improve performance. Every system call round-trip makes an 8 kB memmove look cheap.

Zan Lynx
Thanks. Looks like a good solution to explore.
David