What's a good end-of-message marker for a socket message schema, so that messages can be separated as they are received?

I had been using <EOF>, but that seems a byte or two too long and could possibly appear inside a message, especially if XML data is being sent.

Thanks!

+3  A: 

One method is to approach this similarly to AMF3: before each message, send a 4-byte length giving the number of bytes of data that follow as the message. This way, even a 0-byte "empty message" can be sent, and no escape mechanism is needed.
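A minimal Python sketch of this length-prefix idea follows. The function names (`send_message`, `recv_exact`, `recv_message`) are made up for illustration, and the big-endian byte order for the 4-byte length is an assumption; the answer does not specify one, so both ends simply have to agree.

```python
import socket
import struct

# Assumption: 4-byte unsigned length in network (big-endian) byte order.
HEADER = struct.Struct("!I")

def send_message(sock: socket.socket, payload: bytes) -> None:
    """Send the payload preceded by its length."""
    sock.sendall(HEADER.pack(len(payload)) + payload)

def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes; recv() may return fewer than requested."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed in the middle of a message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

def recv_message(sock: socket.socket) -> bytes:
    """Read one length-prefixed message."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)
```

Because the receiver always knows how many bytes to expect, the payload can contain anything, including the text <EOF> or raw binary data, and an empty message is just a header with length 0.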

Heath Hunnicutt
I would recommend 8 bytes. Bytes are cheap, and once you limit yourself to 4 bytes, you can't back out without changing the protocol.
SLaks
4 bytes already allow messages of up to 4 GB. I don't think the overhead of having to break up your multi-terabyte messages every 4 GB is going to add up to much :)
bdonlan
Excellent idea! Thanks! :D What about speed considerations for a lot of short messages? Does it not make that big of a difference?
bobber205
If you are certain that all of your messages are short, then you might be able to reduce the number of bytes used for the length. The speed of the result will depend on what limits the speed: is it communications speed (e.g. a modem), or CPU speed while handling the data? If it is modem speed, use fewer bytes; if it is CPU speed, use either 1 byte or the natural size of the CPU register, which is 4 bytes for 32-bit and 8 bytes for 64-bit. Usually. Results vary based on details.
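To put numbers on that trade-off, here is a small Python sketch that reports how many bytes each prefix width costs per message and the largest message length it can describe. The helper name `prefix_limits` is hypothetical; the format codes are the standard unsigned integer codes of Python's `struct` module.

```python
import struct

def prefix_limits(fmt: str) -> tuple:
    """Return (header size in bytes, largest length it can encode)."""
    size = struct.calcsize(fmt)
    return size, 2 ** (8 * size) - 1

# "!" selects network byte order with standard sizes: B=1, H=2, I=4, Q=8 bytes.
for fmt in ("!B", "!H", "!I", "!Q"):
    size, max_len = prefix_limits(fmt)
    print(f"{size}-byte prefix: messages of up to {max_len} bytes")
```

For a stream of 10-byte messages, a 1-byte prefix adds 10% to the bytes on the wire, a 4-byte prefix adds 40%, and an 8-byte prefix adds 80%, which only matters when the link rather than the CPU is the bottleneck.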
Heath Hunnicutt
+1  A: 

If you're restricting the message data to printable characters, there are several control characters to choose from (ETX, EOT, Ctrl-Z, FS, EM, etc.) that historically have been used to signal end of message.
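As a rough Python sketch of the delimiter approach, the following uses EOT (byte 0x04) as the end-of-message marker. The names `frame` and `DelimitedReader` are hypothetical, and picking EOT over the other control characters is an arbitrary choice for the example. It only works under the restriction stated above: the payload must never contain the delimiter byte.

```python
EOT = b"\x04"  # ASCII End of Transmission, used here as the message terminator

def frame(payload: bytes, delimiter: bytes = EOT) -> bytes:
    """Append the delimiter; refuse payloads that would break the framing."""
    if delimiter in payload:
        raise ValueError("payload contains the delimiter byte")
    return payload + delimiter

class DelimitedReader:
    """Accumulates received bytes and splits out complete messages."""

    def __init__(self, delimiter: bytes = EOT):
        self.delimiter = delimiter
        self.buffer = bytearray()

    def feed(self, data: bytes) -> list:
        """Add newly received bytes; return every message completed so far."""
        self.buffer.extend(data)
        messages = []
        while True:
            index = self.buffer.find(self.delimiter)
            if index < 0:
                break  # incomplete message stays buffered for the next feed()
            messages.append(bytes(self.buffer[:index]))
            del self.buffer[:index + len(self.delimiter)]
        return messages
```

Each chunk returned by `recv()` would be passed to `feed()`; a message split across several chunks is returned only once its terminator arrives.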

Loadmaster
I am not restricting it to printable characters, but I will keep these in mind. :)
bobber205