I've written a server in Python that is meant to send data to the client in the form "Header:Message".

I would like each message to be sent individually, so that the client needs to do minimal work to read the "header" and the "message".

Unfortunately, I can't figure out how to properly flush a Python socket, so when multiple sends execute in quick succession, the messages get lumped together in the socket buffer and sent as one big chunk.

Example:

Server sends...

sock.send(b"header1:message1")
sock.send(b"header2:message2")
sock.send(b"header3:message3")

Client receives... "header1:message1header2:message2header3:message3"

I'd like to receive three individual messages:

header1:message1
header2:message2
header3:message3

I need a way to flush after each send.

+10  A: 

I guess you are talking over a TCP connection.

Your approach is flawed. A TCP stream is defined as a stream of bytes. You always have to use some sort of separator yourself, and you cannot rely on the network stack to keep your messages separate.
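
As a rough illustration of the separator idea, here is a minimal sketch that ends every "header:message" record with a newline. The helper names send_message and read_messages are made up for this example, socket.socketpair() stands in for a real server/client connection, and it assumes that neither the header nor the message ever contains '\n'.

```python
import socket

def send_message(sock, header, message):
    """Send one 'header:message' record terminated by a newline."""
    # Assumption: neither header nor message contains '\n'.
    sock.sendall(f"{header}:{message}\n".encode("utf-8"))

def read_messages(sock):
    """Yield (header, message) pairs until the peer closes the connection."""
    # makefile() wraps the socket in a buffered reader that splits on '\n',
    # no matter how the bytes were grouped on the wire.
    with sock.makefile("r", encoding="utf-8", newline="\n") as reader:
        for line in reader:
            header, _, message = line.rstrip("\n").partition(":")
            yield header, message

# Local demonstration with a connected pair of sockets.
server, client = socket.socketpair()
for i in (1, 2, 3):
    send_message(server, f"header{i}", f"message{i}")
server.close()  # end of stream for the reader

received = list(read_messages(client))
client.close()
print(received)
```

The reader recovers three separate records even if all the bytes arrive in a single chunk, because the boundaries live in the data rather than in the timing of the sends.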

If you really need a datagram-based service, switch to UDP. You'll need to handle retransmission of lost datagrams yourself in that case.
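
For comparison, a small sketch of the UDP variant on the loopback interface. The address, timeout and buffer size are arbitrary choices for the demo. Each recvfrom() returns at most one datagram, so messages are never merged, but nothing guarantees that all of them arrive, or that they arrive in order.

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
receiver.settimeout(2.0)          # do not wait forever for a lost datagram
address = receiver.getsockname()

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
for i in (1, 2, 3):
    sender.sendto(f"header{i}:message{i}".encode("utf-8"), address)

datagrams = []
try:
    for _ in range(3):
        data, _addr = receiver.recvfrom(4096)  # one datagram per call
        datagrams.append(data.decode("utf-8"))
except socket.timeout:
    pass  # a datagram went missing; UDP will not resend it for us
finally:
    sender.close()
    receiver.close()

print(datagrams)
```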

To clarify:

Flushing the send buffer usually creates new packets, just as you expect. If your client reads these packets fast enough, you may get one message per read.

Now imagine you communicate over a satellite link. Because of the link's high bandwidth and high latency, the last router before the satellite waits a short time until enough data is in its buffer and then sends all your packets at once. Your client now receives all the packets without any delay between them and puts all the data in the receive buffer at once. So your separation is gone again.
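
You can watch the boundaries disappear locally. This is only a sketch: socket.socketpair() is used as a stand-in for a real connection, and how many chunks you actually get depends on the platform and on timing. The only thing the stream guarantees is that the bytes come out in the order they went in.

```python
import socket

server, client = socket.socketpair()  # a connected pair of stream sockets

sent = [b"header1:message1", b"header2:message2", b"header3:message3"]
for record in sent:
    server.sendall(record)  # three separate sends
server.close()

# Read until the peer has closed; each recv() returns whatever is available.
chunks = []
while True:
    chunk = client.recv(4096)
    if not chunk:
        break
    chunks.append(chunk)
client.close()

# The number of chunks need not match the number of sends.
print(len(chunks), chunks)
```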

ebo
+1: I would add for edumacation's sake that, even if @Jah were to flush the sockets, his client would still receive the messages exactly the same way--flushing is about clearing buffers in the name of latency reduction, and *not* demarcating messages.
Stu Thompson
Added a scenario to show how this may go wrong. Thanks, @Stu.
ebo
+1: TCP is a stream. Timing of TCP data is specifically eliminated by the internet routing and TCP protocol. The data must be buffered; "flushing" doesn't actually mean much. All it means is that your data is out of your app's buffers and into the TCP protocol buffer.
S.Lott
A: 

What you are trying to do is split your data into "batches".

For example, you are operating on "batches" whenever you read "lines" off a file. What defines a "line"? It's a sequence of bytes terminated by '\n'. Another example: you read 64 KiB "chunks" off a file. What defines a "chunk"? You do, since you read 65536 bytes every time. You want a variable-length "chunk"? You just prefix your "chunk" with its size, then read the "chunk". "IFF"-style files (such as "aiff", and the RIFF variant used by the .wav and .avi files of MS Windows) and "mov" files are organized like that.

These three methods are the most fundamental methods to organize a stream of bytes, whatever the medium:

  1. record separators
  2. fixed size records
  3. records prefixed with their size.
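
To make the three methods concrete, here is a minimal sketch that applies each of them to the same three messages, working on plain bytes rather than a live socket. The function names, the '\n' separator, the 16-byte record size and the 4-byte big-endian length prefix are all choices made for this example.

```python
import struct

RECORD_SIZE = 16  # example width; every sample message happens to be 16 bytes

def split_by_separator(stream, sep=b"\n"):
    """1. Record separators: cut the stream at every separator."""
    # The last piece is the unterminated tail (empty if the stream ends with sep).
    return stream.split(sep)[:-1]

def split_fixed(stream, size=RECORD_SIZE):
    """2. Fixed-size records: cut the stream every `size` bytes."""
    return [stream[i:i + size] for i in range(0, len(stream) - size + 1, size)]

def split_length_prefixed(stream):
    """3. Size-prefixed records: read a 4-byte length, then that many bytes."""
    records, offset = [], 0
    while offset + 4 <= len(stream):
        (length,) = struct.unpack_from("!I", stream, offset)
        if offset + 4 + length > len(stream):
            break  # incomplete record, wait for more data
        records.append(stream[offset + 4:offset + 4 + length])
        offset += 4 + length
    return records

messages = [b"header1:message1", b"header2:message2", b"header3:message3"]

by_separator = split_by_separator(b"".join(m + b"\n" for m in messages))
by_fixed = split_fixed(b"".join(m.ljust(RECORD_SIZE) for m in messages))
by_prefix = split_length_prefixed(
    b"".join(struct.pack("!I", len(m)) + m for m in messages)
)
print(by_separator, by_fixed, by_prefix, sep="\n")
```

Separators are the easiest to debug by eye but need escaping if the payload can contain the separator; size prefixes handle arbitrary binary payloads at the cost of a small header.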

They can be mixed and/or modified. For example, you could have "variable record separators", like an XML reader: read bytes from first '<' until first '>', add a slash after first '<' and call it end-of-record, read stream until end-of-record. That's just a crude description.

Choose a method, and implement it in both the writer and reader. If you also document your choices, you've just defined your first protocol.
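
As one possible choice, here is a sketch of a writer and a reader that agree on size-prefixed records over a stream socket. The names send_record, recv_exact and recv_record are hypothetical, the 4-byte big-endian prefix is an assumption of this example, and socket.socketpair() again stands in for a real connection.

```python
import socket
import struct

def send_record(sock, payload):
    """Writer side: a 4-byte big-endian length, then the payload bytes."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_exact(sock, count):
    """Keep calling recv() until exactly `count` bytes have arrived."""
    buffer = bytearray()
    while len(buffer) < count:
        chunk = sock.recv(count - len(buffer))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-record")
        buffer.extend(chunk)
    return bytes(buffer)

def recv_record(sock):
    """Reader side: read the length prefix, then read that many bytes."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)

writer, reader = socket.socketpair()
for i in (1, 2, 3):
    send_record(writer, f"header{i}:message{i}".encode("utf-8"))

records = [recv_record(reader) for _ in range(3)]
writer.close()
reader.close()
print(records)
```

The loop in recv_exact matters: a single recv() may return fewer bytes than requested, so the reader has to keep asking until the record is complete.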

ΤΖΩΤΖΙΟΥ