I have a rare bug that seems to occur reading a socket.

It seems that while reading data I sometimes get only 1-3 bytes of a data package that is bigger than this.

As I learned from pipe-programming, there I always get at least 512 bytes as long as the sender provides enough data.

Also, my sender transmits at least 4 bytes whenever it transmits anything -- so I was thinking that at least 4 bytes would be received at once at the beginning (!!) of the transmission.

In 99.9% of all cases my assumption seems to hold ... but there are really rare cases when fewer than 4 bytes are received. That seems ridiculous to me; why would the networking system do this?

Does anybody know more?

Here is the reading-code I use:

mySock, addr = masterSock.accept()
mySock.settimeout(10.0)
result = mySock.recv(BUFSIZE)
# 4 bytes are needed here ...
...
# read remainder of datagram
...

The sender sends the complete datagram with one call of send.

Edit: the whole thing is working on localhost -- so no complicated network applications (routers etc.) are involved. BUFSIZE is at least 512 and the sender sends at least 4 bytes.

+7  A: 

As far as I know, this behaviour is perfectly reasonable. Sockets may, and probably will fragment your data as they transmit it. You should be prepared to handle such cases by applying appropriate buffering techniques.
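One possible shape for such buffering, as a minimal sketch in Python rather than a definitive implementation: the helper name recv_exactly is made up for illustration, and it assumes a connected stream (TCP) socket. It simply keeps calling recv() until the requested number of bytes has arrived.

```python
import socket


def recv_exactly(sock, count):
    """Keep calling recv() until exactly `count` bytes have arrived.

    `recv_exactly` is a hypothetical helper name, not part of the
    socket module. Raises ConnectionError if the peer closes early.
    """
    chunks = []
    remaining = count
    while remaining > 0:
        # recv() may return anything from 1 to `remaining` bytes.
        chunk = sock.recv(remaining)
        if not chunk:
            # An empty result means the peer closed the connection.
            raise ConnectionError("peer closed with %d bytes missing" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

With the code from the question, the first read would then look like header = recv_exactly(mySock, 4) instead of a single mySock.recv(BUFSIZE).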

On the other hand, if you are transmitting the data on localhost and you are indeed getting fewer than 4 bytes, it probably means you have a bug somewhere else in your code.

EDIT: An idea - try to fire up a packet sniffer and see whether the transmitted packet is complete or not; this might give you some insight into whether your bug is in your client or in your server.

Indeed, my code actually runs on localhost -- something that makes the whole thing even weirder!
Juergen
Thanks for the packet-sniffer idea. But I fear it does not help me much, since the trouble only happens in rare cases and I see it only after it has happened ... the packet sniffer is most likely too late ...
Juergen
Whether or not the data is fragmented really depends on a lot of things. TCP tries to buffer a full segment / MTU before it sends in order to make efficient use of bandwidth. The most common MTU you'll run into is around 1500 bytes for Ethernet. If you're transmitting less than this then it's unlikely that fragmentation is your problem. Your bug is probably elsewhere.
Robert S. Barnes
Just run the packet sniffer all the time. Store the bytes to disk and examine them later.
Nelson
TCP provides a stream of bytes; it is not message oriented and does not provide boundaries. You have to treat it as such. One write call could take several read calls to get that data. Data from several write calls could be read by one read call. And anything in between.
nos
@nos: Thanks for the answer; that is the first definitive one. That is what I was waiting for -- somebody who can explain a little more about the background. I would now rule out a problem on the sender side and drop my assumption.
Juergen
+2  A: 

From the Linux man page of recv http://linux.about.com/library/cmd/blcmdl2_recv.htm:

The receive calls normally return any data available, up to the requested amount, rather than waiting for receipt of the full amount requested.

So, if your sender is still transmitting bytes, the call will only give what has been transmitted so far.
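To illustrate the quoted sentence, here is a small Python sketch, assuming a platform where socket.socketpair() is available (it stands in for a real client and server). Each recv() returns whatever is queued at that moment, up to the requested size, and never waits for the buffer to fill.

```python
import socket

# socketpair() returns two already-connected stream sockets in one
# process; it is used here only as a stand-in for a real connection.
left, right = socket.socketpair()
try:
    left.sendall(b"ab")        # only 2 bytes are available so far
    first = right.recv(512)    # asks for up to 512, gets what is there
    left.sendall(b"cdefgh")    # 6 more bytes arrive later
    second = right.recv(3)     # a small request returns at most 3 bytes
    third = right.recv(512)    # the remainder is still queued
finally:
    left.close()
    right.close()

print(len(first), len(second), len(third))
```

None of the three calls returns 512 bytes, and none of them is an error: the sizes only reflect what had arrived when recv() was called.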

buster
But it is just unreasonable to transmit 1-3 bytes, when the sender transmitted maybe 10 bytes in one *single* fresh call. Why should any library do this??
Juergen
It's a syscall that has to consider every case that may happen. You are doing it wrong when you rely on it to get all the data that you send in one call. What should recv() do when you send 1000 bytes and the buffer only has 500 bytes? What to do when things get retransmitted due to errors, fragmented, etc.? It's common to read into a buffer until you receive a terminating string (\0 is common), so that you know that one message was received.
buster
I don't say that you might not be right. What I say is that no reasonable explanation is available to me. Your new examples in the comment lead in the wrong direction, since my question is about the first 4 bytes in a new transmission block.
Juergen
Again: "The receive calls normally return any data available[...]." You are doing it wrong, and you assume things of a function that the function _clearly_ doesn't provide. It says so explicitly in the manpage. You could as well question why your car doesn't fly. Now, I could answer that question technically, but I'm no kernel dev, so I couldn't tell you what the technical reason for your question is. Nevertheless, you expect behavior of recv() that the manpage explicitly excludes.
buster
A: 

If the sender sends 515 bytes, and your BUFSIZE is 512, then the first recv will return 512 bytes, and the next will return 3 bytes... Could this be what's happening?

(This is just one case amongst many which will result in a 3-byte recv from a larger send...)

Stobor
I wanted to make clear, that my problem happens right at the beginning of a freshly established connection.
Juergen
+1 to compensate the unexplained, unjustified downvote, as the answer is partial but correct.
Alex Martelli
@Alex: Again -- I explained my problem in great detail and the posters just ignored it. Do you feel you are the sheriff here? I made my statements clear, and when I downvote, it is not because the answer is partial but because it is (absolutely) not helpful and/or ignores the question. I could also give a lecture about music here.
Juergen
I'm surprised anyone's helping you at all with your attitude.
Glenn Maynard
I'm surprised about the attitude in this forum. When somebody says that an answer was not helpful (that is how it reads in my browser when I hover over the downvote button), everybody is hurt and dislikes him -- because he seems not to be "polite" enough. Also, I have received at least two or three downvotes on my question since -- one, I would guess, from Alex (since the times match -- before my replies to him). And I should be quiet?? I am surprised that I am still in this curious forum.
Juergen
@Juergen: Don't worry, it balances out in the long run. I didn't think your question was inappropriate, and I had seen the downvote but didn't contest it, because you correctly pointed out that I hadn't answered your question. I had taken your comment on board, and was looking for anything which might be helpful to revise my answer. There are a few people around who don't like seeing "helpful answers which don't apply in this case" voted down, because it might discourage new people from trying to help further if they get voted down early. Not everyone shares this view, so don't be discouraged.
Stobor
For the record, the consensus around here seems to be to leave a comment explaining why the answer wasn't helpful, and downvote the answer toward 0, but not lower... Negative scores seem to mainly be given for answers which contain incorrect information, are antagonistic or argumentative, or seem to be spammy and unhelpful. (The idea being to give the impression of "thanks for trying to be helpful, even if it didn't work this time", vs "you're not playing nice, go away".)
Stobor
@Stobor: Thanks for your answer. It could be that I downvoted below 0 -- I apologize for that -- because I did not know the consensus. I was playing "by the letter", and in my case that read "the answer is not helpful", which it was not for me.
Juergen
@Juergen: Sure, I understand, and I didn't take any offense. There are no hard-and-fast rules here, everyone's just doing what they think is the best thing at the time. I don't necessarily vote according to those guidelines, either - they're just the pattern I've noticed.
Stobor
+7  A: 

I assume you're using TCP. TCP is a stream based protocol with no idea of packets or message boundaries.

This means that when you do a read you may get fewer bytes than you requested. If your data is 128k, for example, you may only get 24k on your first read, requiring you to read again to get the rest of the data.

For an example in C:

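/* Read exactly `size` bytes from `sock` into `buf`. doerror() stands for
   whatever error handling the caller uses; it is not defined here. */
int read_data(int sock, int size, unsigned char *buf) {
   int bytes_read = 0, len = 0;
   /* recv() may return fewer bytes than requested, so keep reading. */
   while (bytes_read < size &&
         ((len = recv(sock, buf + bytes_read, size - bytes_read, 0)) > 0)) {
       bytes_read += len;
   }
   /* Stopped early: len == 0 means the peer closed, len < 0 means error. */
   if (bytes_read < size) doerror();
   return bytes_read;
}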
Robert S. Barnes
Excuse me? Why was this marked down?
Robert S. Barnes
Because it seems to me that you only read the headline. I described a very special case and you gave a generic answer. To put it in fewer words: It was not helpful for me!
Juergen
+1 to compensate the unexplained, unjustified downvote, as the answer is correct (may or may not make the OP happy, but there's nothing wrong with it per se).
Alex Martelli
@Alex: How would you feel if you asked me about lung cancer and I gave you a lecture about vitamins?
Juergen
As others have noted, if you're not calling recv in a loop your code is wrong. Writing correct networking code on the recv side *will* be helpful to the extent that it helps you narrow down / isolate where the problem might be.
Robert S. Barnes
+1, too. How would you feel if you gave a correct solution and were ignored because the one who asked the question doesn't get the problem? The while loop is a perfectly good example of receiving bytes from the network. You just don't want to rethink your own solution.
buster
I'm still going to try and help you out by explaining this a bit more. You have a situation where you claim the sending side sends >=4 bytes, yet you only recv < 4 bytes. At the moment you don't know whether the problem is on the recv side, the send side or somewhere in between. By calling recv in a loop you can definitively determine if the problem is on the recving side. For example, if you call recv in a loop, get 3 bytes, and then the next iteration of the loop hangs then you can determine that the 4th byte is not being recv'ed *likely* because there is a bug on the send side.
Robert S. Barnes
@juergen The specifics are not that interesting, since there is no way to control how the OS handles your data in any meaningful way. And there are way, way too many external variables influencing how the OS handles that data. You have to follow the general rules to make network code work. It is far, far more involved than the typical TCP example code you find by googling.
nos
@nos: As I said before -- the best answer yet.
Juergen
+3  A: 

This is just the way TCP works. You aren't guaranteed to get all of your data at once. There are just too many timing issues between sender and receiver, including the sender's operating system, NIC, routers, switches, the wires themselves, the receiver's NIC, OS, etc. There are buffers in the hardware and in the OS.

You can't assume that the TCP network is the same as an OS pipe. With the pipe, it's all software, so there's no cost in delivering the whole message at once for most messages. With the network, you have to assume there will be timing issues, even in a simple network.

That's why recv() can't give you all the data at once: it may just not be available yet, even if everything is working right. Normally, you call recv() and capture its return value, which tells you how many bytes you've received. If it's fewer than you expect, you need to keep calling recv() (as has been suggested) until you get the correct number of bytes. Be aware that in most cases recv() returns -1 on error, so check for that and check your documentation for errno values. EAGAIN in particular seems to cause people problems. You can read about it on the internet for details, but if I recall, it means that no data is available at the moment and you should try again.

Also, it sounds like from your post that you're sure the sender is sending the data you need sent, but just to be complete, check this: http://beej.us/guide/bgnet/output/html/multipage/advanced.html#sendall
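For reference, the idea behind that sendall routine can be sketched in Python roughly as follows. The name send_all is hypothetical and only for illustration; Python's socket objects already provide a built-in sendall() method that does this for you, so the loop is shown only to make the partial-send handling visible.

```python
import socket


def send_all(sock, data):
    """Loop over send() until every byte has been handed to the OS.

    `send_all` is an illustrative name; in real code the built-in
    sock.sendall(data) is the usual choice.
    """
    view = memoryview(data)
    total = 0
    while total < len(view):
        # send() returns how many bytes were accepted, which may be
        # fewer than the number offered.
        sent = sock.send(view[total:])
        total += sent
    return total
```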

You should be doing something similar on the recv() end to handle partial receives. If you have a fixed packet size, you should read until you get the amount of data you expect. If you have a variable packet size, you should read until you have the header that tells you how much data was sent, then read that much more data.
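The variable-size case could be sketched in Python like this. Everything here is an assumption for illustration: the 4-byte big-endian length header, and the names HEADER, pack_message and MessageBuffer are invented, not taken from the question. The buffer accepts whatever recv() happens to deliver and only hands back messages once they are complete.

```python
import struct

# Assumed wire format: a 4-byte big-endian length, then the payload.
HEADER = struct.Struct("!I")


def pack_message(payload):
    """Prefix `payload` (bytes) with its length for sending."""
    return HEADER.pack(len(payload)) + payload


class MessageBuffer:
    """Collects received chunks and splits them into whole messages."""

    def __init__(self):
        self._data = bytearray()

    def feed(self, chunk):
        """Add received bytes; return the messages that are now complete."""
        self._data += chunk
        messages = []
        while len(self._data) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._data)
            end = HEADER.size + length
            if len(self._data) < end:
                break  # payload not fully received yet
            messages.append(bytes(self._data[HEADER.size:end]))
            del self._data[:end]
        return messages
```

In a receive loop you would call feed(sock.recv(BUFSIZE)) repeatedly and process whatever list comes back; a chunk of 1-3 bytes then just yields an empty list instead of breaking the parser.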

Sam Hoice
+4  A: 

The simple answer to your question, "Read from socket: Is it guaranteed to at least get x bytes?", is no. Look at the doc strings for these socket methods:

>>> import socket
>>> s = socket.socket()
>>> print s.recv.__doc__
recv(buffersize[, flags]) -> data

Receive up to buffersize bytes from the socket.  For the optional flags
argument, see the Unix manual.  When no data is available, block until
at least one byte is available or until the remote end is closed.  When
the remote end is closed and all data is read, return the empty string.
>>> 
>>> print s.settimeout.__doc__
settimeout(timeout)

Set a timeout on socket operations.  'timeout' can be a float,
giving in seconds, or None.  Setting a timeout of None disables
the timeout feature and is equivalent to setblocking(1).
Setting a timeout of zero is the same as setblocking(0).
>>> 
>>> print s.setblocking.__doc__
setblocking(flag)

Set the socket to blocking (flag is true) or non-blocking (false).
setblocking(True) is equivalent to settimeout(None);
setblocking(False) is equivalent to settimeout(0.0).

From this it is clear that recv() is not required to return as many bytes as you asked for. Also, because you are calling settimeout(10.0), it is possible that some, but not all, data is received near the expiration time for the recv(). In that case recv() will return what it has read - which will be less than you asked for (but consistently < 4 bytes does seem unlikely).

You mention datagram in your question which implies that you are using (connectionless) UDP sockets (not TCP). The distinction is described here. The posted code does not show socket creation so we can only guess here, however, this detail can be important. It may help if you could post a more complete sample of your code.

If the problem is reproducible you could disable the timeout (which incidentally you do not seem to be handling) and see if that fixes the problem.
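As for handling the timeout, a minimal Python sketch might look like this. The helper name recv_or_none is made up for illustration; socket.timeout is the exception the socket module raises when a timed operation expires (on recent Python versions it is an alias of TimeoutError).

```python
import socket


def recv_or_none(sock, bufsize, seconds):
    """Return received bytes, or None if nothing arrived in time.

    `recv_or_none` is a hypothetical helper name. An empty bytes
    result still means the peer closed the connection.
    """
    sock.settimeout(seconds)
    try:
        return sock.recv(bufsize)
    except socket.timeout:
        # No data at all arrived before the timeout expired.
        return None
```

Returning None keeps "timed out" distinct from both "got data" and "peer closed", so the caller can react differently to each.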

mhawke
The timeout is not the problem here -- I also checked that. In case of a timeout an exception would fire -- and hence a different reaction would occur in my program. I have now changed my program so that a loop is also involved at the beginning. But I am still curious which of the answers was most helpful. Yours and the comments of nos seem the best to me so far.
Juergen
@Juergen: a timeout exception will only be raised if **no** data was received. The condition that I describe is when some data, less than that requested by `recv()`, is received just before the timeout. `recv()` will return when the timeout expires and return less data than that requested. As I said, it's very unlikely that this would be consistently < 4 bytes, but I suppose it's possible. Reading in a loop is the correct way to handle your problem.
mhawke
+1  A: 

If you are still interested,

patterns like this

# 4 bytes are needed here ...
...
# read remainder of datagram
...

may create the silly window thing.

check this out http://tangentsoft.net/wskfaq/intermediate.html#silly-window

vmf229