I have a problem with missing messages when using non-blocking reads over UDP between two hosts. The sender is on Linux and the reader is on Windows XP. This example in Python shows the problem.
Here are the three scripts used to show the problem.
send.py:

import socket, sys
s = socket.socket(socket.AF_INET,socket.SOCK_DGRAM)
host = sys.argv[1]
s.sendto('A'*10,   (host,8888))
s.sendto('B'*9000, (host,8888))
s.sendto('C'*9000, (host,8888))
s.sendto('D'*10,   (host,8888))
s.sendto('E'*9000, (host,8888))
s.sendto('F'*9000, (host,8888))
s.sendto('G'*10,   (host,8888))

read.py

import socket
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.bind(('',8888))
while True:
    data,address = s.recvfrom(10000)
    print "recv:", data[0],"times",len(data) 

read_nb.py

import socket
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.bind(('',8888))
s.setblocking(0)
data =''
address = ''
while True:
    try:
        data,address = s.recvfrom(10000)
    except socket.error:
        pass
    else: 
        print "recv:", data[0],"times",len(data) 

Example 1 (works ok):

ubuntu> python send.py
winxp > read.py

gives this correct result from read.py:

recv: A times 10
recv: B times 9000
recv: C times 9000
recv: D times 10
recv: E times 9000
recv: F times 9000
recv: G times 10

Example 2 (missing messages):
In this case the short messages are often not caught by read_nb.py. Here are two examples of what the output can look like.

ubuntu> python send.py
winxp > read_nb.py

gives this result from read_nb.py:

recv: A times 10
recv: B times 9000
recv: C times 9000
recv: D times 10
recv: E times 9000
recv: F times 9000

Above, the last 10-byte message is missing.

Below, a 10-byte message in the middle is missing:

recv: A times 10
recv: B times 9000
recv: C times 9000
recv: E times 9000
recv: F times 9000
recv: G times 10

I have checked with Wireshark on Windows, and every time all messages are captured, so they reach the host interface but are not received by read_nb.py. What is the explanation?

I have also tried with read_nb.py on Linux and send.py on Windows, and then it works. So I figure that this problem has something to do with winsock2.

Or maybe I am using non-blocking UDP the wrong way?

+2  A: 

If the datagrams are getting to the host (as your Wireshark log shows), then the first place I'd look is the size of your socket recv buffer. Make it as big as you can, and run as fast as you can.

Of course this is completely expected with UDP. You should assume that datagrams can be thrown away at any point and for any reason. Also you may get datagrams more than once...

If you need reliability then you need to build your own, or use TCP.
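As a rough sketch of the buffer suggestion, the reader could request a larger receive buffer before binding and then read the value back, since the OS may clamp or adjust what it actually grants. This assumes Python 3; the helper name `make_reader` and the requested size are illustrative and not from the question.

```python
import socket

def make_reader(port, requested_bytes):
    """Create a UDP socket with a larger receive buffer (hypothetical helper)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Ask for a bigger buffer before binding, so it is in place
    # before the first datagram can arrive.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_bytes)
    s.bind(('', port))
    # Read the value back: the OS may grant a different size than requested.
    granted = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return s, granted
```

For the scripts in the question this would be called as `make_reader(8888, 65536)`, with the returned socket used in place of `s`; printing `granted` shows what the platform really allocated.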

Len Holgate
Here's +1 and justice for you :)
Nikolai N Fetissov
+2  A: 

Losing messages is normal with UDP - the transport layer does not guarantee order or delivery of datagrams. If you want them in order and/or always delivered, switch to TCP or implement sequencing and/or ack/timeout/retransmission yourself.
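To make the sequencing idea concrete, here is a minimal sketch (Python 3) that prefixes each payload with a sequence number so the receiver can at least detect which datagrams went missing. The 4-byte header layout and the names `pack_message`, `unpack_message` and `GapDetector` are assumptions made for illustration; acks and retransmission are left out.

```python
import struct

# Assumed header: one 4-byte unsigned sequence number in network byte order.
HEADER = struct.Struct('!I')

def pack_message(seq, payload):
    """Prefix the payload (bytes) with its sequence number."""
    return HEADER.pack(seq) + payload

def unpack_message(datagram):
    """Split a received datagram into (sequence number, payload)."""
    (seq,) = HEADER.unpack_from(datagram)
    return seq, datagram[HEADER.size:]

class GapDetector:
    """Tracks which sequence numbers have not been seen yet."""

    def __init__(self):
        self.expected = 0   # next sequence number we expect to see
        self.missing = []   # sequence numbers skipped so far

    def observe(self, seq):
        if seq >= self.expected:
            # Everything between the expected number and this one was skipped.
            self.missing.extend(range(self.expected, seq))
            self.expected = seq + 1
        elif seq in self.missing:
            # A late (reordered) datagram finally arrived.
            self.missing.remove(seq)
        # Otherwise it is a duplicate and is ignored.
```

The sender would call `pack_message(i, data)` before each `sendto`, and the receiver would feed the unpacked sequence number to `observe` after each `recvfrom`.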

To your example - the large messages are larger than the normal Ethernet MTU of 1500 bytes minus the 20 bytes of IP header and eight bytes of UDP header (unless you are using jumbo frames) and thus will be fragmented by the sender. This puts more load onto both sender and receiver, but more on the receiver since it needs to keep fragments in kernel memory until the full datagram arrives.
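The fragmentation arithmetic can be worked through with a small sketch. It assumes IPv4 with a 20-byte header (no options), an 8-byte UDP header and a 1500-byte MTU; the function name `ipv4_fragment_count` is made up for this example.

```python
def ipv4_fragment_count(udp_payload, mtu=1500, ip_header=20, udp_header=8):
    """Estimate how many IPv4 fragments one UDP datagram needs."""
    # Data carried per fragment; fragment offsets count in 8-byte units,
    # so the per-fragment data size is rounded down to a multiple of 8.
    per_fragment = (mtu - ip_header) // 8 * 8
    # The UDP header travels inside the IP payload (in the first fragment).
    total = udp_payload + udp_header
    # Ceiling division using integers only.
    return (total + per_fragment - 1) // per_fragment

print(ipv4_fragment_count(10))    # 1
print(ipv4_fragment_count(9000))  # 7
```

Under these assumptions each 9000-byte message from send.py travels as seven fragments, and losing any one of them loses the whole datagram, while the 10-byte messages fit in a single packet.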

I doubt you are overflowing the receive buffer with 36030 bytes, but then I never do networking on Windows, so you better check the value of SO_RECVBUF socket option on the receiver as @Len suggests.

Also check the output of netstat -s to see the dropped packet counts.

Nikolai N Fetissov
Windows: s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF) = 8192 bytes. Ubuntu: 112640 bytes.
lgwest
OK, here was the problem. Good to know.
Nikolai N Fetissov
Hmm, so I suggest the solution and Nikolai gets the accepted answer, there's no justice :(
Len Holgate
You are right about that, sorry. But this answer was more useful to me because it gave the option to use (almost). I wasn't aware of SO_RCVBUF before.
lgwest
Setting the buffer size bigger than 64k doesn't work on Windows.
lgwest
Oh yes it does. It doesn't propagate to the TCP window size unless you do it before connecting, but that goes for all platforms, and we are discussing UDP which doesn't have windowing.
EJP