I've got a PC program receiving data from 20 custom hardware boxes via UDP. Each of these boxes continually sends UDP messages to a single UDP socket on the PC. The messages all contain 10 - 150 bytes of data, and each unit sends about 20 messages in 12 seconds.

Testing shows that some messages are being missed by the PC. Fewer boxes on the network results in fewer missed messages.

The long term solution is to buffer data in the hardware, and let the PC retrieve data as required via TCP, but I need to solve/minimise the missing message problem in the short term until that solution can be deployed. Ideas include:

- upgrading the PC
- filtering out unnecessary messages before transmission
- combining separate UDP messages in the hardware into a single bigger one
- using multiple sockets in the PC to receive messages
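
As an illustration of the "combine separate UDP messages" idea, here is a minimal sketch of one possible framing. The 1-byte length prefix and the names `Message`, `packMessages` and `unpackMessages` are assumptions made for this example, not the format the boxes actually use. The box would pack several messages into one datagram and the PC would split them apart again on receipt.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical framing: each message is stored as a 1-byte length followed by
// its payload. One length byte is enough here because messages are 10-150 bytes.
using Message = std::vector<std::uint8_t>;

// Box side: pack several messages into a single datagram payload.
std::vector<std::uint8_t> packMessages(const std::vector<Message>& messages) {
    std::vector<std::uint8_t> datagram;
    for (const Message& m : messages) {
        assert(m.size() <= 255);  // must fit in the 1-byte length prefix
        datagram.push_back(static_cast<std::uint8_t>(m.size()));
        datagram.insert(datagram.end(), m.begin(), m.end());
    }
    return datagram;
}

// PC side: split a received datagram back into the original messages.
// Parsing stops at the first record whose length runs past the end.
std::vector<Message> unpackMessages(const std::vector<std::uint8_t>& datagram) {
    std::vector<Message> messages;
    std::size_t pos = 0;
    while (pos < datagram.size()) {
        const std::size_t len = datagram[pos++];
        if (pos + len > datagram.size()) break;  // truncated tail, discard it
        messages.emplace_back(datagram.data() + pos, datagram.data() + pos + len);
        pos += len;
    }
    return messages;
}
```

If something like this is used, keeping each combined datagram below roughly 1472 bytes of payload avoids IP fragmentation on standard 1500-byte Ethernet, since a lost fragment loses the whole datagram.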

I'm looking for feedback on these ideas, plus any we might have missed.

The receiving program is a C++Builder program running Indy9.

+1  A: 

Dropped UDP messages are caused by congestion on your network. Whether you use 1 or 5 sockets to receive the packets makes no difference.

Also, you only have 20 boxes, each sending 20 messages in 12 seconds. That is only about 33 messages per second, which is really peanuts for the network as well as the processor.
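
To make the arithmetic explicit, here is a small sketch that works out the average message rate and a worst-case bandwidth figure. The box count, message count, period and maximum payload come from the question. The per-frame overhead is an assumption (IPv4 without options over plain Ethernet), and the function names are made up for the example.

```cpp
// Figures taken from the question.
constexpr double kBoxes = 20.0;
constexpr double kMessagesPerBox = 20.0;   // messages per box per period
constexpr double kPeriodSeconds = 12.0;
constexpr double kMaxPayloadBytes = 150.0;

// Assumed per-frame overhead (not from the question): 8-byte UDP header,
// 20-byte IPv4 header without options, 14-byte Ethernet header, 4-byte FCS.
constexpr double kOverheadBytes = 8.0 + 20.0 + 14.0 + 4.0;

// Average number of messages arriving at the PC per second.
constexpr double messagesPerSecond() {
    return kBoxes * kMessagesPerBox / kPeriodSeconds;
}

// Average bit rate if every message carried the maximum payload.
constexpr double worstCaseBitsPerSecond() {
    return messagesPerSecond() * (kMaxPayloadBytes + kOverheadBytes) * 8.0;
}
```

Under these assumptions the average comes to about 33.3 messages per second and roughly 52 kbit/s, around 0.05% of a 100 Mbit/s link. These are averages over the 12 second period, so they say nothing about how tightly the messages are bunched within a burst.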

So the only reason packets would be dropped is that there is a lot of other network load going on. Network cards and routers tend to prefer TCP packets over UDP.

If there is no excessive network load, UDP packets should not be dropped.

Toad
There is no other traffic on the network - it is a private LAN. I agree that the network load is not high - though it is 'bursty', with a bundle of 12 messages sent on the 12s boundary. My guess is that message bursts from multiple boxes are occurring together, and not being seen by the PC as a result.
IanH
My guess is that they are not (it would be very difficult to synchronize this). I would investigate network settings, switches/hubs etc. On what kind of platform are these machines running?
Toad
These boxes are custom hardware, running the lwIP stack. The only other devices on the network are the switch (unmanaged 10/100) and the PC. Synchronisation may be an artifact of simultaneous switch-on of the 20 boxes.
IanH
Since the bursts are really small and nothing else is happening, a typical queue will not throw away any of the packets. So it might be the lwIP stack that needs some extra configuring (increased buffer sizes?), or your switch.
Toad
We've upped the buffering in the box. The switch is unmanaged.
IanH
+1  A: 

The most likely problem is on the network.

You've already said the network is a private LAN, which eliminates one of my suggestions. The other is to make sure the network is switched - a hub could easily lose packets.

If that doesn't help then you could try increasing the receive buffer size on the PC. http://www.developerweb.net/forum/showthread.php?t=5773
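
A minimal sketch of what increasing the receive buffer looks like at the socket level, written against POSIX sockets so it is easy to try out. The function name `requestReceiveBuffer` is made up for this example. Winsock offers the same `setsockopt`/`getsockopt` calls with `SO_RCVBUF` (there the option value is passed as a `char*`), but how to get at the underlying socket handle from Indy9 is not shown here.

```cpp
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Ask the OS for a larger receive buffer on an existing UDP socket.
// Returns the buffer size the OS reports afterwards, or -1 on error.
// The OS may grant less (or report more) than was requested, so the
// value is read back rather than assumed.
int requestReceiveBuffer(int sock, int requestedBytes) {
    if (setsockopt(sock, SOL_SOCKET, SO_RCVBUF,
                   &requestedBytes, sizeof(requestedBytes)) != 0) {
        return -1;
    }
    int granted = 0;
    socklen_t len = sizeof(granted);
    if (getsockopt(sock, SOL_SOCKET, SO_RCVBUF, &granted, &len) != 0) {
        return -1;
    }
    return granted;
}
```

A caller would create the socket with `socket(AF_INET, SOCK_DGRAM, 0)`, call `requestReceiveBuffer(sock, 256 * 1024)` before binding, and log the returned value. The 256 KB figure is only an example. On Linux the reported size is double the requested one and is capped by the `net.core.rmem_max` setting.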

Douglas Leeder
The network is definitely switched. Will investigate the rx buffer size.
IanH
A: 

Thanks to the other guys for suggestions, but for anybody referring back, the solution here was replacing the old PC that had been used (1.6GHz Celeron with 512MB RAM) with the correct spec machine (2GHz Core 2 processor with 2GB RAM).

IanH