I need to send packets from one host to another over a potentially lossy network. To minimize packet latency, I'm not considering TCP/IP, but I want to maximize throughput using UDP. What is the optimal UDP packet size to use?

Here are some of my considerations:

  • The MTU size of the switches in the network is 1500. If I use a large packet, for example 8192, this will cause fragmentation. Loss of one fragment will result in the loss of the entire packet, right?

  • If I use smaller packets, I'll incur the overhead of the UDP and IP header

  • If I use a really large packet, what is the largest that I can use? I read that the largest datagram size is 65507. What is the buffer size I should use to allow me to send such sizes? Would that help to bump up my throughput?

  • What is the typical maximum datagram size supported by common OSes (e.g., Windows, Linux)?

Updated:

Some of the receivers of the data are embedded systems for which TCP/IP stack is not implemented.

I know that this place is filled with people who are very adamant about using what's available. But I hope to get better answers than ones that focus on MTU alone.

+1  A: 

Even though the MTU at the switch is 1500, there are situations (like tunneling through a VPN) that wrap a few extra headers around each packet. You may do better to reduce the packet size slightly and go with 1450 or so.

Can you simulate the network and test performance with different packet sizes?

Tim Howland
+6  A: 

The best way to find the ideal datagram size is to do exactly what TCP itself does to find the ideal packet size: Path MTU discovery.

TCP also has a widely used option where both sides tell the other what their MSS (basically, MTU minus headers) is.
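For UDP, one rough way to approximate this from user space is to forbid local fragmentation and search for the largest datagram the kernel will accept. The sketch below is only an illustration under stated assumptions: the numeric option values are the Linux ones from `<linux/in.h>` (not every Python build exposes them by name), the helper names are made up for this example, and `EMSGSIZE` only reflects the path MTU the kernel currently knows about, so a real probe would have to be repeated after ICMP "fragmentation needed" messages have come back.

```python
import errno
import socket

# Linux-specific values from <linux/in.h>; assumed here because they are
# not guaranteed to exist as socket.* constants in every Python build.
IP_MTU_DISCOVER = 10
IP_PMTUDISC_DO = 2  # always set DF, never fragment locally


def make_df_socket(host, port):
    """Hypothetical helper: connected UDP socket that refuses to fragment."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DO)
    s.connect((host, port))
    return s


def fits(sock, size):
    """True if the kernel accepts a datagram carrying `size` payload bytes."""
    try:
        sock.send(b"\0" * size)
        return True
    except OSError as e:
        if e.errno == errno.EMSGSIZE:
            return False
        raise  # e.g. ECONNREFUSED if nothing listens on the peer port


def largest_payload(fits_fn, lo=0, hi=65507):
    """Binary search for the largest size for which fits_fn(size) is True.

    Assumes fits_fn is monotonic (if a size fits, every smaller one fits)
    and that `lo` itself fits.
    """
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if fits_fn(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo
```

With a real socket this would be called as `largest_payload(lambda n: fits(sock, n))`; the search itself does not depend on the network and can be tried with any predicate.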

CesarB
Would discovering the MTU give me the best datagram performance?
sep
@sep61: if it did not, there would be no reason for TCP to use PMTUD. Discovering the PMTU has a cost, but the implementors of TCP felt the benefits justified the costs.
CesarB
Downside to PMTUD is when overzealous and underinformed sysadmins break it by disabling ALL ICMP on their firewalls.
Brian Knoblauch
+8  A: 

Alternative answer: be careful to not reinvent the wheel.

TCP is the product of decades of networking experience. There is a reason for every, or almost every, thing it does. It has several algorithms most people do not think about often (congestion control, retransmission, buffer management, dealing with reordered packets, and so on).

If you start reimplementing all the TCP algorithms, you risk ending up with (paraphrasing Greenspun's Tenth Rule) an "ad hoc, informally-specified, bug-ridden, slow implementation of TCP".

If you have not done so yet, it could be a good idea to look at some recent alternatives to TCP/UDP, like SCTP or DCCP. They were designed for niches where neither TCP nor UDP was a good match, precisely to allow people to use an already "debugged" protocol instead of reinventing the wheel for every new application.

CesarB
Related question with other alternatives in the answers: http://stackoverflow.com/questions/107668/what-do-you-use-when-you-need-reliable-udp
CesarB
+1  A: 

The IP header is >= 20 bytes (but mostly 20) and the UDP header is 8 bytes. This leaves you 1500 - 28 = 1472 bytes for your data. Path MTU discovery finds the smallest MTU on the way to the destination. But this does not necessarily mean that using the smallest MTU will give you the best possible performance. I think the best way is to benchmark. Or maybe you should not care about the smallest MTU on the way at all: a network device may very well use a small MTU and still transfer packets very fast. The path MTU may also change in the future, so you cannot discover it once and save it somewhere for later use; you have to rediscover it periodically. If I were you, I would set the packet size to something like 1440 and benchmark the application...
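The header arithmetic above can be written down as a small helper. This is just a sketch: the function name and the `extra_overhead` parameter (for tunnel or VPN headers) are made up for illustration, and the IPv4 figure assumes a header without options.

```python
IPV4_HEADER = 20  # minimum IPv4 header, no options
IPV6_HEADER = 40  # fixed IPv6 header, no extension headers
UDP_HEADER = 8


def max_udp_payload(mtu=1500, ipv6=False, extra_overhead=0):
    """Largest UDP payload that fits in one unfragmented packet.

    extra_overhead is a hypothetical allowance for encapsulation
    (e.g. a VPN tunnel) that adds headers along the path.
    """
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return mtu - ip_header - UDP_HEADER - extra_overhead


print(max_udp_payload())            # 1472 on plain Ethernet over IPv4
print(max_udp_payload(mtu=65535))   # 65507, the IPv4 datagram limit
```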

Malkocoglu
+3  A: 

Another thing to consider is that some network devices don't handle fragmentation very well. We've seen many routers that drop fragmented UDP packets or packets that are too big. The suggestion by CesarB to use Path MTU is a good one.
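Even where fragments are forwarded, a datagram is only delivered if every one of its fragments arrives, so fragmentation amplifies loss. A rough back-of-the-envelope sketch, assuming independent per-fragment loss and IPv4 headers without options (the function names are illustrative):

```python
import math


def fragment_count(udp_payload, mtu=1500, ip_header=20, udp_header=8):
    """Number of IPv4 fragments needed for one UDP datagram."""
    # Every fragment except the last carries a multiple of 8 data bytes.
    per_fragment = (mtu - ip_header) // 8 * 8
    return math.ceil((udp_payload + udp_header) / per_fragment)


def datagram_loss(fragment_loss, fragments):
    """Probability the datagram is lost, assuming independent fragment loss."""
    return 1 - (1 - fragment_loss) ** fragments


n = fragment_count(8192)             # 6 fragments on a 1500-byte MTU
print(n, datagram_loss(0.01, n))     # about 5.9% datagram loss at 1% fragment loss
```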

Maximum throughput is not driven only by the packet size (though this contributes, of course). Minimizing latency and maximizing throughput are often at odds with one another. In TCP you have the Nagle algorithm, which is designed (in part) to increase overall throughput. However, some protocols (e.g., telnet) often disable Nagle (i.e., set the TCP_NODELAY option) in order to improve latency.
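For reference, turning Nagle off on a TCP socket is a single socket option. A minimal sketch in Python (the helper name is made up for this example):

```python
import socket


def make_low_latency_tcp_socket():
    """TCP socket with the Nagle algorithm disabled via TCP_NODELAY."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return s
```

Small writes are then sent immediately instead of being coalesced, which trades some throughput for latency.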

Do you have some real time constraints for the data? Streaming audio is different than pushing non-realtime data (e.g., logging information) as the former benefits more from low latency while the latter benefits from increased throughput and perhaps reliability. Are there reliability requirements? If you can't miss packets and have to have a protocol to request retransmission, this will reduce overall throughput.

There are a myriad of other factors that go into this and (as was suggested in another response) at some point you end up with a bad implementation of TCP. That being said, if you want to achieve low latency and can tolerate loss, using UDP with an overall packet size set to the path MTU (be sure to set the payload size to account for headers) is likely the optimal solution (especially if you can ensure that UDP can get from one end to the other).

dpp
A: 

The "Stack" is (TCP uses(UDP uses(IPv4 uses (ETHERNET))))... or The "Stack" is (TCP uses(UDP uses(IPv6 uses (ETHERNET))))...

All those headers are added in TCP. IPv6 is just dumb. Every computer does not require its own IP. IPv6 is just undesired packet bloat. You have 65,000+ ports, you will not use them all, ever... Add that to the individual machine MAC address in the ETHERNET header, and you have gazillions of addresses.

Focus on the (UDP uses(IPv4 uses(ETHERNET))) headers, and all will be fine. Your program should be able to "Check" packet size, by receiving a 65,000 byte buffer over UDP, set as all NULL CHR(0), and sending a 65,000 packet of CHR(255) bytes. You can see if your UDP data was lost, because you will never get it. It will be cut short. UDP does not transmit multiple packets. You send one, you get one. You just get less if it can't fit. Or you get nothing, if it gets dropped.

TCP will hold your connections in purgatory until all data is received. It is using UDP packets, and telling the other computer to resend those missing packets. That comes with additional overhead, and causes LAG if any packet is dropped, lost, short, or out of order.

UDP gives you full control. Use UDP if you send "Critical" and "Non-Critical" data, and want to use a reduced packet-order number system, that is not dependent on sequential arrival. Only use TCP for WEB or SECURE solid data, that requires persistence and 100% completeness. Otherwise, you are just wasting our web-bandwidth, and adding bloated clutter to the net. The smaller your data-stream, the less you will lose along the way. Use TCP, and you will guarantee additional LAG related to all the resending, and bloated headers that are added onto the TCP header, for "Flow control".

Seriously, flow control is not that hard to manage, nor is priority, and missing data detection. TCP offers nothing. That is why it is given away for free. It is not seasoned, it is just blindly stupid and easy. It is an old pair of flip-flops. You need a good pair of sneakers. TCP was, and still is, a hack.

-1, for obvious reasons.
bortzmeyer
-1 for two big reasons: 1) TCP does not "use" UDP. 2) IPv6 is about increasing the number of addressable hosts--it has nothing to do with ports (those live at the TCP/UDP level anyway). MAC addresses are entirely irrelevant.
Drew Hall
A: 

Well, I've got a non-MTU answer for you. Using a connected UDP socket should speed things up for you. There are two reasons to call connect on your UDP socket. The first is efficiency. When you call sendto on an unconnected UDP socket what happens is that the kernel temporarily connects the socket, sends the data and then disconnects it. I read about a study indicating that this takes up nearly 30% of processing time when sending. The other reason to call connect is so that you can get ICMP error messages. On an unconnected UDP socket the kernel doesn't know what application to deliver ICMP errors to and so they just get discarded.
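A minimal sketch of the connected-socket pattern in Python, exercised over loopback. The function names are illustrative, and the 30% figure above is not something this example measures.

```python
import socket


def open_connected_sender(host, port):
    """UDP socket bound to a single peer with connect().

    connect() sends no packets; it only records the peer address so that
    send() can be used instead of sendto().
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.connect((host, port))
    return s


def demo():
    """Send one datagram over loopback through a connected UDP socket."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # let the OS pick a free port
    receiver.settimeout(2.0)
    host, port = receiver.getsockname()
    sender = open_connected_sender(host, port)
    try:
        sender.send(b"hello")  # no destination argument needed
        data, peer = receiver.recvfrom(2048)
        return data, peer == sender.getsockname()
    finally:
        sender.close()
        receiver.close()
```

On a connected socket, errors reported by ICMP (for example, port unreachable) typically surface as an exception on a later `send()` or `recv()` call, so the sender should be prepared to handle `OSError`.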

Robert S. Barnes
A: 

Uhh Jason, TCP does not use UDP. TCP uses IP, which is why you often see it referred to as TCP/IP. UDP also uses IP, so UDP is technically UDP/IP. The IP layer handles the transfer of data from end to end (across different networks), which is why it is called the Internet Protocol (as in inter-networking). TCP and UDP sit on top of IP and deliver the data to the right application; TCP additionally splits the stream into segments, while UDP sends each datagram as-is. The lower layers such as Ethernet or PPP or whatever else you use handle computer-to-computer data transfer (that is, within a single network).

qwerty_ca
This is best done as a comment on Jason's answer. As it is, it's above Jason's answer, since that answer got some well-deserved downvotes, which can cause confusion. It also isn't anything like an answer to the original question, and therefore shouldn't be an independent answer.
David Thornley
Oh OK, thanks David - I'm new to this site and didn't realize you could comment on people's answers as well.
qwerty_ca