I've read advice in many places to the effect that sending a lot of small packets will lead to network congestion. I've even experienced this with a recent multi-threaded TCP app I wrote. However, I'm not sure I understand the exact mechanism by which this occurs.

My initial guess is that if the MTU of the physical transmission media is fixed, and you send a bunch of small packets, then each packet may potentially take up an entire transmission frame on the physical media.

For example, my understanding is that even though Ethernet supports variable-size frames, most equipment uses a fixed Ethernet frame of 1500 bytes. At 100 Mbit/s, a 1500-byte frame "goes by" on the wire every 0.12 milliseconds. If I transmit a 1-byte message (plus TCP and IP headers) every 0.12 milliseconds, I will effectively saturate the 100 Mbit/s Ethernet connection with only 8333 bytes of user data per second.

Is this a correct understanding of how tinygrams cause network congestion?

Do I have all my terminology correct?

+2  A: 

A TCP packet transmitted over a link will have something like 40 bytes of header information. Therefore, if you break a transmission into 100 1-byte packets, each packet sent will be 41 bytes, so about 98% of the resources used for transmission are overhead. If instead you send it as one 100-byte packet, the total transmitted data is only 140 bytes, so only about 29% overhead. In both cases you've transmitted 100 bytes of payload over the network, but in one you used 140 bytes of network resources to accomplish it, and in the other you've used 4100 bytes. In addition, it takes more resources on the intermediate routers to correctly route 100 41-byte packets than one 140-byte packet. Routing 1-byte packets is pretty much the worst-case scenario for the routers performance-wise, so they will generally exhibit their worst-case performance in this situation.
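The arithmetic above can be checked with a few lines of Python. This is only a sketch: the 40-byte header figure is the rough number used in this answer (a 20-byte IP header plus a 20-byte TCP header, no options), and the function names are made up for illustration.

```python
HEADER_BYTES = 40  # assumed: 20-byte IP header + 20-byte TCP header, no options


def bytes_on_wire(payload_bytes: int, packets: int) -> int:
    """Total bytes sent when the payload is carried in `packets` packets."""
    return payload_bytes + packets * HEADER_BYTES


def overhead_fraction(payload_bytes: int, packets: int) -> float:
    """Share of the transmitted bytes that is header rather than payload."""
    total = bytes_on_wire(payload_bytes, packets)
    return (total - payload_bytes) / total


# 100 bytes of payload, sent as 100 tiny packets and then as one packet.
for packets in (100, 1):
    total = bytes_on_wire(100, packets)
    share = overhead_fraction(100, packets)
    print(f"{packets:>3} packet(s): {total} bytes sent, {share:.1%} overhead")
```

Link-layer framing and the ACKs coming back are not counted here, so the real cost of the 1-byte case is higher still.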

In addition, especially with TCP, as performance degrades due to small packets, the machines can try to do things to compensate (like retransmit) that will actually make things worse, hence the use of Nagle's algorithm to try to avoid this.

bdk
+4  A: 

In wired ethernet at least, there is no "synchronous clock" that times the beginning of every frame. There is a minimum frame size, but it's more like 64 bytes instead of 1500. There are also minimum gaps between frames, but that might only apply to shared-access networks (ATM and modern ethernet are switched, not shared-access). It is the maximum payload size that is limited to 1500 bytes on virtually all ethernet equipment.

But the smaller your packets get, the higher the ratio of framing headers to data. Eventually you are spending 40-50 bytes of overhead for a single byte. And more for its acknowledgement.

If you could just hold for a moment and collect another byte to send in that packet, you have doubled your network efficiency. (this is the reason for Nagle's Algorithm)
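For what it's worth, on most TCP stacks Nagle's algorithm is controlled per socket through the `TCP_NODELAY` option. The snippet below is a small sketch of toggling it from Python's standard `socket` module; the helper names `set_nagle` and `nagle_enabled` are made up for this example and are not part of any library.

```python
import socket


def set_nagle(sock: socket.socket, enabled: bool) -> None:
    """Enable or disable Nagle's algorithm on a TCP socket.

    TCP_NODELAY = 1 disables Nagle (small writes go out immediately);
    TCP_NODELAY = 0 lets the stack hold small writes back and coalesce them.
    """
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 0 if enabled else 1)


def nagle_enabled(sock: socket.socket) -> bool:
    """Report whether Nagle's algorithm is currently active on the socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) == 0


# No connection is needed just to inspect or change the option.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    print("Nagle on by default:", nagle_enabled(sock))
    set_nagle(sock, False)
    print("Nagle on after disabling:", nagle_enabled(sock))
```

Leaving Nagle enabled is usually what you want for bulk or chatty traffic; turning it off trades efficiency for latency, which is why interactive or request/response protocols sometimes do so.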

There is a tradeoff on a channel with errors, because the longer the frame you send, the more likely it is to experience an error and have to be retransmitted. Newer wireless standards load up the frame with forward error correction bits to avoid retransmissions.
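To put rough numbers on that tradeoff: if you assume bit errors are independent (a simplification, since real channels tend to have bursty errors), the chance that a frame of n bits arrives damaged is 1 - (1 - p)^n for a bit error rate p. The bit error rate below is a made-up value purely for illustration.

```python
def frame_error_probability(frame_bytes: int, bit_error_rate: float) -> float:
    """Chance that at least one bit in the frame is corrupted,
    assuming independent bit errors and no error correction."""
    bits = frame_bytes * 8
    return 1.0 - (1.0 - bit_error_rate) ** bits


BER = 1e-5  # hypothetical bit error rate, chosen only for illustration

for size in (64, 512, 1500):
    p = frame_error_probability(size, BER)
    print(f"{size:>5}-byte frame: {p:.2%} chance of needing a retransmit")
```

So on a noisy link, bigger frames amortize the headers better but are retransmitted more often, and the best frame size sits somewhere in between.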

The classic example of "tinygrams" is 10,000 users all sitting on a campus network, typing into their terminal sessions. Every keystroke produces a single packet (and an acknowledgement). At a typing rate of 4 keystrokes per second, that's 80,000 packets per second just to move 40 kbytes per second. On a "classic" 10 Mbit shared-medium ethernet, this is impossible to achieve, because you can only send about 32k single-byte frames in one second, even excluding the effect of collisions:

   96 bits inter-frame gap
+  64 bits preamble
+ 112 bits ethernet header
+  32 bits trailer
-----------------------------
= 304 bits overhead per ethernet frame.
+   8 bits of data (this doesn't even include IP or TCP headers!!!)
-----------------------------
= 312 bits per tinygram

10000000 bits/s ÷ 312 bits/packet = 32051 packets/second.

Perhaps a better way to state this is that an ethernet that is maxed out moving tinygrams can only move about 256 kbits/s across a 10 Mbit/s medium, for an efficiency of 2.56%.
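If you want to redo this budget yourself, here is a short sketch that sums the per-frame components itemized above and derives the frame rate and efficiency from them. Like the list, it is idealized: it ignores IP/TCP headers, padding up to the minimum frame size, and collisions. The names are made up for the example.

```python
LINK_BPS = 10_000_000  # classic 10 Mbit/s ethernet

# Per-frame overhead in bits, as itemized above.
OVERHEAD_BITS = {
    "inter-frame gap": 96,
    "preamble": 64,
    "ethernet header": 112,
    "trailer": 32,
}


def tinygram_budget(payload_bits: int = 8):
    """Return (bits per frame, whole frames per second, payload efficiency)."""
    bits_per_frame = sum(OVERHEAD_BITS.values()) + payload_bits
    frames_per_second = LINK_BPS // bits_per_frame
    efficiency = payload_bits / bits_per_frame
    return bits_per_frame, frames_per_second, efficiency


bits, fps, eff = tinygram_budget()
print(f"{bits} bits/frame, {fps} frames/s, {eff:.2%} efficiency")
```

Changing `payload_bits` shows how quickly the efficiency climbs once each frame carries more than a single byte.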

Joe Koberg
So my 54 Mbit 802.11g would effectively have a bandwidth of approximately 1 Mbit (not including TCP/IP headers) when flooded with tinygrams. So if I'm downloading lots of small files, and setting up / tearing down a TCP connection for every file, I assume that this would have a similar effect?
Robert S. Barnes
If the files are merely a few bytes each, that's certainly possible..
Joe Koberg
The 802.11 WLAN frames are different as well, so the overhead calculation above doesn't hold. Also the stations have more "smarts" than the classic ethernet station, and coordinate with the AP to reduce collisions... Microsoft has a very nice overview at http://technet.microsoft.com/en-us/library/cc757419%28WS.10%29.aspx (look under "802.11 MAC Frame")
Joe Koberg
@Joe Koberg: Do you happen to know what protocol is used on cable networks which use direct DHCP connections? Is there such a thing as Ethernet over cable, or do they use something like ATM?
Robert S. Barnes
The DOCSIS standards define this MAC layer. Cable modems transmit full Ethernet frames with additional overhead, so the tinygram issue is present. Each packet on the wire contains at least 6 and as many as 246 additional bytes up-front for DOCSIS purposes. Cable modems use a time- or code-divided uplink channel - so you can't crowd out your neighbor, and only have the single headend talking on the downstream - so there are few collisions. But with the cable modem being a transparent IP bridge, and the whole system being engineered for traffic control, it could be reassembling short packets.
Joe Koberg
+1  A: 

BDK has about half the answer (+1 for him). A large part of the problem is that every message comes with 40 bytes of overhead. It's actually a little worse than that though.

Another issue is that there is actually a minimum frame size, which on Ethernet comes from the link layer rather than from IP. (This is not MTU. MTU is a *M*aximum before it will start fragmenting. Different issue entirely.) The minimum is pretty small (a 46-byte payload on Ethernet, which has to hold your 20-byte IP header and 20-byte TCP header), but if you don't use that much, the frame is padded and it still sends that much.

Another issue is protocol overhead. Each packet sent by TCP causes an ACK packet to be sent back by the recipient as part of the protocol.

The result is that if you do something silly, like send one TCP packet every time the user hits a key, you could easily end up with a tremendous amount of wasted overhead data floating around.
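To make the keystroke case concrete, here is a toy comparison of sending one packet per key versus collecting keys in a buffer first. `BatchingSender` is a hypothetical helper written for this answer, not a real API, and it only counts what would be sent instead of touching a socket. The 40-byte header cost is the same rough figure as above.

```python
HEADER_BYTES = 40  # assumed TCP/IP header cost per packet


class BatchingSender:
    """Hypothetical helper: collects small writes and emits one packet per flush."""

    def __init__(self, max_payload: int = 64):
        self.max_payload = max_payload
        self.buffer = bytearray()
        self.packets = []  # payloads that would have been handed to send()

    def write(self, data: bytes) -> None:
        """Buffer the data, emitting full-size packets as the buffer fills."""
        self.buffer += data
        while len(self.buffer) >= self.max_payload:
            self.flush(self.max_payload)

    def flush(self, limit=None) -> None:
        """Emit up to `limit` buffered bytes (everything if None) as one packet."""
        if not self.buffer:
            return
        cut = len(self.buffer) if limit is None else limit
        self.packets.append(bytes(self.buffer[:cut]))
        del self.buffer[:cut]

    def wire_bytes(self) -> int:
        """Payload plus assumed header bytes across all emitted packets."""
        return sum(len(p) + HEADER_BYTES for p in self.packets)


keys = b"hello, world"

naive = BatchingSender(max_payload=1)     # one packet per keystroke
batched = BatchingSender(max_payload=64)  # keys collected until flushed
for k in keys:
    naive.write(bytes([k]))
    batched.write(bytes([k]))
naive.flush()
batched.flush()

print("per-key:", len(naive.packets), "packets,", naive.wire_bytes(), "bytes")
print("batched:", len(batched.packets), "packets,", batched.wire_bytes(), "bytes")
```

A real interactive app would also flush on a short timer so the user doesn't wait for the buffer to fill, which is roughly the tradeoff Nagle's algorithm makes for you inside the TCP stack.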

T.E.D.