views: 607
answers: 7

After I accept() a connection, and then write() to the client socket, is it better to write all the data you intend to send at once or send it in chunks?

For example:

accept, write 1MB, disconnect

…or…

accept, write 256 bytes, write 256 bytes, … n, disconnect

My gut feeling tells me that the underlying protocol does this automatically, with error correction, etc. Is this correct, or should I be chunking my data?

Before you ask, no I'm not sure where I got the idea to chunk the data – I think it's an instinct I've picked up from programming C# web services (to get around receive buffer limits, etc, I think). Bad habit?

Note: I'm using C
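
To make that concrete, here is roughly what I mean by the two variants (a sketch only; client_fd is the descriptor returned by accept(), buf holds the 1MB, and return values are ignored purely for brevity):

    #include <stddef.h>
    #include <unistd.h>

    /* Variant 1: one large write */
    static void send_all_at_once(int client_fd, const char *buf)
    {
        (void)write(client_fd, buf, 1024 * 1024);
    }

    /* Variant 2: the same 1MB in 256-byte chunks */
    static void send_in_chunks(int client_fd, const char *buf)
    {
        for (size_t off = 0; off < 1024 * 1024; off += 256)
            (void)write(client_fd, buf + off, 256);
    }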

+16  A: 

The client and server will break up your data as they see fit, so you can send as much as you like in one chunk. Check this out.

Jon B
+4  A: 

From a TCP level, yes your big buffer will be split up when it is too large, and it will be combined when it is too small.

From an application level, don't let your application deal with unbounded buffer sizes. At some level you need to split them up.

If you are sending a file over a socket, and perhaps processing some of its data along the way (compressing it, for example), then you need to split it up into chunks. Otherwise you will use too much RAM when you eventually hit a large file, and your program will run out of memory.

RAM is not the only problem. If your buffer gets too big, you may spend too much time reading in the data, or processing it, and you won't be keeping the socket busy while it sits there waiting for data. For this reason it's best to make the buffer size a parameter so that you can tune it to a value that is neither too small nor too big.

My claim is not that a TCP socket can't handle a big chunk of data; it can, and I suggest using bigger buffers when sending to get better efficiency. My claim is simply that you shouldn't deal with unbounded buffer sizes in your application.
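
For illustration, a minimal sketch of that chunked approach in C (the function name send_file and the CHUNK_SIZE value are made up here, and error handling is kept minimal):

    #include <stdio.h>
    #include <unistd.h>

    #define CHUNK_SIZE (64 * 1024)   /* tunable buffer-size parameter */

    /* Send an already-open file over an already-connected socket,
     * one bounded chunk at a time. Returns 0 on success, -1 on error. */
    static int send_file(int sock_fd, FILE *fp)
    {
        char buf[CHUNK_SIZE];
        size_t n;

        while ((n = fread(buf, 1, sizeof buf, fp)) > 0) {
            size_t off = 0;
            while (off < n) {                      /* write() may be partial */
                ssize_t w = write(sock_fd, buf + off, n - off);
                if (w < 0)
                    return -1;
                off += (size_t)w;
            }
        }
        return ferror(fp) ? -1 : 0;
    }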

Brian R. Bondy
This is a good point, especially if the app will deal with "n" bytes of data at a time.
Jon B
+5  A: 

Years and years ago, I had an application that sent binary data - it did one send with the size of the following buffer, and then another send with the buffer itself (a few hundred bytes). After profiling, we discovered that we could get a major speed-up by combining them into one buffer and sending it just once. We were surprised - even though there is some network overhead on each packet, we didn't think that was going to be a noticeable factor.
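
One way to get that single-send behaviour without copying into a combined buffer is scatter-gather I/O with writev(); a sketch, assuming the length-prefix framing described above (the function and variable names are hypothetical):

    #include <stdint.h>
    #include <sys/types.h>
    #include <arpa/inet.h>   /* htonl */
    #include <sys/uio.h>     /* writev, struct iovec */

    /* Send a 4-byte length prefix and the payload in a single syscall. */
    static ssize_t send_framed(int sock_fd, const void *payload, uint32_t len)
    {
        uint32_t net_len = htonl(len);
        struct iovec iov[2] = {
            { .iov_base = &net_len,        .iov_len = sizeof net_len },
            { .iov_base = (void *)payload, .iov_len = len            },
        };
        return writev(sock_fd, iov, 2);  /* may still write fewer bytes than asked */
    }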

Paul Tomblin
I have seen with countless applications that using a larger buffer size will always result in an overall higher transfer rate. I have also seen that, when using a buffer size that is too big, your application will actually transfer more slowly because you aren't providing the data fast enough to the socket.
Brian R. Bondy
@Brian, in our case the two packets merged together was usually less than one TCP/IP packet in size.
Paul Tomblin
+1  A: 

I would send it all in one big chunk and let the underlying layers of the OSI model handle it. Therefore you don't have to worry about how big the chunks you are sending are, as the layers will split them up as necessary.

ChrisAD
TCP/IP is not based on the OSI model.
anon
The OSI *model*, while inaccurate, is still kind of useful for describing network protocols. TCP/IP stacks map pretty well into the OSI model up to the TCP/UDP layer. The OSI *model* shouldn't be confused with the OSI *stack*, which no one uses.
Dan Breslau
We will have to disagree on the usefulness of an inaccurate model.
anon
Umh, wow, have I misunderstood something very basic here? TCP is found at layer 4 of the OSI model while IP is found at layer 3. Putting these two together we get TCP/IP, which is used for data transmission over most networks these days.
ChrisAD
Nope, OSI had no influence on TCP/IP - the proof being that TCP/IP actually works. The fact that you can map certain bits of TCP/IP to bits of the OSI model proves nothing.
anon
Hrm, I always assumed TCP/IP was based on the OSI model as well; according to http://en.wikipedia.org/wiki/OSI_model#Comparison_with_TCP.2FIP this is not the case. Learn something new every day, I guess..
Josh W.
+1  A: 

If you're computing the data between those writes, it may be better to stream it as it becomes available. Also, writing it all at once can overrun the socket's send buffer (though that's probably rare, it does happen), meaning that your app needs to pause and re-try the writes (not all of them, just from the point where you hit the overflow).

I wouldn't usually go out of my way to chunk the writes, especially not into chunks as small as 256 bytes. (Since roughly 1500 bytes can fit in an Ethernet packet after TCP/IP overhead, I'd use chunks at least that large.)
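
A sketch of the kind of retry loop that copes with those partial writes (the helper name write_all is made up; write() returns how many bytes it actually accepted, so the loop resumes from that point):

    #include <errno.h>
    #include <unistd.h>

    /* Keep calling write() until the whole buffer has been accepted,
     * resuming after partial writes. Returns 0 on success, -1 on error. */
    static int write_all(int fd, const char *buf, size_t len)
    {
        size_t off = 0;

        while (off < len) {
            ssize_t n = write(fd, buf + off, len - off);
            if (n < 0) {
                if (errno == EINTR)
                    continue;            /* interrupted: retry the same range */
                return -1;
            }
            off += (size_t)n;            /* partial write: pick up where it stopped */
        }
        return 0;
    }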

Dan Breslau
+1  A: 

The only absolute answer is to profile your app in each case. There are so many factors that it is not possible to give an exact answer that is correct in all cases.

Dev er dev
You meant benchmark, right?
Seun Osewa
+5  A: 

The Nagle algorithm, which is usually enabled by default on TCP sockets, will likely combine those small 256-byte writes into the same packet. So it really doesn't matter whether you send the data as one write or several; it should end up in one packet anyway. Sending it as one chunk makes more sense if you have a big chunk to begin with.
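
If an application really does need small writes to go out immediately rather than being coalesced, Nagle can be disabled per socket with the TCP_NODELAY option; a minimal sketch (whether you actually want this depends on the traffic pattern):

    #include <netinet/in.h>
    #include <netinet/tcp.h>   /* TCP_NODELAY */
    #include <sys/socket.h>

    /* Disable the Nagle algorithm on an already-created TCP socket. */
    static int disable_nagle(int sock_fd)
    {
        int one = 1;
        return setsockopt(sock_fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof one);
    }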

Greg Rogers
In practice I've always seen a significant speed increase when using larger buffer sizes. Maybe this has to do with internal OS buffer copying, though, and not with TCP.
Brian R. Bondy
Well, for each send/write you perform you have to do a kernel mode switch, which IIRC is about 100 cycles. So if you do this very frequently that overhead may become an issue, and you'd rather do all the stuff you need to do in one syscall.
Greg Rogers