I have several files whose contents need to be merged into a single file. I have the following code that does this, but it seems rather inefficient in terms of memory usage. Would you suggest a better way to do it?

The Util.MoveFile function simply handles moving files across volumes.

   private void Compose(string[] files)
   {
       string outFile = @"c:\final.txt";

       using (FileStream fsOut = new FileStream(outFile + ".tmp", FileMode.Create))
       {
           foreach (string inFile in files)
           {
               if (!File.Exists(inFile))
               {
                   continue;
               }

               byte[] bytes;
               using (FileStream fsIn = new FileStream(inFile, FileMode.Open))
               {
                   // read the whole file into memory at once (assumes a single Read call fills the buffer)
                   bytes = new byte[fsIn.Length];
                   fsIn.Read(bytes, 0, bytes.Length);
               }

               //using (StreamReader sr = new StreamReader(inFile))
               //{
               //    text = sr.ReadToEnd();
               //}

               // write the segment to final file
               fsOut.Write(bytes, 0, bytes.Length);

               File.Delete(inFile);
           }
       }

       Util.MoveFile(outFile + ".tmp", outFile);

   }

+1  A: 

Sometimes it's just better to call a shell function than to reimplement the functionality. As Alan says, you can use cat on Unix systems, or on Windows you can use the built-in command processor:

copy file1+file2+file3 concated_file
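
For reference, a minimal sketch of invoking that copy command from C# might look like the following; the helper name is made up, "/b" forces a binary copy, and paths containing spaces would need individual quoting:

using System.Diagnostics;

// Hypothetical helper (not part of the original post): concatenate files by
// shelling out to cmd.exe's copy command. "/b" forces binary mode so no EOF
// marker is appended to the output.
static void ConcatWithCopy(string[] files, string outFile)
{
    string sources = string.Join("+", files); // file1+file2+file3
    ProcessStartInfo psi = new ProcessStartInfo("cmd.exe",
        "/c copy /b " + sources + " \"" + outFile + "\"");
    psi.UseShellExecute = false;
    psi.CreateNoWindow = true;

    using (Process p = Process.Start(psi))
    {
        p.WaitForExit();
    }
}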
Preet Sangha
Thanks Preet Sangha, you're right as well. There's an overhead in creating the new shell process/thread, but ultimately it should be more efficient than anything I could possibly do. I will try to implement it and profile.
lboregard
The important thing is what you said last: profile!
Preet Sangha
A: 

You can use a smaller, fixed-size buffer, like so:

byte[] bytes = new byte[8192]; // adjust this as needed
int bytesRead;
do {
    bytesRead = fsIn.Read(bytes, 0, bytes.Length);
    fsOut.Write(bytes, 0, bytesRead);
} while (bytesRead > 0);

This is pretty self-explanatory except for the last block. What's happening is that I'm passing an 8 KB byte array to the Read method, which returns the number of bytes it actually read. On the Write call, I then pass that value, which is somewhere between 0 and 8192. In other words, on the last block, even though I'm passing a byte array of 8192 bytes, bytesRead might only be 10, in which case only the first 10 bytes need to be written.

EDIT

I edited my answer to do this in a slightly different way. Instead of using the input file's position to determine when to break out of the loop, I check whether bytesRead is greater than zero. This approach works for any kind of stream-to-stream copy, including streams that don't have a fixed or known length.
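
Folded back into the Compose method from the question, a sketch of the streaming version might look like this; the 64 KB buffer and the explicit bufferSize argument on the FileStream constructors are only illustrative values to tune while profiling, and the do/while above has been rewritten as an equivalent while loop:

private void Compose(string[] files)
{
    string outFile = @"c:\final.txt";
    byte[] buffer = new byte[64 * 1024]; // example size only; profile to pick a real value

    // The explicit bufferSize argument is optional; it is one of the
    // "specific constructors" mentioned in the comments below.
    using (FileStream fsOut = new FileStream(outFile + ".tmp",
        FileMode.Create, FileAccess.Write, FileShare.None, buffer.Length))
    {
        foreach (string inFile in files)
        {
            if (!File.Exists(inFile))
            {
                continue;
            }

            using (FileStream fsIn = new FileStream(inFile,
                FileMode.Open, FileAccess.Read, FileShare.Read, buffer.Length))
            {
                int bytesRead;
                while ((bytesRead = fsIn.Read(buffer, 0, buffer.Length)) > 0)
                {
                    fsOut.Write(buffer, 0, bytesRead);
                }
            }

            File.Delete(inFile);
        }
    }

    Util.MoveFile(outFile + ".tmp", outFile);
}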

Josh Einstein
Thanks Josh, I will implement this and profile the results.
lboregard
Note that, as with anything, there's a tradeoff between performance and memory usage. The fastest way to do this is the way you originally showed it, using as much memory as required. Also note that unless you use specific constructors, the default buffering done by .NET will still take place, and the file system will still apply its own write buffering. But at least this approach lets you process huge files without fear of an OutOfMemoryException.
Josh Einstein
I would use a bigger buffer, like 1 MB for a typical disk. A back-of-the-envelope way to calculate the buffer size needed to use a non-SSD disk efficiently: multiply the disk seek time by the throughput. For modern disks you get (~10 ms) * (~100 MB/sec) = 1 MB. 8 KB might noticeably slow you down.
Michael
Completely agree with Michael that chunking 8K at a time will likely cause bottlenecks. I should have clarified that it was just an example number; the best buffer size will depend on factors such as how many of these you'll be doing at a time (is it a web server?) and the average size of the files.
Josh Einstein
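
Since the comments above stress profiling, here is a rough, hypothetical sketch of how buffer sizes could be compared; the sizes and the temporary file naming are arbitrary, and the OS file cache will skew repeated runs, so treat the numbers only as rough guidance:

using System;
using System.Diagnostics;
using System.IO;

// Hypothetical micro-benchmark: copy one source file with several buffer sizes
// and report the elapsed time for each. Results are heavily affected by the
// OS file cache, so run it a few times and compare trends, not single numbers.
static void ProfileBufferSizes(string sourceFile)
{
    int[] sizes = { 8 * 1024, 64 * 1024, 1024 * 1024 };
    foreach (int size in sizes)
    {
        string target = sourceFile + "." + size + ".tmp";
        byte[] buffer = new byte[size];
        Stopwatch sw = Stopwatch.StartNew();

        using (FileStream fsIn = new FileStream(sourceFile, FileMode.Open, FileAccess.Read))
        using (FileStream fsOut = new FileStream(target, FileMode.Create, FileAccess.Write))
        {
            int bytesRead;
            while ((bytesRead = fsIn.Read(buffer, 0, buffer.Length)) > 0)
            {
                fsOut.Write(buffer, 0, bytesRead);
            }
        }

        sw.Stop();
        Console.WriteLine("{0,10} byte buffer: {1} ms", size, sw.ElapsedMilliseconds);
        File.Delete(target);
    }
}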