I am writing a small IO library to assist with a larger (hobby) project. A part of this library performs various functions on a file, which is read/written via the FileStream object. On each StreamReader.Read(...) pass, I fire off an event which will be used in the main app to display progress information. The processing that goes on in the loop is varied, but is not too time-consuming (it could just be a simple file copy, for example, or may involve encryption...).
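
The loop is roughly of this shape (simplified here to a plain copy; ProgressChanged is a stand-in for the real event, and bufferSize is the value in question):

    using System;
    using System.IO;

    public class FileProcessor
    {
        // Hypothetical progress event: (bytesDone, bytesTotal).
        public event Action<long, long> ProgressChanged;

        public void Copy(string sourcePath, string destPath, int bufferSize)
        {
            byte[] buffer = new byte[bufferSize];
            using (FileStream source = new FileStream(sourcePath, FileMode.Open, FileAccess.Read))
            using (FileStream dest = new FileStream(destPath, FileMode.Create, FileAccess.Write))
            {
                long total = source.Length;
                long done = 0;
                int read;
                while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
                {
                    dest.Write(buffer, 0, read);      // or encrypt/transform the chunk here
                    done += read;
                    if (ProgressChanged != null)
                        ProgressChanged(done, total); // fired once per Read pass
                }
            }
        }
    }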

My main question is: What is the best memory buffer size to use? Thinking about physical disk layouts, I could pick 2 KB, which would cover a CD sector size and is a nice multiple of a 512-byte hard disk sector. Higher up the abstraction tree, you could go for a larger buffer which could read an entire FAT cluster at a time. I realise that with today's PCs I could go for a more memory-hungry option (a couple of MiB, for example), but then I increase the time between UI updates and the user perceives a less responsive app.

As an aside, I'm eventually hoping to provide a similar interface to files hosted on FTP/HTTP servers (over a local network / fast-ish DSL). What would be the best memory buffer size for those (again, a "best-case" trade-off between perceived responsiveness and performance)?
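
For the network case I picture the same shape of loop over a response stream; a rough sketch (the URL is hypothetical, and 8 KB is just an arbitrary starting point to tune):

    using System;
    using System.IO;
    using System.Net;

    class HttpChunkReader
    {
        static void Main()
        {
            WebRequest request = WebRequest.Create("http://example.com/somefile.bin"); // placeholder URL
            using (WebResponse response = request.GetResponse())
            using (Stream stream = response.GetResponseStream())
            {
                byte[] buffer = new byte[8192];
                int read;
                long total = 0;
                while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
                {
                    total += read;   // process/write the chunk, raise a progress event, etc.
                }
                Console.WriteLine("Read {0} bytes", total);
            }
        }
    }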

+1  A: 

When I deal with files directly through a stream object, I typically use 4096 bytes. It seems to be reasonably effective across multiple IO areas (local fs, LAN/SMB, network stream, etc.), but I haven't profiled it or anything. Way back when, I saw several examples use that size and it stuck in my memory. That doesn't mean it's the best, though.

Nate Bross
Right. I wouldn't ever use anything less than 4k, since it's the smallest block managed by the virtual memory system (on which the disk cache is based).
Ben Voigt
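
As an illustration of that suggestion (not from the answer itself), a minimal sketch that passes 4096 both as FileStream's internal buffer size and as the read chunk; the path is a placeholder:

    using System.IO;

    class FourKReader
    {
        const int BufferSize = 4096;   // the 4 KB suggested above

        static void Read(string path)
        {
            // The bufferSize argument sets FileStream's internal buffer;
            // reads from the stream are then served out of that buffer.
            using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read,
                                                  FileShare.Read, BufferSize))
            {
                byte[] chunk = new byte[BufferSize];
                int read;
                while ((read = fs.Read(chunk, 0, chunk.Length)) > 0)
                {
                    // process the 4 KB chunk...
                }
            }
        }
    }
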
+1  A: 

"It depends".

You would have to test your application with different buffer sizes to determine which is best. You can't guess ahead of time.

John Saunders
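
As a rough way to run that kind of test, something like the following compares a few candidate sizes (the path and the sizes are arbitrary; note that the OS file cache will make warm re-reads look much faster than a first, cold read):

    using System;
    using System.Diagnostics;
    using System.IO;

    class BufferSizeTest
    {
        static void Main()
        {
            string path = @"C:\temp\testfile.bin";   // hypothetical test file
            foreach (int size in new[] { 1024, 4096, 16384, 65536, 1048576 })
            {
                byte[] buffer = new byte[size];
                Stopwatch sw = Stopwatch.StartNew();
                using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read,
                                                      FileShare.Read, size))
                {
                    while (fs.Read(buffer, 0, buffer.Length) > 0) { }
                }
                sw.Stop();
                Console.WriteLine("{0,8} bytes: {1} ms", size, sw.ElapsedMilliseconds);
            }
        }
    }
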
+1  A: 

Files are already buffered by the file system cache. You just need to pick a buffer size that doesn't force FileStream to make the native Windows ReadFile() API call to fill the buffer too often. Don't go below a kilobyte; more than 16 KB is a waste of memory.

4 KB is a traditional choice, even though it will only span a virtual memory page exactly by accident. It is difficult to profile; you'll end up measuring how long it takes to read a cached file, which won't happen very often in a production environment. File I/O is completely dominated by the disk drive or the NIC; copying the data is peanuts. 4 KB will work fine.

Hans Passant