I have recently been involved in handling the console logs for a server, and I was wondering, out of curiosity, whether there is a performance issue in writing to a large file as compared to small ones.

For instance, is it a good idea to keep log files small instead of letting them grow bulky? I was not able to argue much in favor of either approach.

There might be problems in reading or searching the file, but right now I am more interested in knowing whether writing can be affected in any way. Looking for expert advice.

Edit: The way I thought about it was that the OS only has to open a file handle and push the data to the file system. There is little correlation with the file size, since you keep appending data to the end of the file, and whenever a block of data is full the OS assigns another block to the file. As I said earlier, there can be problems in reading and searching because of fragmentation of the file's blocks, but I could not find much difference while writing.
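One quick way to test that intuition is to time appends to an empty file versus a file that has already been grown large. This is only a rough sketch (the file names, sizes, and iteration counts are arbitrary, and the results will vary with the file system, disk, and page cache):

    import time

    LINE = b"2021-01-01 00:00:00 INFO a typical log message\n"

    def time_appends(path, count=100_000):
        # Append `count` log lines to `path` and return the elapsed seconds.
        start = time.perf_counter()
        with open(path, "ab") as f:
            for _ in range(count):
                f.write(LINE)
        return time.perf_counter() - start

    # Start one file empty and pre-grow the other to roughly 1 GB of real data.
    open("small.log", "wb").close()
    with open("big.log", "wb") as f:
        chunk = LINE * 20000          # about 1 MB of log lines
        for _ in range(1024):
            f.write(chunk)

    print("append to small file:", time_appends("small.log"))
    print("append to large file:", time_appends("big.log"))

If the append path really is independent of file size, the two timings should come out roughly the same.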

A: 

I am not an expert, but I will try to answer anyway.

Larger files may take longer to write to disk, and in fact it is not a programming issue; it is a file system issue. Perhaps there are file systems which do not have such issues, but on Windows large files cannot be written down in one piece, so fragmenting them will take time (for the simple reason that the head will have to move to some other cylinder). Assuming that we are talking about "classic" hard drives...

If you want advice, I would go for writing smaller files and rotating them either daily or when they hit some size (or both, actually). That is a rather common approach I have seen in enterprise-grade products.
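As a minimal sketch of that rotation approach, assuming the application logs from Python, the standard library's logging.handlers module already implements size-based rotation (the file name and limits below are arbitrary examples):

    import logging
    from logging.handlers import RotatingFileHandler

    # Rotate when server.log reaches ~10 MB, keeping five old files
    # (server.log.1 ... server.log.5).
    handler = RotatingFileHandler("server.log",
                                  maxBytes=10 * 1024 * 1024,
                                  backupCount=5)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))

    logger = logging.getLogger("server")
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)

    logger.info("request handled in %d ms", 42)

For daily rotation, logging.handlers.TimedRotatingFileHandler("server.log", when="midnight", backupCount=7) works the same way; combining size and time limits usually needs a custom handler or an external tool such as logrotate.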

Paweł Dyda
The way I thought about it was that the OS only has to open a file handle and push the data to the file system. There is little correlation with the file size, since you keep appending data to the end of the file, and whenever a block of data is full the OS assigns another block to the file.
Ashish
+1  A: 

As a general rule, there should be no practical difference between appending a block to a small file (or writing the first block, which amounts to appending to a zero-length file) and appending a block to a large file.

There are special cases (like having to fault in a triple-indirect block, or the initial open having to read all the mapping information) which could add additional I/Os, but the steady state should be the same.

I'd be more worried about the manageability of huge files: slow to back up, slow to copy, slow to view, etc.

MJZ