Should log classes open/close a log file stream on each write to the log file or should it keep the log file stream open throughout the application's lifetime until all logging is complete?

I'm asking in context of a desktop application. I have seen people do it both ways and was wondering which approach yields the best all-around results for a logger.

+7  A: 

If you have frequent read/writes it is more efficient to keep the file open for the lifetime with a single open/close.

You might want to flush periodically, or after each write, so that a crash doesn't leave the last few messages unwritten. Use fflush() on Unix-based systems and FlushFileBuffers() on Windows.
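
A minimal sketch of that keep-open-and-flush pattern in C (the file name and message are hypothetical):

#include <stdio.h>

/* Hypothetical: open once at startup, flush after every message. */
FILE *logfp = fopen("app.log", "a");
if (logfp != NULL) {
    fprintf(logfp, "starting up\n");
    fflush(logfp);  /* push the stdio buffer to the OS so a crash loses less */
}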

If you are running on Windows, you can also use the CreateFile API with FILE_FLAG_NO_BUFFERING to go directly to the file on each write.

It is also better to keep the file open for the lifetime of the application, because each open/close is an opportunity for failure if the file is in use. For example, a backup application might open and close your file while backing it up, and that can leave your program unable to access its own file. Ideally, keep the file open at all times and specify sharing flags on Windows (FILE_SHARE_READ). On Unix-based systems, sharing is the default.
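
A sketch of what that looks like with the Win32 API (the file name and handle name are illustrative, not from the answer):

#include <windows.h>

/* Illustrative: append-only log handle that other processes may read. */
HANDLE hLog = CreateFile(TEXT("app.log"),
                         FILE_APPEND_DATA,       /* append-only writes */
                         FILE_SHARE_READ,        /* allow concurrent readers */
                         NULL,
                         OPEN_ALWAYS,            /* create the file if missing */
                         FILE_ATTRIBUTE_NORMAL,
                         NULL);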

Brian R. Bondy
A: 

Open and close. Can save you from a corrupt file in case of a system crash.

Lette
So can flushing each time.
Lev
Of course, but why bother? Performance? If you don't measure, it's just premature...
Lette
+1  A: 

I don't see any reason to close it.

On the other hand, closing and reopening takes a little extra time.

Lev
+1  A: 

I can think of a couple reasons you don't want to hold the file open:

  • If a log file is shared between several different apps, users, or app instances, you could have locking issues.
  • If you're not clearing the stream buffer correctly, you could lose the last few entries when the app crashes -- which is exactly when you need them most.

On the other hand, opening files can be slow, even in append mode. In the end, it comes down to what your app is doing.

Joel Coehoorn
A: 

The advantage of closing the file every time is that the OS will guarantee that the new message is written to disk. If you leave the file open and your program crashes, it is possible that not everything will have been written. You can also accomplish the same thing by calling fflush(), or whatever the equivalent is in the language you are using.

bobwienholt
+3  A: 

I would tend to leave them open -- but open them with the file share permissions set to allow other readers and make sure you flush log output with every message.

I hate programs which don't even let you look at the logfile while they are running, or where the log file isn't flushed and lags behind what is happening.

Rob Walker
+1  A: 

It's generally better to keep them open.

If you're concerned about being able to read them from another process, you need to make sure that the share mode you use to open/create them allows others to read them (but not write to them, obviously).

If you're worried about losing data in the event of a crash, you should periodically flush/commit their buffers.

Ferruccio
A: 

As a user of your application I'd prefer it to not hold files open unless it's a real requirement of the app. Just one more thing that can go wrong in the event of a system crash, etc.

marc
Of course a derivative requirement is often performance.
Chris Noe
+2  A: 

It's a tradeoff. Opening and closing the file each time makes it more likely that the file will be up to date on disk if the program crashes. On the other hand, there's some overhead involved in opening the file, seeking to the end, and appending data to it.

On Windows, you won't be able to move/rename/delete the file while it's open, so open/write/close might be helpful for a long-running process where you might occasionally want to archive the old log contents without interrupting the writer.

In most of the cases where I've done this sort of logging, I've kept the file open, and used fflush() to make it more likely the file was up-to-date if the program crashed.

Mark Bessey
+4  A: 

For performance, keep open. For safety, flush often.

Flushing often means the run-time library won't sit on buffered writes until it has accumulated lots of data -- you may crash before that's written!
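
One way to get that behavior with C stdio is to turn buffering off entirely, trading a little throughput for safety (a sketch; the file name is hypothetical):

#include <stdio.h>

FILE *logfp = fopen("app.log", "a");
setvbuf(logfp, NULL, _IONBF, 0);  /* unbuffered: every write goes straight to the OS */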

Ray Hayes
A: 

I would open and close on each write (or batch of writes). If doing this causes a performance problem in a desktop application, it's possible you're writing to the log file too often (although I'm sure there can be legitimate reasons for lots of writes).
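
A sketch of that open/write/close-per-message pattern, assuming POSIX calls and a hypothetical log path:

#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Each message pays for an open() and close(), but the file is free
   (and fully written out) between calls. */
void log_line(const char *msg)
{
    int fd = open("app.log", O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd >= 0) {
        write(fd, msg, strlen(msg));
        close(fd);
    }
}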

MusiGenesis
A: 

For large, intensive applications, what I usually do is keep the log file open for the duration of the application and have a separate thread that periodically flushes log content from memory to disk. File open and close operations require system calls, which is a lot of work if you look at the lower levels.
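
A compact sketch of that design with POSIX threads (every name here is hypothetical):

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static FILE *g_log;  /* opened once at startup, e.g. g_log = fopen("app.log", "a") */
static pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;

/* Writers append to the stdio buffer in memory -- cheap, usually no
   system call per message. */
void log_message(const char *msg)
{
    pthread_mutex_lock(&g_lock);
    fprintf(g_log, "%s\n", msg);
    pthread_mutex_unlock(&g_lock);
}

/* Background thread (started with pthread_create): push buffered data
   to the OS once a second. */
void *flusher(void *arg)
{
    (void)arg;
    for (;;) {
        sleep(1);
        pthread_mutex_lock(&g_lock);
        fflush(g_log);
        pthread_mutex_unlock(&g_lock);
    }
    return NULL;
}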

Alvin
+4  A: 

In general, as everyone else said, keep the file open for performance (open is a relatively slow operation). However, you need to think about what's going to happen if you keep the file open and people either remove the log file or truncate it. And that depends on the flags used at open time. (I'm addressing Unix - similar considerations probably apply to Windows, but I'll accept correction by those more knowledgeable than me).

If someone sees the log file grow to, say, 1 MB and then removes it, the application will be none the wiser, and Unix will keep the log data safe until the application closes the file. What's more, users will be confused: they probably created a new log file with the same name as the old one and are puzzled about why the application 'stopped logging'. Of course, it didn't; it is just logging to the old file that no one else can get at.

If someone notices the log file growing to, say, 1 MB and truncates it, the application will also be none the wiser. Depending on how the log file was opened, though, you might get weird results. If the file was not opened with O_APPEND (POSIX-speak), the program will continue to write at its current offset, and the first 1 MB of the file will appear as a stream of zero bytes -- which is apt to confuse programs looking at the file.

How to avoid these problems?

  • Open the log file with O_APPEND.
  • Periodically use fstat() on the file descriptor and check whether st_nlink is zero.

If the link count goes to zero, somebody removed your log file; time to close it and open a new one. Compared with stat() or open(), fstat() should be quick: it basically copies information out of data that is already in memory, with no name lookup needed. So you could reasonably make that check before every write.
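
A sketch of that check, assuming a small helper that reopens the log by path when the old file has been unlinked (names are hypothetical):

#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Call before each write: if the open file has no remaining links,
   someone removed it -- close the stale descriptor and reopen by name. */
int refresh_log_fd(int fd, const char *path)
{
    struct stat sb;
    if (fstat(fd, &sb) == 0 && sb.st_nlink == 0) {
        close(fd);
        fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    }
    return fd;
}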

Suggestions:

  • Make sure there is a mechanism to tell the program to switch logs.
  • Make sure you log the complete date and time in the messages.

I suffer from an application that logs the time but not the date. Earlier today, I had a message file with some entries from the 17th of August (one message accidentally included the date after the time) and some entries from today, but I can only tell that because I created them. If I looked at the log file in a week's time, I could not tell which day the entries were written (though I would know the time). That sort of thing is annoying.

You might also look at what systems such as Apache do - they have mechanisms for handling log files and there are tools for dealing with log rotation. Note: if the application does keep a single file open, does not use append mode, and does not plan for log rotation or size limits, then there's not much you can do about log files growing or having hunks of zeroes at the start -- other than restarting the application periodically.

Incidentally, the disk blocks that 'contain' the zeroes are not actually allocated on disk. You can really screw up people's backup strategies by creating files which are a few GB each, but all except the very last disk block contain just zeroes. Basically (error checking and file name generation omitted for conciseness):

#include <fcntl.h>
#include <unistd.h>

int fd = open("/some/file", O_WRONLY | O_CREAT, 0444);
lseek(fd, 1024L * 1024L * 1024L, SEEK_SET); /* jump 1 GB past the start */
write(fd, "hi", 2);  /* only this final block is actually allocated */
close(fd);

This occupies a single disk block -- but 1 GB (and change) on an (uncompressed) backup, and 1 GB (and change) when restored. Anti-social, but possible.

Jonathan Leffler