I recently inherited some code that someone else had written.

I discovered that everywhere the code opened a directory for reading, it never closed it: the original developer had been calling the close function on the directory handle instead of closedir.

The code was something like this:

opendir( DIR, $dir ) or die "Cannot open $dir: $!\n";
@files = readdir( DIR );
close( DIR );    # bug: close is for filehandles; this should be closedir( DIR )

(This is another good point made in Perl Best Practices (pages 208 and 278): always check the return value of close. If the return had been checked in this case, it would have been failing with "Bad file number".)

I've since changed this to closedir, but it made me start wondering: since the directory handle was never closed, what are the negative implications of keeping a directory handle open for a long duration?
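
For reference, the fixed code, with the return value of closedir checked as Perl Best Practices recommends, now looks like this:

opendir( DIR, $dir ) or die "Cannot open $dir: $!\n";
@files = readdir( DIR );
closedir( DIR ) or die "Cannot close $dir: $!\n";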

This program is fairly large (3,500 lines of code), runs for a while (5-10 minutes), and multiple instances of it run at the same time. In the case of the directory in the example above, $dir has the same value for all instances, so if 10 instances were running at once, they would all hold an open directory handle against the same directory for 5 minutes or longer. I'm sure Perl closes the directory handle automatically when the program finishes, but best practice says to close it as soon as possible.

It is more obvious to me how leaving file handles open can cause problems (especially file handles that are open for writing), but what bad things can happen by not closing a directory handle?

The reason I am asking is that there has been an odd circumstance where this program tried to create a file (in the directory defined by $dir above). The filename had the PID embedded in it, so there was only a small chance that the file would already be there, but Perl was unable to open the file for writing because it said the file already existed. When we looked in the directory, that file did not exist. I am wondering if all of the open directory handles on this directory could cause such a problem.

I'm not sure if the OS makes a difference, but this program is running on AIX.

Thanks in advance, and happy Friday!

+9  A: 

You wasted a directory descriptor, which probably counts as a file descriptor. It would ultimately hurt you if your program opened enough directories to run out of file descriptors. Otherwise, it is pretty harmless, though less than ideal: it makes the system (and Perl) keep resources around that it might otherwise be able to release.
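
If you want to see the descriptor cost directly, a loop along these lines (just a sketch; the limit and the exact error message vary by system) should eventually make opendir fail:

my @leaked;
my $n = 0;
while ( opendir(my $dh, ".") ) {
    push @leaked, $dh;   # keep each handle alive so Perl can't auto-close it
    $n++;
}
print "opendir failed after $n handles: $!\n";   # e.g. "Too many open files"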

If the directory handle had been a lexical variable rather than a plain DIR-style name, Perl might have cleaned up behind you. See opendir, which says:

Opens a directory named EXPR for processing by readdir, telldir, seekdir, rewinddir, and closedir. Returns true if successful. DIRHANDLE may be an expression whose value can be used as an indirect dirhandle, usually the real dirhandle name. If DIRHANDLE is an undefined scalar variable (or array or hash element), the variable is assigned a reference to a new anonymous dirhandle. DIRHANDLEs have their own namespace separate from FILEHANDLEs.
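
In other words, if you pass an undefined scalar instead of a bareword, you get a handle that can clean up after itself; a minimal sketch:

opendir(my $dh, $dir) or die "Cannot open $dir: $!\n";
my @files = readdir($dh);
closedir($dh) or die "Cannot close $dir: $!\n";   # or just let $dh go out of scope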

Jonathan Leffler
Even today some OSes are pretty stingy with the file descriptors (I'm looking at *you*, Solaris, and your default setting of 64-256).
mobrule
@mobrule - I'm sorry, I cannot post your comment, due to there being no available file descriptors - Sincerely, StackOverflow Solaris Backend....
DVK
+7  A: 

There won't be any drastic consequences. There will be a very slight increase in memory usage: the kernel can't free the iterator it uses internally for looping through the list of directory entries, and there is probably a small cost on the Perl side as well.

Additionally, as long as any descriptor to a directory is still open, its data can't actually be deleted from the filesystem. If some other process deleted the directory you hold the handle for, it would stop appearing in future directory listings, but the data would still be kept on disk and would still be accessible by the process with the open handle. That might show up as odd numbers in disk usage, for example.
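
You can watch that happen with something like the following sketch (assuming a POSIX-ish system; File::Temp's tempdir just supplies an empty scratch directory, and the rmdir stands in for the external delete):

 use File::Temp qw(tempdir);

 my $dir = tempdir();
 opendir(my $dh, $dir) or die "Cannot open $dir: $!\n";

 rmdir $dir or die "Cannot rmdir $dir: $!\n";    # the "external" delete

 print -e $dir ? "still listed\n" : "gone from listings\n";   # prints "gone..."
 rewinddir($dh);    # but the open handle itself remains valid until closed
 closedir($dh) or warn "closedir failed: $!\n";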

Also note that you don't necessarily have to close all your handles manually. When using lexical filehandles, closing happens automatically as soon as the last reference to the handle goes away:

 { # new scope
     opendir(my $handle, ...) or ...;
     ...
 } # implicit closedir happens here
rafl
+5  A: 

This is a lesson to always use lexical file (and directory) handles: lexical handles are closed automatically when they go out of scope.

So you'd only be wasting descriptors (as Jonathan describes) if (1) you used an old-style glob handle, or (2) all the code sat in a flat script with no subroutines or other scoping. Use good programming practices and inadvertent errors will be fewer :)

Ether
This opendir was in main, so it was globally scoped... You are absolutely right though - good programming practices will definitely reduce inadvertent errors!
BrianH
According to your code, you used a bareword handle anyway. Those are always globally scoped unless you explicitly localize them.
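
For example, a sketch of what explicit localization of a bareword handle looks like:

 sub list_dir {
     my ($dir) = @_;
     local *DIR;    # give the global DIR glob a fresh value for this scope only
     opendir(DIR, $dir) or die "Cannot open $dir: $!\n";
     my @files = readdir(DIR);
     closedir(DIR) or die "Cannot close $dir: $!\n";
     return @files;   # the original DIR glob is restored when the sub exits
 }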
rafl