In holy wars about whether garbage collection is a good thing, people often point out that it doesn't handle things like freeing file handles. Putting this logic in a finalizer is considered a bad thing because the resource then gets freed non-deterministically. However, it seems like an easy solution would be for the OS to just make sure lots and lots of file handles are available so that they are a cheap and plentiful resource and you can afford to waste a few at any given time. Why is this not done in practice?

+1  A: 

It's not just the number of file handles: when a file is opened in certain modes, the open handle can prevent other callers from accessing the same file.
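
For instance, here is a minimal Windows-specific sketch of that behaviour (the file name is made up): the first handle is opened with no share mode, so every other attempt to open the same file fails until CloseHandle runs -- which, if closing were left to a finalizer, would be whenever the GC gets around to it.

#include <windows.h>
#include <stdio.h>

int main(void)
{
    /* Open report.log (hypothetical name) for writing with no sharing allowed. */
    HANDLE h1 = CreateFileA("report.log", GENERIC_WRITE, 0 /* dwShareMode = 0 */,
                            NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h1 == INVALID_HANDLE_VALUE)
        return 1;

    /* While h1 is open, any other open of the same file is refused. */
    HANDLE h2 = CreateFileA("report.log", GENERIC_READ, FILE_SHARE_READ,
                            NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h2 == INVALID_HANDLE_VALUE)
        printf("second open failed, error %lu (sharing violation)\n", GetLastError());
    else
        CloseHandle(h2);

    CloseHandle(h1);   /* only now can other callers get at the file */
    return 0;
}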

M1EK
+2  A: 

Closing a file also flushes the writes to disk -- well, from the point of view of your application, anyway. After the file is closed, the application can crash and the changes will not be lost, as long as the system itself doesn't crash. So it's not a good idea to let the GC close files at its leisure, even if it may be technically possible nowadays.
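
A minimal C sketch of that point (the file name is made up): the write sits in a userspace buffer until fclose flushes it, so closing deterministically is what makes the data crash-safe from the application's side.

#include <stdio.h>

int main(void)
{
    FILE *f = fopen("data.txt", "w");   /* hypothetical file name */
    if (f == NULL)
        return 1;

    fputs("important record\n", f);     /* sits in stdio's userspace buffer for now */

    /* If the process crashed here, the buffered line might never reach the OS. */

    if (fclose(f) != 0)                 /* flushes the buffer and releases the handle */
        return 1;

    /* From here on, an application crash no longer loses the record
       (as long as the system itself stays up). */
    return 0;
}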

Also, to tell the truth, old habits die hard. File handles used to be expensive and are still probably considered as such for historical reasons.

Pascal Cuoq
+2  A: 

I'm sure more comprehensive answers will ensue, but based on my limited experience and understanding of how Windows operates under the hood: file handles (the structures used to represent them to the OS) are kernel objects, and as such they require a certain type of memory to be available -- not to mention processing on the kernel's part to maintain consistency and coherence when multiple processes require access to the same resources (i.e. files).

Miky Dinescu
If you mean kernel-space memory, a 64-bit kernel has as much of that as it could possibly need, now and for the foreseeable future.
Pascal Cuoq
+1  A: 

I don't think they're necessarily expensive - if your application only holds a few unnecessary ones open, it won't kill the system. Just as when you leak only a few strings in C++, no one will notice unless they're looking pretty carefully. Where it becomes a problem is:

  • if you leak hundreds or thousands (see the sketch after this list)
  • if having the file open prevents other operations from occurring on that file (other applications might not be able to open or delete the file)
  • it's a sign of sloppiness - if your program can't keep track of what it owns and is using or has stopped using, what other problems will it have? Sometimes a small leak turns into a big leak when something small changes or a user does something a little differently than before.
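
A rough POSIX sketch of the first point (the file path and the deliberate leak are just for illustration): once the per-process descriptor limit is exhausted, open() starts failing, typically with EMFILE.

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    long count = 0;

    /* Keep opening the same file without ever closing it: a deliberate leak. */
    for (;;) {
        int fd = open("/etc/hosts", O_RDONLY);   /* any readable file will do */
        if (fd < 0) {
            /* Usually EMFILE once the per-process limit (often 1024 by default)
               is used up; every later open in the program fails the same way. */
            printf("open() failed after %ld leaked descriptors: %s\n",
                   count, strerror(errno));
            return 1;
        }
        count++;
    }
}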
Michael Burr
Unless, of course, your buffers aren't written out because your leaked file handle was never properly closed. In that -- very common -- case, a single leaked handle can be a debugging nightmare.
S.Lott
+5  A: 

In practice it cannot be done, because the OS would have to take on a lot more memory overhead to keep track of which handles are in use by which processes. The example C code below sketches a simple OS process record of the kind that might be stored in a circular queue:

struct ProcessRecord {
  int ProcessId;
  CPURegs cpuRegs;          /* saved CPU register state (type defined elsewhere) */
  TaskPointer **children;   /* child tasks (type defined elsewhere) */
  int *baseMemAddress;
  int sizeOfStack;
  int sizeOfHeap;
  int *baseHeapAddress;
  int granularity;
  int time;
  enum State { Running, Runnable, Zombie /* ... */ } state;
  /* ...few more fields here... */
  long *fileHandles;        /* per-process table of open file handles */
  long fileHandlesCount;    /* number of entries currently in use */
} proc;

Imagine that fileHandles is a pointer to an array of integers, where each integer contains the location (perhaps in an encoded format) of the offset into the OS's table of where the files are stored on disk.

Now imagine how much memory that would eat up, and how it could slow down the whole kernel or even bring about instability: the 'multi-tasking' concept of the system would fall over under the burden of keeping track of how many file handles are in use and of providing a mechanism to dynamically grow and shrink that array of integers, which in turn could slow down user programs if the OS were dishing out file handles on demand.
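
As a rough illustration only (no real kernel is implemented this way, and grow_handle_table is a made-up name for this sketch), growing such a per-process handle table on demand might look like the following -- every growth step is extra kernel work and extra memory charged to every process:

#include <stdlib.h>

/* Double the capacity of a per-process handle table.
   Returns the (possibly moved) table, or NULL if memory runs out,
   in which case the process's open() request would have to fail. */
static long *grow_handle_table(long *handles, long *capacity)
{
    long new_capacity = (*capacity == 0) ? 16 : *capacity * 2;
    long *resized = realloc(handles, (size_t)new_capacity * sizeof *handles);
    if (resized == NULL)
        return NULL;
    *capacity = new_capacity;
    return resized;
}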

I hope this helps you to understand why it is neither implemented nor practical.

Hope this makes sense, Best regards, Tom.

tommieb75
Can you please leave a comment on why this was downvoted? Thanks. :|
tommieb75
No clue why this was downvoted... +1
RCIX
@RCIX: Thanks - it's incredible how quickly after posting I got downvoted, without even a comment...
tommieb75
+1 great explanation
0A0D
That's a great explanation for programs running on the toy system that you just invented, but *real* OSs don't have a static array of FDs per process.
hobbs
@hobbs: Really? Many OS's actually do have separate pools of pre-allocated memory for this kind of thing to eliminate the overhead of dynamic allocation.
S.Lott
@hobbs: His array does not look static to me. long* and a long count looks dynamic.
Zan Lynx
A: 

In the Linux paradigm, sockets are file descriptors. There are definite advantages to freeing up TCP ports as soon as possible.
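
A minimal POSIX sketch of that point (the port number is arbitrary): a socket is just another descriptor, and calling close() promptly is what frees the port for the next bind(), rather than waiting for a finalizer to run.

#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    /* A socket is just another file descriptor. */
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return 1;

    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(8080);             /* hypothetical port */
    addr.sin_addr.s_addr = htonl(INADDR_ANY);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return 1;
    }

    /* Closing the descriptor releases the port immediately; leaving it to a
       finalizer keeps the port tied up until the GC happens to run. */
    close(fd);
    return 0;
}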

caspin