I have a system that keeps running out of disk space because the garbage collector doesn't free the objects that hold the file handles fast enough. (Files are created and deleted all the time, but since the process still has an open file handle, the OS keeps the file on disk.)

The objects are shared, so a simple try { ... } finally { close(); } will not do.

It seems to me that my best option is to implement reference counting on the objects, and close the file handles when the reference count goes to 0. However, I'm reluctant to implement it all by myself, as I suspect there are subtle issues with regard to concurrency.

Sadly, googling for "reference counting in java" doesn't bring up any useful results. So my question is: are there any resources (articles, sample code, libraries) that can help implement reference counting?

+1  A: 

You can do reference counting by wrapping your object inside a WeakReference and then using the ReferenceQueue. However, it seems like you just want to discover when your handle is no longer referenced, so you don't really need to count at all.

But the same method (i.e. a WeakReference) will do; just close the handle once the referent has been cleared, that is, when get() returns null or the reference turns up on the ReferenceQueue. However, you may need a bespoke subclass of WeakReference with an extra reference to the file handle, so that you can close it (otherwise you won't have access to the file handle, because the referent is already gone). For example:

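import java.io.File;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

public class WeakFileReference extends WeakReference<File> {

  // A separate copy: holding the referent itself here would keep it strongly reachable forever.
  private final File handle;

  public WeakFileReference(File handle, ReferenceQueue<? super File> q) {
    super(handle, q);
    // java.io.File has no public clone(), so build an equivalent instance from the path.
    this.handle = new File(handle.getPath());
  }
}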

I've not checked this completely and can't be sure how you are using the File object in your program: I assume that you are sharing a File instance around.
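To make the ReferenceQueue part concrete, here is a minimal sketch of how the queue could be drained. The names HandleReaper, TrackedRef, track and drain are made up for illustration, and it assumes the thing to release is a Closeable that does not itself hold a strong reference to the shared owner object. It still depends on the GC noticing that the owner is unreachable, so it is no faster than the collector.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of any library.
final class HandleReaper {

    /** Weak reference to the shared owner, plus a strong reference to the resource to close. */
    static final class TrackedRef extends WeakReference<Object> {
        final Closeable resource;

        TrackedRef(Object owner, Closeable resource, ReferenceQueue<Object> queue) {
            super(owner, queue);
            this.resource = resource; // must not reference the owner, or it is never collected
        }
    }

    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // The reference objects themselves must stay reachable, otherwise they are never enqueued.
    private final Set<TrackedRef> live = ConcurrentHashMap.newKeySet();

    TrackedRef track(Object owner, Closeable resource) {
        TrackedRef ref = new TrackedRef(owner, resource, queue);
        live.add(ref);
        return ref;
    }

    /** Closes the resource of every owner enqueued so far; returns how many were closed. */
    int drain() {
        int closed = 0;
        Reference<?> r;
        while ((r = queue.poll()) != null) {
            TrackedRef ref = (TrackedRef) r;
            live.remove(ref);
            try {
                ref.resource.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            closed++;
        }
        return closed;
    }
}
```

drain() would typically be called from a background thread or before each new file is created; which of the two fits depends on the application.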

oxbow_lakes
Doesn't that still rely on the Garbage Collector? Isn't doing this equivalent to implementing a finalizer? (Tried, doesn't get called fast enough).
itsadok
Yes, it depends on the GC but is *slightly* different from `finalize`. In my experience the latter is dangerous for large numbers of objects, or with high throughput, because finalization often cannot keep up with the rate of object creation, which leads to OutOfMemoryError problems.
oxbow_lakes
A: 

I still think that the best way should be to close all file handles as soon as you are done with them. Why exactly does this not work? I did not understand the part about objects being shared. The file handle only needs to be open while you have an open stream to the file. It does not need to be open for the whole lifetime of the file.

Thilo
The objects are Lucene IndexSearchers, so I don't control when the file handles are opened and closed, and I wouldn't want to keep creating new IndexSearchers.
itsadok
It sounds like you are searching against a Lucene index that is constantly being updated. That is probably not the best way to use Lucene. Maybe open another question and describe your scenario. Hopefully, someone with Lucene know-how will have an answer (that would work better than trying to keep track of multiple IndexSearchers and their lifetimes).
Thilo
+7  A: 

Do not depend on the garbage collector. It is intentionally designed not to be dependable.

If "shared" means that you use it in several places in your code so you cannot just close it, I would suggest you change your code to have a central pool of files, where you can "check out" a file handle to be used locally in your code. The close() procedure then returns the file handle to the pool. Keep track of your handles; when all handles for a given file have been returned to the pool, you close the file for good.
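A minimal sketch of the check-out idea for a single shared resource, written as a thread-safe reference count. The class and method names (RefCounted, acquire, release, references) are invented for this example, and it assumes the shared object is a Closeable and that every acquire() is paired with exactly one release(), usually in a finally block.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical wrapper, not taken from any library.
final class RefCounted<T extends Closeable> {
    private final T resource;
    // Starts at 1: the creator holds the first reference and must release it too.
    private final AtomicInteger count = new AtomicInteger(1);

    RefCounted(T resource) {
        this.resource = resource;
    }

    /** Checks the resource out; fails if the last reference was already released. */
    T acquire() {
        while (true) {
            int n = count.get();
            if (n == 0) {
                throw new IllegalStateException("resource already closed");
            }
            if (count.compareAndSet(n, n + 1)) {
                return resource;
            }
        }
    }

    /** Returns one reference; the release that brings the count to 0 closes the resource. */
    void release() {
        while (true) {
            int n = count.get();
            if (n == 0) {
                throw new IllegalStateException("released more often than acquired");
            }
            if (count.compareAndSet(n, n - 1)) {
                if (n == 1) {
                    try {
                        resource.close();
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                }
                return;
            }
        }
    }

    /** Current number of outstanding references. */
    int references() {
        return count.get();
    }
}
```

The compare-and-set loops are there so that a thread cannot acquire a resource that another thread is closing at the same moment: once the count reaches 0 it never goes back up.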

Thorbjørn Ravn Andersen