Consider an object declared in a method:

public void foo() {
    final Object obj = new Object();

    // A long-running job that consumes lots of memory and
    // triggers garbage collection
}

Will obj be subject to garbage collection before foo() returns?

UPDATE: Previously I thought obj would not be subject to garbage collection until foo() returns.

However, today I found that I was wrong.

I spent several hours fixing a bug and finally found that the problem was caused by obj being garbage collected!

Can anyone explain why this happens? And if I want obj to be pinned, how can I achieve it?

Here is the code that has the problem.

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;

public class Program
{
    public static void main(String[] args) throws Exception {
        String connectionString = "jdbc:mysql://<whatever>";

        // I found that wrap gets garbage collected somewhere below
        SqlConnection wrap = new SqlConnection(connectionString);

        Connection con = wrap.currentConnection();
        Statement stmt = con.createStatement(ResultSet.TYPE_FORWARD_ONLY, 
             ResultSet.CONCUR_READ_ONLY);
        stmt.setFetchSize(Integer.MIN_VALUE);

        ResultSet rs = stmt.executeQuery("select instance_id, doc_id from "
               + "crawler_archive.documents");

        while (rs.next()) {
            int instanceID = rs.getInt(1);
            int docID = rs.getInt(2);

            if (docID % 1000 == 0) {
                System.out.println(docID);
            }
        }

        rs.close();
        //wrap.close();
    }
}

When I run the program, it prints the following before it crashes:

161000
161000
********************************
Finalizer CALLED!!
********************************
********************************
Close CALLED!!
********************************
162000
Exception in thread "main" com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: 

And here is the code of class SqlConnection:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

class SqlConnection
{
    private final String connectionString;
    private Connection connection;

    public SqlConnection(String connectionString) {
        this.connectionString = connectionString;
    }

    public synchronized Connection currentConnection() throws SQLException {
        if (this.connection == null || this.connection.isClosed()) {
            this.closeConnection();
            this.connection = DriverManager.getConnection(connectionString);
        }
        return this.connection;
    }

    protected void finalize() throws Throwable {
        try {
            System.out.println("********************************");
            System.out.println("Finalizer CALLED!!");
            System.out.println("********************************");
            this.close();
        } finally {
            super.finalize();
        }
    }

    public void close() {
        System.out.println("********************************");
        System.out.println("Close CALLED!!");
        System.out.println("********************************");
        this.closeConnection();
    }

    protected void closeConnection() {
        if (this.connection != null) {
            try {
                connection.close();
            } catch (Throwable e) {
                // deliberately ignored: nothing useful can be done if close() fails
            } finally {
                this.connection = null;
            }
        }
    }
}
+1  A: 

Here, obj is a local variable in the method, and it is popped off the stack as soon as the method returns or exits. This leaves no way to reach the Object instance on the heap, and hence it will be garbage collected. The Object instance on the heap will be GC'd only after its reference obj is popped off the stack, i.e., only after the method finishes or returns.


EDIT: To answer your update,

UPDATE: Let me make the question more clear. 
Will obj be subject to garbage collection before foo() returns?

obj is just a reference to the actual object on the heap. Here obj is declared inside the method foo(). So your question "Will obj be subject to garbage collection before foo() returns?" does not apply, as obj lives in the stack frame while the method foo() is running and is gone when the method finishes.

Zaki
obj is a reference type, so only the reference (the pointer) is stored on the stack, the actual data is in the heap.
Simon P Stevens
@Simon P Stevens, that's exactly what I mentioned. The reference variable obj is a LOCAL VARIABLE.
Zaki
Sorry. Didn't read the tags. Thinking of .net. (I've edited so I can revoke my downvote). Sorry.
Simon P Stevens
Simon, this aspect (stack reference variables, heap objects) is the same for both Java and C# (though other things, like `KeepAlive`, are *not*).
Matthew Flaschen
@Matthew, yeah, I'd guess there are a lot of similarities, that's probably why I so easily mixed up the question in the first place - the terminology is exactly the same, but I haven't done any Java in years so I'm totally unqualified to even comment, hence I've deleted my answer.
Simon P Stevens
Actually there's another caveat to my previous comment. C# has value types like DateTime that can be allocated on the stack.
Matthew Flaschen
+2  A: 

There are really two different things happening here. obj is a stack variable being set to a reference to the Object, and the Object is allocated on the heap. The stack will just be cleared (by stack pointer arithmetic).

But yes, the Object itself will be cleared by garbage collection. All heap-allocated objects are subject to GC.

EDIT: To answer your more specific question, the Java spec does not guarantee collection by any particular time (see the JVM spec), though of course the object will only be collected after its last use. So it's only possible to answer for specific implementations.

EDIT: Clarification, per comments

Matthew Flaschen
It doesn't guarantee a time, BUT that doesn't mean it will randomly collect your object at any moment even if it is still in scope ;)
Paolo
@Paolo - the problem here is that it is out of scope after the last time it's used (not when it drops off the stack)
Ben Lings
And optimisations may make the time of last use surprising.
Tom Hawtin - tackline
+2  A: 

As I'm sure you're aware, in Java garbage collection and finalization are non-deterministic. All you can determine in this case is when wrap becomes eligible for garbage collection. I'm assuming you are asking whether wrap only becomes eligible for GC when the method returns (and wrap goes out of scope). I think that some JVMs (e.g. HotSpot with -server) won't wait for the object reference to be popped from the stack; they will make the object eligible for GC as soon as nothing else references it. It looks like this is what you are seeing.

To summarise, you are relying on finalization being slow enough to not finalize the instance of SqlConnection before the method exits. Your finalizer is closing a resource that the SqlConnection is no longer responsible for. Instead, you should let the Connection object be responsible for its own finalization.

Ben Lings
@Ben Lings: actually the close() method is public so that the user can explicitly dispose of the object. However, if the user forgets to close() the object, the finalizer acts as the last guard to release the resource. This is very similar to the .NET dispose design model.
SiLent SoNG
The problem is that your finalizer is cleaning up a resource that it isn't responsible for.
Ben Lings
+3  A: 

As your code is written the object pointed to by "wrap" shouldn't be eligible for garbage collection until "wrap" pops off the stack at the end of the method.

The fact that it is being collected suggests to me your code as compiled doesn't match the original source and that the compiler has done some optimisation such as changing this:

SqlConnection wrap = new SqlConnection(connectionString); 
Connection con = wrap.currentConnection();

to this:

Connection con = new SqlConnection(connectionString).currentConnection();

(Or even inlining the whole thing) because "wrap" isn't used beyond this point. The anonymous object created would be eligible for GC immediately.

The only way to be sure is to decompile the code and see what's been done to it.

Paolo
I find this is the root cause of the problem.
SiLent SoNG
+5  A: 

I'm genuinely astonished by this, but you're right. It's easily reproducible; you don't need to muck about with database connections and the like:

import java.util.ArrayList;
import java.util.Collection;

public class GcTest {

    public static void main(String[] args) {
        System.out.println("Starting");

        Object dummy = new GcTest(); // gets GC'd before method exits

        // gets bigger and bigger until heap explodes
        Collection<String> collection = new ArrayList<String>();

        // method never exits normally because of while loop
        while (true) {
            collection.add(new String("test"));
        }
    }

    @Override
    protected void finalize() throws Throwable {
        System.out.println("Finalizing instance of GcTest");
    }
}

Runs with:

Starting
Finalizing instance of GcTest
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2760)
    at java.util.Arrays.copyOf(Arrays.java:2734)
    at java.util.ArrayList.ensureCapacity(ArrayList.java:167)
    at java.util.ArrayList.add(ArrayList.java:351)
    at test.GcTest.main(GcTest.java:22)

Like I said, I can hardly believe it, but there's no denying the evidence.

It does make a perverse kind of sense, though: the VM will have figured out that the object is never used, and so gets rid of it. This must be permitted by the spec.

Going back to the question's code, you should never rely on finalize() to clean up your connections; you should always do it explicitly.
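
A minimal sketch of what explicit cleanup could look like, assuming Java 7 or later so that try-with-resources is available. TrackedResource is a hypothetical stand-in for the question's SqlConnection wrapper (it is not a JDBC class), and readAll is just an illustrative name. Because close() is invoked on the wrapper at the end of the block, the wrapper is still in use for the whole loop and cannot be finalized halfway through:

```java
// Sketch only: TrackedResource is a hypothetical stand-in for the
// question's SqlConnection wrapper, not a real JDBC class.
class ExplicitCloseSketch {

    static class TrackedResource implements AutoCloseable {
        private boolean open = true;
        private int rowsRead = 0;

        // Simulates reading one row; fails if the resource was closed early.
        void readRow() {
            if (!open) {
                throw new IllegalStateException("resource already closed");
            }
            rowsRead++;
        }

        boolean isOpen() {
            return open;
        }

        int rowsRead() {
            return rowsRead;
        }

        @Override
        public void close() {
            open = false;
        }
    }

    // Reads the given number of rows and returns the resource so the
    // caller can inspect its final state.
    static TrackedResource readAll(int rows) {
        TrackedResource resource = new TrackedResource();
        // try-with-resources calls close() when the block exits,
        // whether normally or because of an exception.
        try (TrackedResource r = resource) {
            for (int i = 0; i < rows; i++) {
                r.readRow();
            }
        }
        return resource;
    }

    public static void main(String[] args) {
        TrackedResource done = readAll(5);
        System.out.println("rows read: " + done.rowsRead() + ", open: " + done.isOpen());
    }
}
```

In the question's code the same shape would mean SqlConnection implementing AutoCloseable and dropping the finalizer, but that is a suggestion rather than something tested against a real database.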

skaffman
wow, great insight.
Zaki
+1  A: 

According to the current spec, there isn't even a happens-before ordering from finalisation to normal use. So, to impose order, you actually need to use a lock, a volatile or, if you are desperate, to stash a reference somewhere reachable from a static. There is certainly nothing special about scope.
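
On Java 9 and later, java.lang.ref.Reference.reachabilityFence is another way to keep a local object reachable up to a chosen point. It is not one of the options listed above, so treat this as a sketch under that version assumption. The class name KeepAliveSketch, the method survivesWork and the WeakReference probe are only there for illustration:

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;

// Sketch only: assumes Java 9+ for Reference.reachabilityFence.
class KeepAliveSketch {

    // Returns true if the local object was still reachable after the
    // garbage collection requests made while "work" was in progress.
    static boolean survivesWork() {
        Object wrap = new Object();
        // The weak reference is only a probe: it gets cleared if the
        // object is ever collected.
        WeakReference<Object> probe = new WeakReference<>(wrap);

        // Stand-in for the long-running job; System.gc() is only a hint.
        for (int i = 0; i < 3; i++) {
            System.gc();
        }
        boolean stillAlive = probe.get() != null;

        // Keeps wrap strongly reachable at least until this call,
        // even though the loop above never touches it.
        Reference.reachabilityFence(wrap);
        return stillAlive;
    }

    public static void main(String[] args) {
        System.out.println("alive during work: " + survivesWork());
    }
}
```

The fence goes after the last point at which the object must still be alive, for example after the result set loop in the question's code.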

It should be rare that you actually need to write a finaliser.

Tom Hawtin - tackline