I'm a C++ programmer playing around with Java after finding JPA, which for a few of my current applications is a godsend. I haven't touched Java since university, and I'm having a problem running out of heap space. I'm using the code below as the main part of a not-very-serious test of JDBC/JPA/Lucene, but I keep getting seemingly random OutOfMemoryErrors.

        EntityManager em = emf.createEntityManager();
        Query q = em.createQuery("select p from Product p" +
            " where p.productid = :productid");
        Connection con = DriverManager.getConnection("connection string");
        Statement st = con.createStatement();

        IndexWriter writer = new IndexWriter("c:\\temp\\lucene", new StandardAnalyzer(), IndexWriter.MaxFieldLength.LIMITED);

        ResultSet rs = st.executeQuery("select productid from product order by productid");
        while (rs.next()) {
            int productid = rs.getInt("PRODUCTID");
            q.setParameter("productid", productid);
            Product p = (Product)q.getSingleResult();

            writer.addDocument(createDocument(p));
        }

        writer.commit();
        writer.optimize();
        writer.close();

        st.close();
        con.close();

I won't post all of createDocument, but all it does is instantiate a new org.apache.lucene.document.Document and add fields via add(new Field(...)) etc. There are about 50 fields in total, and most are short strings (under 32 characters).
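
It's roughly this shape (a simplified sketch with made-up field and getter names, using Lucene 2.4-style Field flags):

    // Simplified sketch of createDocument; the real one adds ~50 fields.
    // (Document and Field are org.apache.lucene.document classes.)
    private static Document createDocument(Product p) {
        Document doc = new Document();
        doc.add(new Field("productid", String.valueOf(p.getProductid()),
                Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.add(new Field("name", p.getName(),
                Field.Store.YES, Field.Index.ANALYZED));
        // ... and so on for the remaining mostly short string fields ...
        return doc;
    }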

In my newbie-ness, is there something completely stupid I'm doing (or not doing) that would cause things not to be GC'd?

Are there best practices regarding Java memory management and tickling the GC?

+3  A: 

I don't see anything obviously out of place. If you're working with a very large database, you could try increasing your heap size with the -Xmx option in your JVM invocation. This is usually not the best solution - only do this when you know your working set is actually bigger than the default heap size.
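
For example, to raise the maximum heap to 512MB (the class name here is just a placeholder for your own entry point):

    java -Xmx512m com.example.IndexerTest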

Are you using any complex data structures? If you have circular references between objects, you might be preventing the garbage collector from cleaning up unreachable objects. If you have any hand-written data structures, make sure you explicitly null out references to removed objects instead of just doing something like decrementing a size variable.
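
For example (a generic sketch, not code from the question): in a hand-rolled stack, popping by only decrementing the size leaves a stale reference in the backing array that keeps the popped object reachable indefinitely:

    import java.util.Arrays;

    // Minimal hand-rolled stack illustrating the stale-reference problem.
    public class SimpleStack {
        private Object[] elements = new Object[16];
        private int size = 0;

        public void push(Object e) {
            if (size == elements.length) {
                elements = Arrays.copyOf(elements, 2 * size);
            }
            elements[size++] = e;
        }

        public Object pop() {
            Object result = elements[--size];
            elements[size] = null; // without this, the array still references the popped object
            return result;
        }
    }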

Adam Rosenfield
GC has no issues with circular references; objects will still be removed. I'm not sure I understand what you mean by explicitly nulling out references to objects. How else would you remove them?
Robin
The other way of removing them is letting them go out of scope. Explicitly setting references to null is for the combination of humongous objects and long loops, where the object reference would otherwise be kept in memory for a long time.
gnud
+2  A: 

Try the SAP memory analyzer.

https://www.sdn.sap.com/irj/sdn/wiki?path=/display/Java/Java+Memory+Analysis

It reads in a heap dump file and lets you investigate what is taking up the memory.

WW
Your link isn't taking me to it. Please fix the link!
Richard T
A: 

The only circular references come from the JPA entities, which I generated automatically in NetBeans. Does the fact that it's JPA make a difference to garbage collection?

A: 

How many items are in your result set? If there are enough records, then you will use up all your memory, as nothing can be garbage collected in this case: you are doing an addDocument on the writer, which will hold a reference to all the documents you are creating.

Robin
+2  A: 

Well...

Long experience with Java and databases (an example post: http://stackoverflow.com/questions/216601/postgressql-mysql-oracle-diferences#217230) has taught me that the JDBC drivers we use in doing this work frequently have problems.

I have one piece of code that needs to remain connected to a database 24/7, and because of a driver memory leak the JVM would always choke at some point. So I wrote code to catch the specific exception thrown and then take increasingly drastic action: dropping the connection and reconnecting, and even restarting the JVM in a desperate, nothing's-working-to-clear-the-problem circumstance. What a PAIN to have to write it, but it worked until the DBMS vendor came out with a new JDBC driver that didn't cause the problem... I actually just left the code in place, just in case!
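
A stripped-down sketch of the first line of defense (CONNECTION_STRING and doDatabaseWork are stand-ins, not the real names; the real code escalates much further than this):

    // Hypothetical sketch: drop and re-establish the connection when the
    // driver's leak surfaces; escalate (e.g. exit so a watchdog restarts
    // the JVM) if reconnecting stops helping.
    // (Connection/DriverManager/SQLException are java.sql classes.)
    void runForever() throws SQLException {
        Connection con = DriverManager.getConnection(CONNECTION_STRING);
        while (true) {
            try {
                doDatabaseWork(con);
            } catch (SQLException e) {
                try { con.close(); } catch (SQLException ignored) { }
                con = DriverManager.getConnection(CONNECTION_STRING);
            }
        }
    }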

...So, it could be nothing you are doing.

Note that calling the garbage collector was one of the strategies I used, but metrics showed it seldom helped.

Additionally, it may not be clear, but ResultSets maintain an ongoing connection to the database engine itself, and in many cases (unless explicitly set otherwise) that connection is bi-directional, even if you're only reading. And some JDBC drivers let you ask for a mono-directional (forward-only) result set but lie and return a bi-directional one! Beware of this!

So it's good practice to unload your ResultSet objects into other objects that hold the values, and to drop the ResultSet objects themselves as soon as possible.
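
Applied to the code in the question, that might look like this (a sketch reusing the Statement st from the question; only the ids are needed, so drain them into a plain list and close the JDBC objects before the slow indexing work starts):

    // Drain the ids first, then release the cursor and statement early.
    // (List/ArrayList are java.util classes.)
    List<Integer> ids = new ArrayList<Integer>();
    ResultSet rs = st.executeQuery("select productid from product order by productid");
    while (rs.next()) {
        ids.add(Integer.valueOf(rs.getInt("PRODUCTID")));
    }
    rs.close();
    st.close();

    for (Integer productid : ids) {
        // ... look up and index each product here ...
    }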

Good luck. RTIII

Richard T
A: 

@WW Thanks, looking at it now.

@Robin There are around 150,000. The docs for IndexWriter say that by default it flushes the document buffer when it reaches 16MB. Is that what you're referring to?

A: 

Ok, I have tried:

  • removing lucene from the test
  • increasing the heap size to 128MB (minimum), and then 256MB
  • analysing the heap dump through MAT
  • profiling it using JProbe

Removing Lucene and increasing the heap size made absolutely no difference, and MAT and JProbe show that, of the up to 80MB of memory the application takes up, about 62.5% is the JVM itself. Most of the time only about 60MB of memory is in use (including the JVM).

How can I be running out of memory if the program isn't using anywhere close to the heap size?

Does anyone have any more ideas?

A: 

Java maintains several different memory pools, and running out of any one of them can cause the dreaded OutOfMemoryError. Problems allocating memory at the operating-system level can also manifest as an OOM.

You should see a detailed stack trace - or possibly an error dump file in the application's directory - that may give further clues as to the problem.

If you use a decent profiler - JVisualVM, which ships with recent Sun Java 6 JDKs, is probably sufficient - you can watch all the various pools and see which one is running out.
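
You can also inspect the pools from inside the process with the standard java.lang.management API; a quick sketch:

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryPoolMXBean;

    // Prints the current usage of every memory pool (heap and non-heap),
    // which shows at a glance which pool is close to its limit.
    public class PoolDump {
        public static void main(String[] args) {
            for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
                System.out.println(pool.getName() + " (" + pool.getType() + "): "
                        + pool.getUsage().getUsed() + " / " + pool.getUsage().getMax());
            }
        }
    }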

Bill Michell
+1  A: 

You are probably running out of space for the permanent generation. Check whether your stack trace contains something like java.lang.OutOfMemoryError: PermGen space.

You can increase the space for this generation with this JVM parameter: -XX:MaxPermSize=128m
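
For example, combined with a larger heap (the class name is just a placeholder):

    java -Xmx256m -XX:MaxPermSize=128m com.example.IndexerTest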

Objects in the permanent generation are not considered during normal garbage collection. Take a look at Sun's garbage collection tuning documentation to learn more about garbage collection and the different generations of objects in the JVM.

Turismo