I have a terrible problem that hopefully has a very simple answer. I am running out of memory when I perform a basic

If I have code like this:

MyEntity myEntity;
for (Object id: someIdList) {
   myEntity = find(id); 
   // do something basic with myEntity
}

And the find() method is a standard EntityManager related method:

public MyEntity find(Object id) {
    return em.find(mycorp.ejb.entity.MyEntity.class, id);
}

This code worked a couple of weeks ago, and works fine if there are fewer items in the database. The resulting error I am facing is:

java.lang.OutOfMemoryError: GC overhead limit exceeded

The exception comes from Oracle TopLink calling some Oracle JDBC methods.

The loop exists because an EJB QL query such as "select object(o) from MyEntity as o" overloads the application server when there are lots of records.

+2  A: 

The problem is that although the loop queries one entity after another, each entity stays referenced by your EntityManager. You must either

  • clear() the entityManager, or
  • detach the entity from it (I forget what the method is called; JPA 2.0 provides detach() for this. Note that remove() is not what you want here, since it deletes the row).

It is also a good idea to make the entity manager read-only if possible, because that stops JPA from holding a second copy of each object in memory just to detect whether it has changed when flush() writes to the database.

Daniel
+3  A: 

This code worked a couple of weeks ago, and works fine if there are fewer items in the database. The resulting error I am facing is: java.lang.OutOfMemoryError: GC overhead limit exceeded

There is nothing surprising here. Entities loaded by em.find() are put and kept in the persistence context (in memory) so that changes to them can be tracked. If you bulk-load too many of them without precautions, you will exhaust the heap and get an OutOfMemoryError.
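To make that retention concrete, here is a toy model in plain Java. It is not JPA: `ToyPersistenceContext`, `ToyEntity` and their methods are hypothetical names invented for this sketch, and it assumes only the behaviour described above, namely that every loaded entity stays referenced by the context until the context is cleared.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical entity type, for illustration only.
class ToyEntity {
    final long id;
    ToyEntity(long id) { this.id = id; }
}

// Toy stand-in for a persistence context; NOT the JPA API.
class ToyPersistenceContext {
    // Every entity ever loaded stays strongly referenced here.
    private final Map<Long, ToyEntity> managed = new HashMap<>();

    // "Loads" the entity on first access and returns the cached instance afterwards.
    ToyEntity find(long id) {
        return managed.computeIfAbsent(id, key -> new ToyEntity(key));
    }

    int size() { return managed.size(); }

    // Drops all references so the garbage collector can reclaim the entities.
    void clear() { managed.clear(); }
}

class RetentionDemo {
    public static void main(String[] args) {
        ToyPersistenceContext ctx = new ToyPersistenceContext();
        for (long id = 1; id <= 1000; id++) {
            // The local result is discarded, but the context still holds the entity.
            ctx.find(id);
        }
        System.out.println("managed after loop: " + ctx.size());
        ctx.clear();
        System.out.println("managed after clear: " + ctx.size());
    }
}
```

Reassigning or discarding the loop variable does not help, because the context's own map is what keeps the objects reachable.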

If you really need to do something with all of your entities, then at regular intervals you need to call flush() to push pending changes to the database, followed by clear() to empty the persistence context and release memory:

int i = 0;
for (Object id : someReallyBigIdList) {
    MyEntity myEntity = find(id);
    // do something basic with myEntity
    if (++i % 20 == 0) { // every 20 entities, same as the JDBC batch size
        // flush a batch of DML operations and release memory:
        em.flush();
        em.clear();
    }
}

Calling clear() causes all managed entities to become detached. Changes made to entities that have not been flushed to the database will not be persisted. Hence the need to flush() the changes first.
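The ordering matters, and a toy model in plain Java can show why. This is a sketch, not the JPA API: `ToyEntityManager` and its `update` and `stored` methods are hypothetical, and the two maps merely stand in for the database and for unflushed in-memory changes.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of flush() and clear(); NOT the JPA API. All names are hypothetical.
class ToyEntityManager {
    private final Map<Long, String> database = new HashMap<>(); // stands in for table rows
    private final Map<Long, String> pending = new HashMap<>();  // unflushed in-memory changes

    // Records a change in memory only.
    void update(long id, String value) { pending.put(id, value); }

    // Writes pending changes to the "database".
    void flush() {
        database.putAll(pending);
        pending.clear();
    }

    // Detaches everything: pending changes are discarded, not written.
    void clear() { pending.clear(); }

    String stored(long id) { return database.get(id); }
}

class FlushClearDemo {
    public static void main(String[] args) {
        ToyEntityManager em = new ToyEntityManager();

        em.update(1L, "changed");
        em.clear();                        // no flush first: the change is lost
        System.out.println(em.stored(1L)); // prints "null"

        em.update(2L, "changed");
        em.flush();                        // the change reaches the "database"
        em.clear();                        // now safe to release memory
        System.out.println(em.stored(2L)); // prints "changed"
    }
}
```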

Pascal Thivent