views:

3548

answers:

3

I'm using Grails 1.1 beta2. I need to import a large amount of data into my Grails application. If I repeatedly instantiate a grails domain class and then save it, the performance is unacceptably slow. Take for example importing people from a phone book:

for (each person in legacy phone book) {
    // Construct new Grails domain class from legacy phone book person
    Person person = new Person(...)
    person.save()
}

This turns out be painfully slow. Someone on the Grails mailing list suggest batching up saves in a transaction. So now I have:

List batch = new ArrayList()
for (each person in legacy phone book) {
    // Construct new Grails domain class from legacy phone book person
    Person person = new Person(...)
    batch.add(person)
    if (batch.size() > 500) {
        Person.withTransaction {
            for (Person p: batch)
                p.save()
            batch.clear()
        }
    }
}
// Save any remaining
for (Person p: batch)
    p.save()

This works must faster, at least initially. Each transaction saves 500 records. As time goes on, the transactions take longer and longer. The first few transactions takes about 5 seconds, then it just creeps from there. After about 100 transactions, each one takes over a minute, which is once again unacceptable. Worse is that eventually Grails will eventually run out of Java heap memory. I can increase the JVM heap size, but that just delays the OutOfMemoryError exception.

Any ideas why this is? It's like there some internal resource not being released. The performance gets worse, memory is being held on to, and then eventually the system runs out of memory.

According to the Grails documentation, withTransaction passes the closure to Spring's TransactionStatus object. I couldn't find anything in TransactionStatus to close/end the transaction.

Edit: I'm running this from Grails' Console (grails console)

Edit: Here's the out of memory exception:

Exception thrown: Java heap space

java.lang.OutOfMemoryError: Java heap space
    at org.hibernate.util.IdentityMap.entryArray(IdentityMap.java:194)
    at org.hibernate.util.IdentityMap.concurrentEntries(IdentityMap.java:59)
    at org.hibernate.event.def.AbstractFlushingEventListener.prepareEntityFlushes(AbstractFlushingEventListener.java:113)
    at org.hibernate.event.def.AbstractFlushingEventListener.flushEverythingToExecutions(AbstractFlushingEventListener.java:65)
    at org.hibernate.event.def.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:26)
    at org.hibernate.impl.SessionImpl.flush(SessionImpl.java:1000)
    at org.hibernate.impl.SessionImpl.managedFlush(SessionImpl.java:338)
    at org.hibernate.transaction.JDBCTransaction.commit(JDBCTransaction.java:106)
    at org.springframework.orm.hibernate3.HibernateTransactionManager.doCommit(HibernateTransactionManager.java:655)
    at org.springframework.transaction.support.AbstractPlatformTransactionManager.processCommit(AbstractPlatformTransactionManager.java:732)
    at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(AbstractPlatformTransactionManager.java:701)
    at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:140)
+5  A: 

This is a common issue with all hibernate applications and it is caused by the growth of the hibernate session. I'm guessing that the grails console holds a hibernate session open for you in a similar way to the 'open session in view' pattern that I know it uses in for normal web requests.

The solution is to get hold of the current session and clear it after each batch. I'm not sure how you get hold of spring bean using the console, normally for controllers or services you just declare them as members. Then you can get the current session with sessionFactory.getCurrentSession(). In order to clear it just call session.clear(), or if you what to be selective use session.evict(Object) for each Person object.

for a controller/service:

class FooController {
    def sessionFactory

    def doStuff = {
        List batch = new ArrayList()
        for (each person in legacy phone book) {
            // Construct new Grails domain class from legacy phone book person
            Person person = new Person(...)
            batch.add(person)
            if (batch.size() > 500) {
                Person.withTransaction {
                    for (Person p: batch)
                        p.save()
                    batch.clear()
                }
            }
            // clear session here.
            sessionFactory.getCurrentSession().clear();
        }
        // Save any remaining
        for (Person p: batch)
            p.save()
        }
    }
}

Hope this helps.

Gareth Davis
+3  A: 

Ted Naleid wrote a great blog entry about improving batch performance. Including here as a reference.

Jean Barmash
A: 

Actually when you have long running batch tasks this is not the only issue.

I've wrote an article summarizing possible problems and solutions: http://www.componentix.com/blog/8

Vladimir Grichina