I'm building a database web application using Java and Hibernate's JPA implementation. The application tracks objects. It also has to batch import objects from a legacy source.

For example, let's say we're tracking people. The database has tables called Person and Address. There are corresponding JPA entity and DAO classes.

On top of the JPA layer is a service layer responsible for various operations. One operation is to import a potentially large set of data from an external legacy source (for example, people from a phone book). For each person, it has to check whether that person already exists in the database, then create or update the person as necessary. Each person has an address, so the appropriate cross-reference and address creation also have to occur.

My issue is that this operation can be slow for large data sets. My current algorithm is:

for (Person person : allPersons)
{
    // check if person exists in database
    // check if address exists in database
    // create or update person and address as necessary
}

What would you recommend to improve performance?

Off the top of my head I can think of:

  1. Change the import logic to retrieve and store data using bulk queries. For example, instead of checking whether each person exists inside the for loop, submit all the person keys to the database in one query, then process each retrieved person in memory.
  2. Add my own caching in the DAO classes.
  3. Use an external caching solution (such as memcached).

I can always go with #1 by restructuring to minimize queries (roughly as sketched below). The downside is that my service layer becomes very aware of the DAO layer; its implementation is now dictated by the lower database layer. There are also other issues, like using too much memory. This grab-from-database-then-process-in-memory approach feels very homegrown and goes against off-the-shelf solutions like JPA. I'm curious what others would do in this case.
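For option #1, here is a minimal sketch of what the bulk lookup could look like, assuming a plain JPA EntityManager and a hypothetical natural key field externalId on the Person entity (those names are illustrative, not my actual schema):

    import javax.persistence.EntityManager;
    import javax.persistence.TypedQuery;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Set;

    // Fetch every already-persisted Person whose key appears in the import
    // batch with a single query, then index them by key for in-memory lookup.
    public Map<String, Person> findExistingPersons(EntityManager em, Set<String> externalIds)
    {
        TypedQuery<Person> query = em.createQuery(
            "select p from Person p where p.externalId in (:ids)", Person.class);
        query.setParameter("ids", externalIds);

        Map<String, Person> byId = new HashMap<String, Person>();
        for (Person p : query.getResultList())
        {
            byId.put(p.getExternalId(), p);
        }
        return byId;
    }

For very large imports the key set itself would probably have to be chunked, since most databases limit how many values an IN clause can hold.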

Edit: Caching will not help, as each person queried within the loop is different.

+1  A: 

There are two solutions that I have found that work. One is to process a chunk at a time, closing and restarting the session after each chunk. I have tried using the flush and clear methods on the session, but sometimes they just don't function like you'd expect. Starting and stopping the transaction between batches seems to work best.
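A rough sketch of that chunking pattern with a plain EntityManager (the batch size is a placeholder, and merge() stands in for whatever create-or-update logic your DAOs do):

    import javax.persistence.EntityManager;
    import java.util.List;

    private static final int BATCH_SIZE = 50; // illustrative; tune for your data

    public void importPersons(EntityManager em, List<Person> allPersons)
    {
        em.getTransaction().begin();
        int count = 0;
        for (Person person : allPersons)
        {
            em.merge(person); // insert or update
            if (++count % BATCH_SIZE == 0)
            {
                em.flush();  // push the pending SQL to the database
                em.clear();  // detach managed entities so the context stays small
                // If flush/clear alone misbehaves, commit and start a fresh
                // transaction here instead, as described above:
                // em.getTransaction().commit();
                // em.getTransaction().begin();
            }
        }
        em.getTransaction().commit();
    }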

If performance is a major concern, you might just break down and do it in JDBC. Hibernate adds too much overhead for batch processing of large datasets where memory and performance are important.
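If you do drop to JDBC, a minimal batched-insert sketch might look like the following (table and column names are made up for illustration, and updates would need a second statement):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.util.List;

    public void insertPersons(Connection conn, List<Person> persons) throws SQLException
    {
        String sql = "insert into Person (external_id, name) values (?, ?)";
        conn.setAutoCommit(false);
        PreparedStatement ps = conn.prepareStatement(sql);
        try
        {
            int count = 0;
            for (Person p : persons)
            {
                ps.setString(1, p.getExternalId());
                ps.setString(2, p.getName());
                ps.addBatch();
                if (++count % 100 == 0)
                {
                    ps.executeBatch(); // one round trip per 100 rows
                }
            }
            ps.executeBatch(); // flush the final partial batch
            conn.commit();
        }
        finally
        {
            ps.close();
        }
    }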

Ruggs
One of the reasons that flush doesn't always free up memory is that Hibernate keeps references to all of the objects saved in the transaction, so that it can call post-commit hooks on them, if any are defined, at the end of the transaction.
davidsheldon
A: 

Your approach is going to result in far too many individual queries against the database; it looks like 4n + 1. If possible, I would write a query (perhaps in raw SQL) that checks for the existence of the person and address all in one shot.
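For example, something along these lines (here with JPQL rather than raw SQL; the entity names, the externalId key, and the person-to-address mapping are assumptions made for the sake of the sketch):

    import javax.persistence.EntityManager;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    // One round trip that reports which import keys already have both a
    // Person and a linked Address in the database.
    public Set<String> findKeysWithPersonAndAddress(EntityManager em, Set<String> keys)
    {
        List<String> existing = em.createQuery(
                "select p.externalId from Person p join p.address a "
                + "where p.externalId in (:keys)", String.class)
            .setParameter("keys", keys)
            .getResultList();
        return new HashSet<String>(existing);
    }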

You may want to work with the StatelessSession instead of the standard Hibernate Session. Since it doesn't have a first-level cache, it should keep your memory requirements lower.

http://www.hibernate.org/hib_docs/reference/en/html/batch-statelesssession.html
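A minimal sketch of the StatelessSession approach, assuming you can get at the underlying SessionFactory behind the JPA layer (how you obtain it depends on your setup):

    import org.hibernate.SessionFactory;
    import org.hibernate.StatelessSession;
    import org.hibernate.Transaction;
    import java.util.List;

    // A StatelessSession has no first-level cache and no dirty checking, so
    // each statement is executed immediately and nothing piles up in memory.
    public void importPersons(SessionFactory sessionFactory, List<Person> allPersons)
    {
        StatelessSession session = sessionFactory.openStatelessSession();
        Transaction tx = session.beginTransaction();
        try
        {
            for (Person person : allPersons)
            {
                session.insert(person); // or session.update(person) for existing rows
            }
            tx.commit();
        }
        finally
        {
            session.close();
        }
    }

Keep in mind that a StatelessSession bypasses cascades, so associated Address rows have to be inserted explicitly.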

If that doesn't work for you then you'll want to take a look at the batch options in Hibernate:

http://www.hibernate.org/hib_docs/reference/en/html/batch.html
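The main setting there is hibernate.jdbc.batch_size, which makes Hibernate group its inserts and updates into JDBC batches. A sketch of supplying it when bootstrapping JPA (the persistence-unit name and batch size are placeholders):

    import javax.persistence.EntityManagerFactory;
    import javax.persistence.Persistence;
    import java.util.HashMap;
    import java.util.Map;

    public EntityManagerFactory createBatchingFactory()
    {
        Map<String, Object> props = new HashMap<String, Object>();
        // Group SQL statements into JDBC batches of 50 (value is illustrative).
        props.put("hibernate.jdbc.batch_size", "50");
        // Ordering helps statements for the same table land in the same batch.
        props.put("hibernate.order_inserts", "true");
        props.put("hibernate.order_updates", "true");
        return Persistence.createEntityManagerFactory("my-persistence-unit", props);
    }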

cliff.meyers