I am using Hibernate to update 20K products in my database.

Currently I load all 20K products, loop through them to modify some properties, and then write each one back to the database.

So:

load products

foreach product p
   session.beginTransaction();
   productDao.MakePersistant(p);
   session.commit();

Right now this is much slower than plain JDBC. What can I do to speed things up?

I am sure I am doing something wrong here.

+5  A: 

If this is pseudo-code, I'd recommend moving the transaction outside the loop, or at least using a nested loop if having all 20K products in a single transaction is too much:

load products
foreach (batch)
{
   try
   {
      tx = session.beginTransaction()
      foreach (product in batch)
      {
          session.saveOrUpdate(product)
      }
      tx.commit()
   }
   catch (Exception e)
   {
       e.printStackTrace()
       tx.rollback()
   }
}
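
The outer loop above assumes the products have already been split into batches. A minimal sketch of that split in plain Java follows; the class and method names (`Batches`, `partition`) are hypothetical and not part of Hibernate.

```java
import java.util.ArrayList;
import java.util.List;

final class Batches {

    /** Splits items into consecutive sublists of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }
}
```

Each element of the returned list can then be processed inside its own transaction, so a failure only rolls back one batch.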

Also, I'd recommend that you batch your UPDATEs instead of sending each one individually to the database. There's too much network traffic that way. Bundle each chunk into a single batch and send them all at once.
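
For comparison, this is roughly what statement batching looks like in plain JDBC, using `PreparedStatement.addBatch()` and `executeBatch()`. It is only a sketch: the `product` table, its `price` and `id` columns, and the `updatePrices` helper are assumptions made for illustration, and transaction handling is left to the caller.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Map;

final class JdbcBatchUpdate {

    // Hypothetical table and column names, for illustration only.
    static final String SQL = "UPDATE product SET price = ? WHERE id = ?";

    /**
     * Queues one UPDATE per map entry and sends them to the database in
     * groups of batchSize. Returns the number of executeBatch() calls made.
     */
    static int updatePrices(Connection conn, Map<Long, Double> priceById, int batchSize)
            throws SQLException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int roundTrips = 0;
        int pending = 0;
        try (PreparedStatement ps = conn.prepareStatement(SQL)) {
            for (Map.Entry<Long, Double> entry : priceById.entrySet()) {
                ps.setDouble(1, entry.getValue());
                ps.setLong(2, entry.getKey());
                ps.addBatch();              // queue the statement locally
                if (++pending == batchSize) {
                    ps.executeBatch();      // send the queued statements together
                    roundTrips++;
                    pending = 0;
                }
            }
            if (pending > 0) {              // send the final partial batch
                ps.executeBatch();
                roundTrips++;
            }
        }
        return roundTrips;
    }
}
```

With 20,000 rows and a batch size of 20, this makes 1,000 `executeBatch()` calls instead of 20,000 individual statement executions.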

duffymo
A: 

The fastest possible way to do a batch update would be to convert it to a single SQL statement and execute it as raw SQL on the session. Something like:

update TABLE set x = y where w = z;

Failing that, you can try to use fewer transactions and do the updates in batches:

start session
start transaction

products = session.getNamedQuery("GetProducts")
    .setCacheMode(CacheMode.IGNORE)
    .scroll(ScrollMode.FORWARD_ONLY);
count = 0;
foreach product {
    update product
    if ( ++count % 20 == 0 ) {
        session.flush();
        session.clear();
    }
}

commit transaction
close session

For more information, see the Hibernate Community Docs.

leonm
+7  A: 

The right place to look in the documentation for this kind of processing is the whole of Chapter 13, Batch processing.

There are several obvious mistakes in your current approach:

  • you should not start/commit the transaction for each update.
  • you should enable JDBC batching and set it to a reasonable number (10-50):

    hibernate.jdbc.batch_size 20
    
  • you should flush() and then clear() the session at regular intervals (every n records, where n is equal to the hibernate.jdbc.batch_size parameter); otherwise the session will keep growing and may eventually fail with an OutOfMemoryError.
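
If you configure Hibernate programmatically rather than through a properties or XML file, the batch size setting can be collected in a `java.util.Properties` object, which Hibernate's `Configuration` can accept. A small sketch, where the `BatchSettings` helper is a hypothetical name and only the property key comes from the list above:

```java
import java.util.Properties;

final class BatchSettings {

    /** Builds the Hibernate property that enables JDBC batching. */
    static Properties hibernateBatchProperties(int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        Properties props = new Properties();
        // Property key as named above; 10-50 is the suggested range.
        props.setProperty("hibernate.jdbc.batch_size", Integer.toString(batchSize));
        return props;
    }
}
```

Keeping the value in one place also makes it easy to reuse the same number as the flush() and clear() interval.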

The example given in section 13.2, Batch updates, illustrates this:

Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();

ScrollableResults customers = session.getNamedQuery("GetCustomers")
    .setCacheMode(CacheMode.IGNORE)
    .scroll(ScrollMode.FORWARD_ONLY);
int count=0;
while ( customers.next() ) {
    Customer customer = (Customer) customers.get(0);
    customer.updateStuff(...);
    if ( ++count % 20 == 0 ) {
        //flush a batch of updates and release memory:
        session.flush();
        session.clear();
    }
}

tx.commit();
session.close();
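
To see why the flush() and clear() interval matters for memory, here is a toy model in plain Java. It does not use Hibernate at all; it only counts how many entities would be held at once, on the simplifying assumption that every loaded entity stays in the session until clear() is called. The class and method names are hypothetical.

```java
final class FlushIntervalModel {

    /**
     * Walks through totalRecords entities, "flushing and clearing" every
     * flushInterval records (0 means never).
     * Returns { number of flushes, peak number of entities held at once }.
     */
    static int[] simulate(int totalRecords, int flushInterval) {
        int held = 0;
        int peak = 0;
        int flushes = 0;
        for (int count = 1; count <= totalRecords; count++) {
            held++;                          // entity loaded into the session
            peak = Math.max(peak, held);
            if (flushInterval > 0 && count % flushInterval == 0) {
                flushes++;                   // stands in for session.flush()
                held = 0;                    // stands in for session.clear()
            }
        }
        return new int[] { flushes, peak };
    }
}
```

In this model, `simulate(20000, 20)` gives 1,000 flushes with at most 20 entities held, while `simulate(20000, 0)` never flushes and ends up holding all 20,000. Any records left over after the last full interval are flushed when the transaction commits.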

You may also consider using the StatelessSession.

Another option would be to use DML-style operations (in HQL!): UPDATE FROM? EntityName (WHERE where_conditions)?. This is the HQL UPDATE example:

Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();

String hqlUpdate = "update Customer c set c.name = :newName where c.name = :oldName";
// or String hqlUpdate = "update Customer set name = :newName where name = :oldName";
int updatedEntities = session.createQuery( hqlUpdate )
        .setString( "newName", newName )
        .setString( "oldName", oldName )
        .executeUpdate();
tx.commit();
session.close();

Again, refer to the documentation for the details (especially how to deal with the version or timestamp property values using the VERSIONED keyword).

Pascal Thivent
+1  A: 

I agree with the answer above about looking at the chapter on batch processing.

I also wanted to add that you should make sure you load only what is necessary for the changes you need to make to the product.

What I mean is: if the product eagerly loads a large number of other objects that are not important for this transaction, you should consider not loading the joined objects. It will speed up the loading of products and, depending on their persistence strategy, may also save you time when making the product persistent again.

Rachel