views:

90

answers:

3

I am working on a latency sensitive part of an application, basically i will receive a network event transform the data and then insert all the data into the DB. After profiling i see that basically all my time is spent trying to save the data. here is the code

private void insertAllData(Collection<Data> dataItems)
{
    long start_time = System.currentTimeMillis();
    long save_time = 0;
    long commit_time = 0;
    Transaction tx = null;
    try
    {
        Session s = HibernateSessionFactory.getSession();
        s.setCacheMode(CacheMode.IGNORE);
        s.setFlushMode(FlushMode.NEVER);
        tx = s.beginTransaction();
        for(Data data : dataItems)
        {
            s.saveOrUpdate(data);
        }
        save_time = System.currentTimeMillis();
        tx.commit();
        s.flush();
        s.clear();
    }
    catch(HibernateException ex)
    {
        if(tx != null)
            tx.rollback();
    }
    commit_time = System.currentTimeMillis();
    System.out.println("Save: " + (save_time - start_time));
    System.out.println("Commit: " + (commit_time - save_time));
    System.out.println();
}

The size of the collection is always less than 20. here is the timing data that i see:

Save: 27
Commit: 9

Save: 27
Commit: 9

Save: 26
Commit: 9

Save: 36
Commit: 9

Save: 44
Commit: 0

This is confusing to me. I figure that the save should be quick and all the time should be spent on commit. but clearly I'm wrong. I have also tried removing the transaction (its not really necessary) but i saw worse times... I have set hibernate.jdbc.batch_size=20...

I can expect to get as many as 500 messages/sec so i need single message handling to be under 20 milliseconds.

i need this operation to be as fast as possible, ideally there would only be one roundtrip to the database. How can i do this?

+2  A: 

Honestly, I don't know what can be reasonably concluded from your test and from the "measures" you're showing (I suspect much overhead from the warmup, the collection is very small, and the sample is very small).

Anyway, I can tell you that your current code won't scale and you are very likely going to explode the Session when passing a bigger collection. You need to flush and clear the session at regular intervals (each 20 records if the batch size is 20).

Actually, I recommend reading the whole Chapter 13. Batch processing.

Pascal Thivent
i am flushing and clearing the session in the code above. collections will never be bigger than 20.
luke
A: 

Some basic stuff:

  • Do you have triggers, or foreign key constraints with no index?
  • Do you have batching drivers?
  • Are your drivers in batch mode (see hibernate.jdbc.batch_size from Pascal's reference)?
  • Any indexes on your tables (if you have a lot of indexes, it can sometimes slow down insert)?

Batching is part of JDBC 2.0, it allows you to execute several statements in a 'batch'; the idea is to reduce round trip latency (you can execute multiple batches per transaction).

Statement stmt = dbCon.createStatement("insert into DataTable values (?,?,?)");
stmt.setInt(1, x1); stmt.setInt(2, x2), stmt.setString(3, "some value");
stmt.addBatch();
...
stmt.setInt(1, x2); stmt.setInt(2, x3), stmt.setString(3, "some other value");
stmt.addBatch();

stmt.executeBatch();
dbCon.commit();

You can probably use this as a benchmark test. I would also look at the SQL that hibernate generates, to see if it is performing a query per insert to get the generated Ids.

Justin
how can i tell if i have a batching driver?
luke
+2  A: 

Move your primary key generation away from a server side auto-increment. Your java code must be responsible for the PK generation to avoid round trips.

For decent bulk insert performance you need a method that won't need to hit the database on every single call to saveOrUpdate. Using UUIDs as the primary key, or implementing HiLo can help achieve this, otherwise there's no bulk insert actually going on.

Michael
im currently using an oracle sequence to generate ids. is this not feasible?
luke
@Michael That was exactly it!, I removed the sequence and added a query to the startup to figure out where to start the sequence and bam, 7.5X speedup putting it well below my threshold.
luke
Glad to hear it worked :-)
Michael
Hibernate has the ability to generate ID sequences, you don't have to write that code your self. If you are using annotations you can add @Id with @GeneratedValue.
Justin
Justin makes an important point, I should have mentioned this ... Hibernate handles both methods out of the box with only a little configuration/annotation (no code required). If you want hilo all you need is to create a table and do a little config, Hibernate handles the rest http://docs.jboss.org/hibernate/stable/core/reference/en/html/mapping.html#mapping-declaration-id
Michael