ansaurus

Question

How to bypass JDBC statement cache in concurrent batch processing?

Answer 1

+1 A:

Are you trying to use multiple statements from a single connection instance? IMO, a connection pool is recommended for the behaviour you describe. The alternative is to synchronize manually.

Everyone 2009-08-25 09:57:28

Thank you for your answer. Each thread has one connection. Each connection has multiple statements: one PreparedStatement instance for each separate SQL statement. Each statement contains a batch. The problem is that, because of statement caching, each thread does not have a unique set of statements, which causes problems with the batches. A connection pool and a statement cache do not really help here, as connection and statement preparation events are few and far between.

Aleksi 2009-08-25 10:06:12

Still trying to understand the issue - is the order of execution for the statements the problem?

Everyone 2009-08-25 10:59:06

I edited the question to (hopefully) clarify the scenario. Execution order causes the crash, yes. On the other hand, shared statements and batches make it impossible to trust the local state. This, in my opinion, causes avoidable and non-beneficial concurrency in my scenario, i.e. managing the batches separately from the actual working thread.

Aleksi 2009-08-25 11:29:50

Ah, no. To my understanding, you have to provide manually a mechanism that identifies which thread may execute which statement(s). )+: Sorry I couldn't be more help.

Everyone 2009-08-25 13:07:55

Answer 2

+1 A:

The solution is vendor-specific.

If your code runs under a servlet, then you might be able to solve your problem by configuring the datasource in your webapp. I have done that with the Oracle driver under Tomcat, but I'm sure other application servers have similar ways to configure connection pooling.

If your code is standalone, then you'll have to use a vendor-specific API. As you will target Oracle as your production database, here's a quick example for the Oracle JDBC driver:

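import java.sql.SQLException;
import oracle.jdbc.OracleConnection;

...

// Turns off the Oracle driver's implicit statement cache for this connection.
// The cast assumes conn is the driver's own connection object, not a pool wrapper.
public static void disableStatementCaching(java.sql.Connection conn)
        throws SQLException {
    ((OracleConnection)conn).setImplicitCachingEnabled(false);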
}

...

For more info, see the JDBC developer's guide for Oracle 10.2.

Juris 2009-08-25 14:48:18

Thanks for your answer. I will definitely have a look at vendor-specific APIs, even though I'm a bit skeptical about using them. The application doesn't run as a servlet. It's published as a web service as defined by the EJB3 @WebService annotation. However, using a datasource defined in the application server and configuring it is entirely possible. It just needs proper documentation; otherwise a future developer will probably decide to optimize performance by enabling statement caching.

Aleksi 2009-08-26 04:21:13

Answer 3

A:

My current solution is to stop worrying and start loving the shared batches. I have split the processing algorithm into two phases:

1. Parse a set of N records and save them in an intermediate format
2. Persist the set of N records as a batch when a lock is awarded to the current thread

This allows the parsing to be concurrent and the batching to be sequential. I'll just have to find a sweet spot to minimize the waiting time between threads.
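The two phases above can be sketched roughly as follows. This is only a minimal illustration, not the code from the actual application: the class TwoPhaseBatcher, its parse and persist methods, and the sink callback are hypothetical names, and the sink stands in for the real addBatch()/executeBatch() calls on the shared PreparedStatement instances.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Consumer;

public class TwoPhaseBatcher {
    // Guards the shared statements and batches: one thread persists at a time.
    private final ReentrantLock persistLock = new ReentrantLock();
    // Hypothetical stand-in for the JDBC batch (addBatch + executeBatch).
    private final Consumer<List<String>> sink;

    public TwoPhaseBatcher(Consumer<List<String>> sink) {
        this.sink = sink;
    }

    // Phase 1: parse a chunk into an intermediate format. No lock is held,
    // so any number of threads can be in this method at once.
    static List<String> parse(List<String> rawRecords) {
        List<String> parsed = new ArrayList<>(rawRecords.size());
        for (String raw : rawRecords) {
            parsed.add(raw.trim().toUpperCase());
        }
        return parsed;
    }

    // Phase 2: persist the whole chunk as one batch while holding the lock.
    void persist(List<String> parsed) {
        persistLock.lock();
        try {
            sink.accept(parsed);
        } finally {
            persistLock.unlock();
        }
    }

    // Runs parse + persist for every chunk on a fixed-size thread pool.
    public void process(List<List<String>> chunks, int threads)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (List<String> chunk : chunks) {
            pool.execute(() -> persist(parse(chunk)));
        }
        pool.shutdown();
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("workers did not finish in time");
        }
    }
}
```

The chunk size N is the tuning knob here: larger chunks mean fewer lock acquisitions but longer waits for the other threads.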

The quest for a sweet spot may lead to implementing some sort of two-phase locking scheme, i.e. let each thread do as it pleases and, on commit, make sure all threads have completed their current record before the actual batch execution.

In the latter solution it might be necessary to synchronize over parameter setting for each PreparedStatement although I haven't tested if that causes any trouble. It should.
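One possible shape for that second scheme is a barrier: every worker adds to the shared batch freely, and the batch is executed once all workers have reached the commit point. The sketch below is an assumption about how this could look, not something tested against a real driver. BarrierCommit and its members are hypothetical names, a synchronized list stands in for the shared PreparedStatement batch (so adding a record plays the role of the synchronized parameter setting), and copying the list stands in for executeBatch().

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

public class BarrierCommit {
    // Stand-in for the shared batch; access is synchronized per record.
    private final List<String> sharedBatch =
            Collections.synchronizedList(new ArrayList<>());
    // Each entry records the contents of one executed batch.
    private final List<List<String>> executed = new ArrayList<>();
    private final CyclicBarrier barrier;

    public BarrierCommit(int workers) {
        // The barrier action runs exactly once, in a single thread, after
        // all workers have arrived and before any of them is released.
        this.barrier = new CyclicBarrier(workers, () -> {
            executed.add(new ArrayList<>(sharedBatch)); // "executeBatch()"
            sharedBatch.clear();
        });
    }

    // A worker adds its records, then waits at the commit point.
    void work(List<String> records)
            throws InterruptedException, BrokenBarrierException {
        for (String record : records) {
            sharedBatch.add(record);
        }
        barrier.await();
    }

    // Starts one thread per record list; the number of lists must equal
    // the worker count given to the constructor.
    public List<List<String>> run(List<List<String>> perWorker)
            throws InterruptedException {
        List<Thread> threads = new ArrayList<>();
        for (List<String> records : perWorker) {
            Thread t = new Thread(() -> {
                try {
                    work(records);
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return executed;
    }
}
```

Compared with a plain lock, the waiting here is concentrated at the commit point, so a slow worker still holds everyone else up before the batch runs.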

Aleksi 2009-08-28 05:39:46
