views:

45

answers:

1

Hi. I'm writing an application to analyse a MySQL database, and I need to execute several DMLs simmultaneously; for example:

// In ResultSet rsA: Select * from A;
rsA.beforeFirst();
while (rsA.next()) {
   id = rsA.getInt("id");
   // Retrieve data from table B: Select * from B where B.Id=" + id;
   // Crunch some numbers using the data from B
   // Close resultset B
}

I'm declaring an array of data objects, each with its own Connection to the database, which in turn calls several methods for the data analysis. The problem is all threads use the same connection, thus all tasks throw exceptios: "Lock wait timeout exceeded; try restarting transaction"

I believe there is a way to write the code in such a way that any given object has its own connection and executes the required tasks independent from any other object. For example:

DataObject dataObject[0] = new DataObject(id[0]);
DataObject dataObject[1] = new DataObject(id[1]);
DataObject dataObject[2] = new DataObject(id[2]);
...
DataObject dataObject[N] = new DataObject(id[N]);
// The 'DataObject' class has its own connection to the database, 
// so each instance of the object should use its own connection. 
// It also has a "run" method, which contains all the tasks required.
Executor ex = Executors.newFixedThreadPool(10);

for(i=0;i<=N;i++) {
   ex.execute(dataObject[i]);
}
// Here where the problem is: Each instance creates a new connection,
// but every DML from any of the objects is cluttered in just one connection
// (in MySQL command line, "SHOW PROCESSLIST;" throws every connection, and all but
// one are idle).

Can you point me in the right direction?

Thanks

A: 

I think the problem is that you've confounded a lot of middle tier, transactional, and persistent logic into one class.

If you're dealing directly with ResultSet, you're not thinking about things in a very object-oriented fashion.

You're smart if you can figure out how to get the database to do some of your calculations.

If not, I'd recommend keeping Connections open for the minimum time possible. Open a Connection, get the ResultSet, map it into an object or data structure, close the ResultSet and Connection in local scope, and return the mapped object/data structure for processing.

You keep persistence and processing logic separate this way. You save yourself a lot of grief by keeping connections short-lived.

If a stored procedure solution is slow it could be due to poor indexing. Another solution will perform equally poorly if not worse. Try running EXPLAIN PLAN and see if any of your queries are using TABLE SCAN. If yes, you have some indexes to add. It could also be due to large rollback logs if your transactions are long-running. There's a lot you could and should do to ensure you've done everything possible with the solution you have before switching. You could go to a great deal of effort and still not address the root cause.

duffymo
@Keith Randall I'm using mysql connector 5.1.7
Barranka
@duffymoThanks... I'll try the data structure approach. I've already tried to solve the problem using sql procedures stored in the database, but they are terribly slow.
Barranka