The underlying problem I want to solve is running a task that generates several temporary tables in MySQL, which need to stay around long enough for the Java side to fetch results from them after they are created. Because of the size of the data involved, the task must be completed in batches. Each batch is a call to a stored procedure made through JDBC. The entire process can take half an hour or more for a large data set.

To ensure access to the temporary tables, I run the entire task, start to finish, in a single Spring transaction using a TransactionCallbackWithoutResult. MySQL temporary tables are visible only to the connection that created them, so without the transaction I could be handed a different connection from the pool that cannot see them (this happened occasionally before I wrapped everything in a transaction).
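
Roughly, the structure looks like this (a sketch only; BatchRunner, the injected templates, and the process_batch procedure are stand-ins for my real code):

    import java.util.List;
    import org.springframework.jdbc.core.JdbcTemplate;
    import org.springframework.transaction.TransactionStatus;
    import org.springframework.transaction.support.TransactionCallbackWithoutResult;
    import org.springframework.transaction.support.TransactionTemplate;

    public class BatchRunner {
        private final TransactionTemplate transactionTemplate; // wired to the MySQL DataSource
        private final JdbcTemplate jdbcTemplate;

        public BatchRunner(TransactionTemplate transactionTemplate, JdbcTemplate jdbcTemplate) {
            this.transactionTemplate = transactionTemplate;
            this.jdbcTemplate = jdbcTemplate;
        }

        public void runAll(final List<Long> batchIds) {
            transactionTemplate.execute(new TransactionCallbackWithoutResult() {
                @Override
                protected void doInTransactionWithoutResult(TransactionStatus status) {
                    // every batch runs on the same pinned connection, so the
                    // session-scoped temporary tables stay visible throughout
                    for (long batchId : batchIds) {
                        jdbcTemplate.update("CALL process_batch(?)", batchId);
                    }
                }
            });
        }
    }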

This worked fine in my development environment. However, in production I got the following exception:

java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction

This happened when a different task tried to access some of the same tables during the execution of my long-running transaction. What confuses me is that the long-running transaction only inserts into or updates temporary tables; all access to non-temporary tables is via selects. From what documentation I can find, the default Spring transaction isolation level should not cause MySQL to block in this case.

So my first question: is this the right approach? Can I ensure that I repeatedly get the same connection through a Hibernate template without a long-running transaction?

If the long-running transaction approach is the correct one, what should I check in terms of isolation levels? Is my understanding correct that the default isolation level in Spring/MySQL transactions should not lock tables that are only accessed through selects? What can I do to debug which tables are causing the conflict, and prevent those tables from being locked by the transaction?
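
For reference, this is how I would pin the isolation level explicitly if that turns out to matter (a sketch only; I have not confirmed it changes the locking behaviour):

    import org.springframework.transaction.TransactionDefinition;
    import org.springframework.transaction.support.TransactionTemplate;

    public class IsolationConfig {
        // transactionTemplate is the same template that wraps the batch job
        public void configure(TransactionTemplate transactionTemplate) {
            // Spring's default (ISOLATION_DEFAULT) defers to the database;
            // for MySQL/InnoDB that is REPEATABLE READ, under which plain
            // SELECTs are non-locking consistent reads.
            transactionTemplate.setIsolationLevel(
                    TransactionDefinition.ISOLATION_READ_COMMITTED);
        }
    }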

+1  A: 

When you say your table is temporary, is it transaction-scoped? That might lead to other transactions (perhaps on a different connection) not being able to see or access it. Perhaps a join involving a real table and a temporary table somehow locks the real table.

Root cause: Have you tried using the MySQL tools to determine what is locking the connection? It might be something like InnoDB's next-key locking. I don't know the MySQL tools that well, but on Oracle you can see which connections are blocking other connections.
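
For MySQL, SHOW ENGINE INNODB STATUS is probably the place to start; a sketch of dumping it from Java (URL and credentials are placeholders, and the MySQL JDBC driver must be on the classpath):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class InnoDbLockDump {
        public static void main(String[] args) throws SQLException {
            try (Connection con = DriverManager.getConnection(
                         "jdbc:mysql://localhost:3306/mydb", "user", "pass");
                 Statement st = con.createStatement();
                 ResultSet rs = st.executeQuery("SHOW ENGINE INNODB STATUS")) {
                while (rs.next()) {
                    // the Status column includes a TRANSACTIONS section that
                    // shows which transaction holds or waits on which lock
                    System.out.println(rs.getString("Status"));
                }
            }
        }
    }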

Transaction timeout: You should create a second connection pool/data source with a much longer timeout and use it for your long-running task. I think your production environment is 'trying' to help you out by detecting stuck connections.
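
A sketch of that separation with Spring (bean names are invented here). One caveat: the "Lock wait timeout exceeded" error is raised by MySQL itself when a session waits longer than innodb_lock_wait_timeout (50 seconds by default) for a row lock, so the pool-side timeout is a separate knob:

    import org.springframework.transaction.PlatformTransactionManager;
    import org.springframework.transaction.support.TransactionTemplate;

    public class BatchTxConfig {
        // batchTransactionManager is assumed to be wired to a second
        // data source reserved for the long-running job
        public TransactionTemplate batchTxTemplate(
                PlatformTransactionManager batchTransactionManager) {
            TransactionTemplate template = new TransactionTemplate(batchTransactionManager);
            template.setTimeout(2 * 60 * 60); // two hours, in seconds
            return template;
        }
    }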

Justin
+1  A: 

I consider keeping a transaction open for an extended time evil. Over my career, the definition of "extended" has shrunk from seconds to milliseconds.

It is an unending source of non-repeatable, head-scratching problems.

In this case I would bite the bullet and keep a 'work log' in software, which you can replay in reverse to clean up if the batch fails.

Peter Tillemans
Do you have a solution to the "maintain connection" problem? In my case, Spring and Hibernate mediate my connection to the database, and I have had trouble getting the same connection every time, without resorting to wrapping everything in a single transaction. Without maintaining the same connection, I lose access to the temporary tables.
jimbokun
If you keep a work log with an ID and the milestones reached in the batch processing, you can create regular tables named like TEMP_<batchid>_STUFF. Based on the milestone reached, you can skip ahead and resume aborted batch processes. After the processing is done (i.e. milestone DONE is reached), remove the tables and log that you reached milestone 'CLEANED'. All processing can now be done with fine-grained transactions. The inconvenience of this approach is that you do not see a "snapshot" of the state of the DB as of when the batch job started; this may or may not be an issue.
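
A minimal sketch of that work-log idea (table and column names are invented for illustration):

    import org.springframework.jdbc.core.JdbcTemplate;

    public class BatchWorkLog {
        private final JdbcTemplate jdbc;

        public BatchWorkLog(JdbcTemplate jdbc) {
            this.jdbc = jdbc;
        }

        public String milestone(long batchId) {
            // the step the batch last completed, e.g. 'STARTED', 'LOADED', 'DONE'
            return jdbc.queryForObject(
                    "SELECT milestone FROM work_log WHERE batch_id = ?",
                    String.class, batchId);
        }

        public void reach(long batchId, String milestone) {
            // each milestone update commits in its own short transaction
            jdbc.update("UPDATE work_log SET milestone = ? WHERE batch_id = ?",
                    milestone, batchId);
        }

        public void cleanUp(long batchId) {
            // a regular table (not a session-scoped temporary one), so any
            // pooled connection can drop it once the batch is done
            jdbc.execute("DROP TABLE IF EXISTS TEMP_" + batchId + "_STUFF");
            reach(batchId, "CLEANED");
        }
    }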
Peter Tillemans
That's very good advice. I have one remaining issue, however. Some of the initial processing creates temporary tables that are accessed in each batch. This initial processing takes a significant amount of time, enough that repeating it for each batch would be a real performance hit. I'm not sure how to handle this initial processing, but I'll give it some more thought.
jimbokun