I'm developing a server that should receive nightly reports from hundreds of business units. Reports are currently encrypted CSV files. In total the reports should amount to 500 000 to 1 000 000 records each day, which are saved to a database for later use.
I've created a set of PreparedStatements for each transmission. These statements are used to batch 50 records before executing and committing. Each record may cause up to 20 database inserts. All is well when transmissions are queued and handled one by one.
When I tried to do this concurrently, I noticed that different threads got the exact same instances of the PreparedStatements. This caused the following problems:
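For reference, the per-transmission batching described above can be sketched roughly as follows. This is a minimal illustration, not the actual code: the class `TransmissionBatcher`, the `ReportRecord` type, and the `parent`/`child` tables are hypothetical, and only two statements are shown instead of the up to 20 inserts per record.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class TransmissionBatcher {
    static final int BATCH_SIZE = 50;

    // Hypothetical record shape: one parent row plus one dependent child row.
    static final class ReportRecord {
        final long id;
        final String payload;

        ReportRecord(long id, String payload) {
            this.id = id;
            this.payload = payload;
        }
    }

    // Stores one transmission and returns the number of commits performed.
    static int store(Connection conn, List<ReportRecord> records) throws SQLException {
        conn.setAutoCommit(false);
        int commits = 0;
        // Table and column names are assumptions made for this sketch.
        try (PreparedStatement parent = conn.prepareStatement(
                 "INSERT INTO parent (id) VALUES (?)");
             PreparedStatement child = conn.prepareStatement(
                 "INSERT INTO child (parent_id, payload) VALUES (?, ?)")) {
            int pending = 0;
            for (ReportRecord r : records) {
                parent.setLong(1, r.id);
                parent.addBatch();
                child.setLong(1, r.id);
                child.setString(2, r.payload);
                child.addBatch();
                if (++pending == BATCH_SIZE) {
                    // Parent rows must reach the database before child rows.
                    parent.executeBatch();
                    child.executeBatch();
                    conn.commit();
                    commits++;
                    pending = 0;
                }
            }
            if (pending > 0) {
                // Flush the final, partially filled batch.
                parent.executeBatch();
                child.executeBatch();
                conn.commit();
                commits++;
            }
        }
        return commits;
    }
}
```

The sketch only behaves as intended if `parent` and `child` are used by a single thread at a time, which is exactly the assumption that breaks in the concurrent case.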
- Multiple threads added statements to the same batch
- Batches were being executed when any of the threads decided it was time to do so
- Commit was called when the database did not meet its constraints, as some of the threads had not had time to use some of the statements
The question is: Is there a way to force a prepared statement to be created instead of reusing an existing one from the statement cache?
If not, is there any better way to handle the situation than by
- creating a separate data source for the batches that does not have statement/connection pooling
- dropping constraints from the database; insert order would not matter anymore
- forcing sequential processing
Edit: Attempt to clarify the problem
Let there be threads T1 and T2. Let there be prepared statements S1 and S2. Let there be batches B1 and B2.
Each time S1 is used, it is added to B1. Each time S2 is used, it is added to B2. When committing, S1 must be committed before S2 per the foreign key constraint.
Problem occurs when
- T1 processes transmissions gleefully
- T2 processes transmissions innocently
- T1 uses statement S1 adding s1a to batch B1 containing s1a
- T1 uses statement S2 adding s2a to batch B2 containing s2a
- T1 decides it is time to commit
- T1 commits batch B1 containing s1a
- T2 uses S1 adding s1b to batch B1 containing s1b
- T2 uses S2 adding s2b to batch B2 containing s2a, s2b
- T1 commits batch B2 containing s2a, s2b
- Database says 'no no' as s2b is committed before s1b, which the foreign key forbids.
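The interleaving above can be replayed deterministically without a database. The following is only a sketch under simplifying assumptions: the class `SharedBatchReplay` is hypothetical, each shared batch is modelled as a plain list, and the foreign key is modelled as "a child row with key k needs a committed parent row with key k".

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SharedBatchReplay {
    // Replays the interleaving and returns the child rows that were flushed
    // before their parent row had been committed.
    static List<String> replay() {
        List<String> b1 = new ArrayList<>(); // shared batch behind S1 (parent rows)
        List<String> b2 = new ArrayList<>(); // shared batch behind S2 (child rows)
        Set<String> committedParents = new HashSet<>();
        List<String> violations = new ArrayList<>();

        b1.add("a"); // T1 uses S1: B1 = [s1a]
        b2.add("a"); // T1 uses S2: B2 = [s2a]

        // T1 decides to commit and flushes B1 first.
        committedParents.addAll(b1);
        b1.clear();

        b1.add("b"); // T2 uses S1: B1 = [s1b], not yet committed
        b2.add("b"); // T2 uses S2: B2 = [s2a, s2b]

        // T1 continues its commit and flushes B2, which now holds T2's row too.
        for (String key : b2) {
            if (!committedParents.contains(key)) {
                violations.add("s2" + key);
            }
        }
        b2.clear();
        return violations;
    }
}
```

In this model `replay()` reports `s2b` as the offending row: it rides along in T1's flush of B2 although its parent s1b is still sitting uncommitted in B1.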
This can be avoided with manual synchronization as well, as pointed out in the answers, but then I still have to track the size of each batch separately instead of applying logic local to each thread.