I have a program that reads a text file line by line, creates a Hibernate entity object from each line, and saves it. I have several such text files to process, each of which has about 300,000 lines. I'm finding that my current implementation is excruciatingly slow, and I'm wondering if there's anything I can do to improve things.
My main method processes the text file line by line like so:
// read the file line by line
try (FileInputStream fileInputStream = new FileInputStream(new File(fileName));
     InputStreamReader inputStreamReader = new InputStreamReader(fileInputStream);
     BufferedReader bufferedReader = new BufferedReader(inputStreamReader))
{
    int lineCount = 0;
    String line = bufferedReader.readLine();
    while (line != null)
    {
        // convert the line into an Observations object and persist it
        convertAndPersistObservationsLine(line);

        // if the number of lines we've processed has built up to the JDBC batch size then
        // flush and clear the session to control the size of Hibernate's first-level cache
        lineCount++;
        if (lineCount % JDBC_CACHE_SIZE == 0)
        {
            observationsDao.flush();
            observationsDao.clear();
        }
        line = bufferedReader.readLine();
    }
}
The convertAndPersistObservationsLine() method just splits the text line into tokens, creates a new entity object, populates the entity's fields with data from the tokens, and then saves the object via a DAO that calls Hibernate's Session.saveOrUpdate() method. The DAO methods flush() and clear() are direct calls to the corresponding Hibernate Session methods.
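For reference, here's a minimal sketch of what that method and the DAO look like. The tab-separated token layout, the Observations field names (stationId, value), and the ObservationsDao shape are placeholders, not my exact code:

private void convertAndPersistObservationsLine(String line)
{
    // split the line into tokens (tab-separated here as an assumption)
    String[] tokens = line.split("\t");

    // create a new entity and populate its fields from the tokens
    Observations observations = new Observations();
    observations.setStationId(tokens[0]);
    observations.setValue(Double.parseDouble(tokens[1]));

    // hand the entity to the DAO, which calls Session.saveOrUpdate()
    observationsDao.saveOrUpdate(observations);
}

// the DAO methods are thin wrappers around the current Hibernate Session
public class ObservationsDao
{
    private final SessionFactory sessionFactory;

    public ObservationsDao(SessionFactory sessionFactory)
    {
        this.sessionFactory = sessionFactory;
    }

    public void saveOrUpdate(Observations observations)
    {
        sessionFactory.getCurrentSession().saveOrUpdate(observations);
    }

    public void flush()
    {
        sessionFactory.getCurrentSession().flush();
    }

    public void clear()
    {
        sessionFactory.getCurrentSession().clear();
    }
}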
The Hibernate property 'hibernate.cache.use_second_level_cache' is set to false, and the Hibernate property 'hibernate.jdbc.batch_size' is set to 50, as is the Java constant JDBC_CACHE_SIZE.
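Expressed programmatically (just to show the values; in my case they actually live in the Hibernate configuration file), those settings amount to:

// org.hibernate.cfg.Configuration
Configuration configuration = new Configuration();
// disable the second-level cache for this bulk load
configuration.setProperty("hibernate.cache.use_second_level_cache", "false");
// batch JDBC statements in groups of 50, matching the JDBC_CACHE_SIZE constant
configuration.setProperty("hibernate.jdbc.batch_size", "50");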
Can someone suggest a better way of going about this, or any tweaks to the above which may improve the performance of this batch loading program?
Thanks in advance for your help.
--James