Hi,
I am writing a full database extract program in Java. The database is Oracle, and it is huge: some tables have ~260 million records. The program must create one file per table in a specific format, so using Oracle Data Pump etc. is not an option. Company security policy also does not allow writing a PL/SQL procedure that creates files on the DB server for this requirement, so I have to go with Java and JDBC.
The issue I am facing is that the files for some of the tables are huge (~30GB), and I am running out of memory almost every time, even with a 20GB Java heap. Once the file being written exceeds the heap size, even with one of the most aggressive GC policies, the process seems to hang. For example, if the file size is > 20GB and the heap size is 20GB, then once heap utilization hits the maximum, writing slows to about 2MB per minute, and at that speed it will take months to get the full extract.
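For reference, the JVM is launched with roughly the following options (the exact GC flags have varied between runs, and the jar and class names here are just placeholders):
java -Xms20g -Xmx20g -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -cp extract.jar com.mycompany.FullExtract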
I am looking for some way to overcome this issue. Any help would be greatly appreciated.
Here are some details of the system configuration:
Java - JDK 1.6.0_14
System config - RH Enterprise Linux (kernel 2.6.18) running on 4 x Intel Xeon E7450 (6 cores) @ 2.39GHz
RAM - 32GB
Database - Oracle 11g
The file-writing part of the code is below:
private void runQuery(Connection conn, String query, String filePath,
        String fileName) throws SQLException, Exception {
    PreparedStatement stmt = null;
    ResultSet rs = null;
    try {
        stmt = conn.prepareStatement(query,
                ResultSet.TYPE_SCROLL_INSENSITIVE,
                ResultSet.CONCUR_READ_ONLY);
        stmt.setFetchSize(maxRecBeforWrite);
        rs = stmt.executeQuery();
        // Write query result to file
        writeDataToFile(rs, filePath + "/" + fileName,
                getRecordCount(query, conn));
    } catch (SQLException sqle) {
        sqle.printStackTrace();
    } finally {
        try {
            rs.close();
            stmt.close();
        } catch (SQLException ex) {
            throw ex;
        }
    }
}
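// Note: getRecordCount(query, conn) and getMetaData(rs) are small helpers
// not shown here; getRecordCount returns the row count (as a String) that
// goes into the trailer record. maxRecBeforWrite is an int field read from
// our configuration; the value below is just a placeholder.
// private int maxRecBeforWrite = 10000;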
private void writeDataToFile(ResultSet rs, String tempFile, String cnt)
        throws SQLException, Exception {
    FileOutputStream fileOut = null;
    int maxLength = 0;
    try {
        fileOut = new FileOutputStream(tempFile, true);
        FileChannel fcOut = fileOut.getChannel();
        List<TableMetaData> metaList = getMetaData(rs);
        maxLength = getMaxRecordLength(metaList);
        // Write Header
        writeHeaderRec(fileOut, maxLength);
        while (rs.next()) {
            // Now iterate on metaList and fetch all the column values.
            writeData(rs, metaList, fcOut);
        }
        // Write trailer
        writeTrailerRec(fileOut, cnt, maxLength);
    } catch (FileNotFoundException fnfe) {
        fnfe.printStackTrace();
    } catch (IOException ioe) {
        ioe.printStackTrace();
    } finally {
        try {
            fileOut.close();
        } catch (IOException ioe) {
            fileOut = null;
            throw new Exception(ioe.getMessage());
        }
    }
}
private void writeData(ResultSet rs, List<TableMetaData> metaList,
        FileChannel fcOut) throws SQLException, IOException {
    StringBuilder rec = new StringBuilder();
    String lf = "\n";
    for (TableMetaData tabMeta : metaList) {
        rec.append(getFormattedString(rs, tabMeta));
    }
    rec.append(lf);
    ByteBuffer byteBuf = ByteBuffer.wrap(rec.toString().getBytes("US-ASCII"));
    fcOut.write(byteBuf);
}
private String getFormattedString(ResultSet rs, TableMetaData tabMeta)
        throws SQLException, IOException {
    String colValue = null;
    // check if it is a CLOB column
    if (tabMeta.isCLOB()) {
        // Column is a CLOB, so fetch it and retrieve the first clobLimit chars.
        colValue = String.format("%-" + tabMeta.getColumnSize() + "s",
                getCLOBString(rs, tabMeta));
    } else {
        colValue = String.format("%-" + tabMeta.getColumnSize() + "s",
                rs.getString(tabMeta.getColumnName()));
    }
    return colValue;
}
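For context, the extract is driven by a simple loop over the list of tables, roughly like the sketch below (everything except runQuery is simplified, and the connection fields and file naming are placeholders):
// Simplified driver loop; names other than runQuery are placeholders.
private void extractAllTables(List<String> tableList, String outputDir)
        throws Exception {
    Connection conn = DriverManager.getConnection(jdbcUrl, dbUser, dbPassword);
    try {
        for (String table : tableList) {
            // One fixed-width extract file per table.
            runQuery(conn, "SELECT * FROM " + table, outputDir, table + ".dat");
        }
    } finally {
        conn.close();
    }
}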