views:

425

answers:

2

I'm developing a Spring application that uses large MySQL tables. When loading large tables, I get an OutOfMemoryException, since the driver tries to load the entire table into application memory.

I tried using

statement.setFetchSize(Integer.MIN_VALUE);

but then every ResultSet I open hangs on close(); looking online I found that this happens because the driver tries to load any unread rows before closing the ResultSet, but that is not the case here, since I do this:

ResultSet existingRecords = getTableData(tablename);
try {
    while (existingRecords.next()) {
        // ...
    }
} finally {
    existingRecords.close(); // this line is hanging, and there was no exception in the try clause
}

The hangs happen for small tables (3 rows) as well, and if I don't close the ResultSet (which happened in one method), then connection.close() hangs.


Stack trace of the hang:

SocketInputStream.socketRead0(FileDescriptor, byte[], int, int, int) line: not available [native method]
SocketInputStream.read(byte[], int, int) line: 129
ReadAheadInputStream.fill(int) line: 113
ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(byte[], int, int) line: 160
ReadAheadInputStream.read(byte[], int, int) line: 188
MysqlIO.readFully(InputStream, byte[], int, int) line: 2428
MysqlIO.reuseAndReadPacket(Buffer, int) line: 2882
MysqlIO.reuseAndReadPacket(Buffer) line: 2871
MysqlIO.checkErrorPacket(int) line: 3414
MysqlIO.checkErrorPacket() line: 910
MysqlIO.nextRow(Field[], int, boolean, int, boolean, boolean, boolean, Buffer) line: 1405
RowDataDynamic.nextRecord() line: 413
RowDataDynamic.next() line: 392
RowDataDynamic.close() line: 170
JDBC4ResultSet(ResultSetImpl).realClose(boolean) line: 7473
JDBC4ResultSet(ResultSetImpl).close() line: 881
DelegatingResultSet.close() line: 152
DelegatingResultSet.close() line: 152
DelegatingPreparedStatement(DelegatingStatement).close() line: 163
(This is my class) Database.close() line: 84

+1  A: 

Setting only the fetch size is not the correct approach. The javadoc of Statement#setFetchSize() already states the following:

Gives the JDBC driver a hint as to the number of rows that should be fetched from the database

The driver is actually free to apply or ignore the hint. Some drivers ignore it, some apply it directly, and some need more parameters. The MySQL JDBC driver falls into the last category. If you check the MySQL JDBC driver documentation, you'll see the following information (scroll about two-thirds of the way down, to the header ResultSet):

To enable this functionality, you need to create a Statement instance in the following manner:

stmt = conn.createStatement(java.sql.ResultSet.TYPE_FORWARD_ONLY, java.sql.ResultSet.CONCUR_READ_ONLY);
stmt.setFetchSize(Integer.MIN_VALUE);

Please read the entire section of the document; it describes the caveats of this approach as well. Here's a relevant quote:

There are some caveats with this approach. You will have to read all of the rows in the result set (or close it) before you can issue any other queries on the connection, or an exception will be thrown.

(...)

If the statement is within scope of a transaction, then locks are released when the transaction completes (which implies that the statement needs to complete first). As with most other databases, statements are not complete until all the results pending on the statement are read or the active result set for the statement is closed.

If that doesn't fix the OutOfMemoryError (not Exception), then the problem is likely that you're storing all the data in Java's memory instead of processing it immediately as the data comes in. This would require more changes in your code, maybe a complete rewrite. I've answered a similar question before here.
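For completeness, here is a minimal sketch of what such a streaming read could look like end to end, assuming a plain JDBC Connection obtained elsewhere; the table, column, and method names are made up for illustration:

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class StreamingReadExample {

    // Reads a potentially huge table row by row without buffering it all in memory.
    public static void streamTable(Connection conn) throws SQLException {
        Statement stmt = null;
        ResultSet rs = null;
        try {
            // A forward-only, read-only statement is required for MySQL row streaming.
            stmt = conn.createStatement(ResultSet.TYPE_FORWARD_ONLY,
                                        ResultSet.CONCUR_READ_ONLY);
            // Integer.MIN_VALUE tells the MySQL driver to stream rows one at a time.
            stmt.setFetchSize(Integer.MIN_VALUE);

            rs = stmt.executeQuery("SELECT id, payload FROM big_table"); // hypothetical table
            while (rs.next()) {
                // Process each row immediately; collecting rows into a List would
                // just move the OutOfMemoryError into your own code.
                process(rs.getLong("id"), rs.getString("payload"));
            }
        } finally {
            // Release in reverse order of acquisition; the connection is only usable
            // for other queries once the streaming result set has been fully closed.
            if (rs != null) try { rs.close(); } catch (SQLException ignored) { }
            if (stmt != null) try { stmt.close(); } catch (SQLException ignored) { }
        }
    }

    private static void process(long id, String payload) {
        // Placeholder for per-row processing.
    }
}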

BalusC
This seems oddly familiar; it's a copy+paste of your answer in http://stackoverflow.com/questions/2095490/. And it's irrelevant, since it tells me to do what I already said I did, which caused me problems - in the question itself.
configurator
Why am I not allowed to copy-paste the relevant parts of my own words? Further, you didn't say that you created the statement as per the MySQL JDBC documentation. I've too often seen people make the mistake of setting *only* the fetch size.
BalusC
@configurator you seem to be scolding BalusC for giving you an answer (together with your downvote). +1 since I was going to answer the same thing (about the `TYPE_FORWARD_ONLY`)
Bozho
@Bozho: thanks. @configurator: where's your response? Anyway, I just wanted to say sorry if you don't value copy-pastes, but I couldn't find the specific MySQL link within a second, while I could find my answer within a second, so I used it and thought it was easiest to copy-paste part of it, as I was going to type almost exactly the same answer. If your downvote is actually because you hate me for other (obvious?) reasons, then please say so. I have no problem ignoring you in your future questions :)
BalusC
@Balus: I don't hate you; I don't even know you. The downvote was because you told me to do the same thing I already did; the createStatement parameters you specified are even the default values. Never mind, I've retracted my downvote since you don't deserve it.
configurator
Fair enough. How about the connection being part of a bigger picture, e.g. a transaction? The docs also state that this ain't going to work. How about processing of the data? It should not be kept entirely in Java's memory.
BalusC
Streaming actually works; only closing the ResultSets is the issue, which has just been solved. You can see the solution in my answer.
configurator
A: 

Don't close your ResultSets twice.

Apparently, when closing a Statement it attempts to close the corresponding ResultSet, as you can see in these two lines from the stack trace:

DelegatingResultSet.close() line: 152
DelegatingPreparedStatement(DelegatingStatement).close() line: 163

I had thought the hang was in ResultSet.close() but it was actually in Statement.close() which calls ResultSet.close(). Since the ResultSet was already closed, it just hung.

We've replaced all ResultSet.close() calls with results.getStatement().close() and removed all Statement.close() calls, and the problem is now solved.
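For illustration, a rough sketch of the resulting pattern, with a hypothetical getTableData helper standing in for the real one:

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class SingleCloseExample {

    private final Connection connection;

    public SingleCloseExample(Connection connection) {
        this.connection = connection;
    }

    // Hypothetical helper: opens a streaming ResultSet for the given table.
    // (Concatenating the table name is for illustration only.)
    private ResultSet getTableData(String tablename) throws SQLException {
        Statement stmt = connection.createStatement(ResultSet.TYPE_FORWARD_ONLY,
                                                    ResultSet.CONCUR_READ_ONLY);
        stmt.setFetchSize(Integer.MIN_VALUE);
        return stmt.executeQuery("SELECT * FROM " + tablename);
    }

    public void readExistingRecords(String tablename) throws SQLException {
        ResultSet existingRecords = getTableData(tablename);
        try {
            while (existingRecords.next()) {
                // process the row...
            }
        } finally {
            // Close only the owning Statement; it closes its ResultSet exactly once,
            // avoiding the double close that hung in the original code.
            existingRecords.getStatement().close();
        }
    }
}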

configurator
Glad you fixed it. However, the normal JDBC idiom is to close the resources in **reversed** order from how you acquired them. Some JDBC drivers (including the MySQL one) do indeed implicitly try to close any open "child" resources. Thus, after opening a Connection, Statement, and ResultSet in this order, you need to close the ResultSet, Statement, and Connection in that order. You shouldn't leave the statement open; it may leak resources, especially when you're using connection pooling, wherein the actual connection won't be directly closed. BTW: I find it odd that it threw an OOME instead of an SQLException.
BalusC
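As an illustration of the reverse-order close idiom mentioned in the comment above, here is a rough sketch using plain JDBC; the URL, credentials, and table name are placeholders:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class ReverseOrderCloseExample {

    // Acquire Connection, Statement, ResultSet in that order and release them
    // in the reverse order, each in its own finally block.
    public static void dumpTable(String url, String user, String password) throws SQLException {
        Connection conn = DriverManager.getConnection(url, user, password);
        try {
            Statement stmt = conn.createStatement();
            try {
                ResultSet rs = stmt.executeQuery("SELECT id FROM some_table"); // hypothetical table
                try {
                    while (rs.next()) {
                        System.out.println(rs.getLong("id"));
                    }
                } finally {
                    rs.close();   // 1. ResultSet first
                }
            } finally {
                stmt.close();     // 2. then Statement
            }
        } finally {
            conn.close();         // 3. Connection last (returned to the pool when pooling is used)
        }
    }
}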
The OOME was when not streaming; when streaming, it just hung on close. The 'proper' way to do this is the first thing we tried, and the one that didn't work: we closed the ResultSet, then got hung closing the statement (which tried re-closing the ResultSet). Now we only close the statement, and hopefully we won't get memory leaks. Thanks for your tips.
configurator