Hi, I'm using the following code:

    st = connection.createStatement(
            ResultSet.TYPE_FORWARD_ONLY,
            ResultSet.CONCUR_READ_ONLY);
    st.setFetchDirection(ResultSet.FETCH_FORWARD);
    st.setFetchSize(1000);
    System.out.println("start query");
    rs = st.executeQuery(queryString);
    System.out.println("done query");

The query returns a lot of rows (800k), and it takes a long time (~2 minutes) between printing "start query" and "done query". When I manually put a LIMIT 10000 in the query, there's no delay between "start" and "done". Processing the results takes time, so I guess it would be faster overall if the driver just fetched 1k rows from the database, let me process those, and fetched new ones in the background when it runs out.

The ResultSet.CONCUR_READ_ONLY etc. were my last guess; am I missing something?

(It's a PostgreSQL 8.3 server.)
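For context, the loop that consumes the result set looks roughly like this (processRow is just a stand-in for my actual per-row work):

    while (rs.next()) {
        // ideally the driver pulls ~1000 rows per round trip here,
        // so processing can start long before all 800k rows have arrived
        processRow(rs);  // placeholder for the real processing
    }
    rs.close();
    st.close();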

A: 

This will depend on your driver. From the docs:

Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed. The number of rows specified affects only result sets created using this statement. If the value specified is zero, then the hint is ignored. The default value is zero.

Note that it says "a hint" - I would take that to mean that a driver can ignore the hint if it really wants to... and it sounds like that's what's happening.

Jon Skeet
Yeah, I read that, but I couldn't believe that JDBC + PostgreSQL would ignore it; it's been around for ages.
kresjer
@kresjer: It's PostgreSQL, so you should have access to the source for both the DB and the JDBC driver. You could use it to figure out what is actually going on...
Stephen C
+2  A: 

Try turning auto-commit off. The PostgreSQL driver only uses a cursor to fetch rows in batches when the statement runs inside a transaction; with auto-commit on, it retrieves the entire result set up front, which is why setFetchSize appears to be ignored:

    // make sure autocommit is off
    connection.setAutoCommit(false);

    st = connection.createStatement();
    st.setFetchSize(1000);
    System.out.println("start query");
    rs = st.executeQuery(queryString);
    System.out.println("done query");

Reference
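Putting it all together, a minimal self-contained sketch (the connection URL, credentials, and big_table are placeholders, and it assumes the PostgreSQL JDBC driver is on the classpath):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class CursorFetchDemo {
        public static void main(String[] args) throws Exception {
            // placeholder connection details
            Connection connection = DriverManager.getConnection(
                    "jdbc:postgresql://localhost/mydb", "user", "pass");
            connection.setAutoCommit(false);  // required for cursor-based fetching

            Statement st = connection.createStatement();
            st.setFetchSize(1000);  // hint: 1000 rows per round trip

            System.out.println("start query");
            ResultSet rs = st.executeQuery("SELECT * FROM big_table");
            System.out.println("done query");  // should return quickly: only the first batch was fetched

            int count = 0;
            while (rs.next()) {
                count++;  // each exhausted batch triggers a fetch of the next 1000 rows
            }
            System.out.println("rows: " + count);

            rs.close();
            st.close();
            connection.commit();
            connection.close();
        }
    }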

dogbane
A: 

The two queries do entirely different things.

Using the LIMIT clause caps the result set at 10,000 rows. Setting the fetch size does not; it only gives the driver a hint about how many rows to fetch per round trip while iterating through the result set, which still contains all 800k rows.

So when using setFetchSize alone, the database still creates the full result set; that's why it takes so long.

Edit for clarity: Setting the fetch size does nothing unless you iterate through the result (see Jon's comment), but creating a much smaller result set via LIMIT makes a big difference.
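The contrast in code, roughly (big_table and process are placeholders):

    // LIMIT shrinks the result set itself: the server only produces 10k rows.
    rs = st.executeQuery("SELECT * FROM big_table LIMIT 10000");

    // setFetchSize leaves the result set at 800k rows; it only controls how
    // many of them cross the wire per round trip while you iterate.
    st.setFetchSize(1000);
    rs = st.executeQuery("SELECT * FROM big_table");
    while (rs.next()) {
        process(rs);  // placeholder for per-row work
    }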

Henning
But he *isn't* iterating through the result set. The idea is that *while* you iterate through the result set, ideally it should fetch 1000 results over the network, then you can process those, and when you get to the 1001st it will then fetch the next 1000 etc.
Jon Skeet
Of course. The difference in speed is because of the different sizes of the result sets, 10,000 vs. 800,000.
Henning