MySQL ResultSets are by default retrieved completely from the server before any work can be done. In cases of huge result sets this becomes unusable. I would like instead to actually retrieve the rows one by one from the server.
In Java, following the instructions here (under "ResultSet"), I create a statement like this:
stmt = conn.createStatement(java.sql.ResultSet.TYPE_FORWARD_ONLY,
java.sql.ResultSet.CONCUR_READ_ONLY);
stmt.setFetchSize(Integer.MIN_VALUE);
This works nicely in Java. My question is: is there a way to do the same in python?
One thing I tried is to limit the query to a 1000 rows at a time, like this:
start_row = 0
while True:
cursor = conn.cursor()
cursor.execute("SELECT item FROM items LIMIT %d,1000" % start_row)
rows = cursor.fetchall()
if not rows:
break
start_row += 1000
# Do something with rows...
However, this seems to get slower the higher start_row is.
And no, using fetchone()
instead of fetchall()
doesn't change anything.
Clarification:
The naive code I use to reproduce this problem looks like this:
import MySQLdb
conn = MySQLdb.connect(user="user", passwd="password", db="mydb")
cur = conn.cursor()
print "Executing query"
cur.execute("SELECT * FROM bigtable");
print "Starting loop"
row = cur.fetchone()
while row is not None:
print ", ".join([str(c) for c in row])
row = cur.fetchone()
cur.close()
conn.close()
On a ~700,000 rows table, this code runs quickly. But on a ~9,000,000 rows table it prints "Executing Query" and then hangs for a long long time. That is why it makes no difference if I use fetchone()
or fetchall()
.