Hi,

I have to deal with a large result set (hundreds of thousands of rows, sometimes more).
Unfortunately, they all need to be retrieved at once (on start up).

I'm trying to do that using as little memory as possible.
Looking on SO, I've found that an SSCursor might be what I'm looking for,
but I still don't really know how to use it.

Is doing a fetchall() from a base cursor the same as from an SSCursor (in terms of memory usage)?
Can I 'stream' rows from the SSCursor one by one (or a few at a time), and if so,
what is the best way to do so?

Thanks in advance :-)

+1  A: 

Definitely use the SSCursor when fetching big result sets. It made a huge difference for me when I had a similar problem. You can use it like this:

import MySQLdb
import MySQLdb.cursors

connection = MySQLdb.connect(
        host=host, port=port, user=username, passwd=password, db=database, 
        cursorclass=MySQLdb.cursors.SSCursor) # put the cursorclass here
cursor = connection.cursor()

Now you can execute your query with cursor.execute() and use the cursor as an iterator.
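If you would rather pull rows a few at a time than strictly one by one, `fetchmany()` also works here. A minimal sketch (the helper name `iter_rows` and the batch size of 1000 are arbitrary choices; the helper works with any DB-API cursor):

```python
def iter_rows(cursor, size=1000):
    """Yield rows from a DB-API cursor in batches of `size`,
    keeping at most `size` rows in memory at a time."""
    while True:
        rows = cursor.fetchmany(size)
        if not rows:  # empty sequence means the result set is exhausted
            break
        for row in rows:
            yield row
```

With an SSCursor, rows are streamed from the server as you consume them, so memory use stays proportional to the batch size rather than to the full result set.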

Edit: removed unnecessary homegrown iterator, thanks Denis!

Otto Allmendinger
Cursor object is iterable, so no need to write generator over it. Otherwise you can use `iter(cursor.fetchone, None)`.
Denis Otkidach
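To make that idiom concrete: `iter(callable, sentinel)` keeps calling the callable until it returns the sentinel, and `fetchone()` returns `None` once the result set is exhausted. A quick sketch using a stand-in cursor (a real MySQLdb cursor behaves the same way):

```python
# Stand-in cursor: fetchone() returns None when rows run out,
# mimicking a DB-API cursor.
class FakeCursor:
    def __init__(self, rows):
        self._it = iter(rows)

    def fetchone(self):
        return next(self._it, None)

cursor = FakeCursor([("a",), ("b",)])
# iter(callable, sentinel): call fetchone() until it returns None.
for row in iter(cursor.fetchone, None):
    print(row)
```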
+3  A: 

I am in agreement with Otto Allmendinger's answer, but to make explicit Denis Otkidach's comment, here is how you can iterate over the results without using Otto's fetch() function:

import MySQLdb
import MySQLdb.cursors

connection = MySQLdb.connect(
    host="thehost", user="theuser",
    passwd="thepassword", db="thedb",
    cursorclass=MySQLdb.cursors.SSCursor)
cursor = connection.cursor()
cursor.execute(query)  # query is your SQL string
for row in cursor:
    print(row)
unutbu
I guess that was what I was looking for, thanks
Sylvain