Hi, I'm using SQLAlchemy on a DB2 table with 500k rows.

Using plain SQL like this:

 sql = "select * from test.test"
 result = Session.execute(sql)
 for row in result:
     pdic[row.id] = row.val1

This takes 5 minutes.

If I use ibm_db:

 sql = "select * from test.test"
 stmt = ibm_db.exec_immediate(ibm_db_conn, sql)
 result = ibm_db.fetch_both(stmt)
 while result:
     pathdic[result['ID']] = result['VAL']
     result = ibm_db.fetch_both(stmt)

This takes less than 30 seconds.

Any ideas?

+1  A: 

If you're using DB2 for Linux, UNIX, and Windows, the database has built-in tracing facilities called event monitors that capture detailed information about the SQL workload your application sends. If SQLAlchemy is accessing DB2 inefficiently, the statement event monitor will capture a different series of events for it than for the ibm_db version. The other possibility is that both versions of the program go after the DB2 data in roughly the same manner, but SQLAlchemy spends more "out of DB2" time allocating internal objects to hold the results.

I use statement event monitors that write to tables so I can search for all kinds of issues and patterns, so the link I included is to a DB2 utility that greatly simplifies defining the event monitor and the tables that will contain its output. After that, you'll just need to run

SET EVENT MONITOR YourMonitorName STATE 1

to start it, and

SET EVENT MONITOR YourMonitorName STATE 0

to turn it off. Turning it off is very important, since every single SQL statement executed while the monitor is on generates 3 to 5 rows of data in the event monitor table.
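
To check the second possibility (time spent outside DB2), Python's built-in cProfile can show where the client-side time goes. Here is a minimal sketch, not a definitive recipe. It assumes an in-memory SQLite table as a stand-in for test.test so that it runs without DB2, and load_dict and profile_call are hypothetical helper names, not part of SQLAlchemy or ibm_db.

```python
import cProfile
import pstats
import sqlite3
import time


def load_dict(rows):
    """Build {id: val} from an iterable of (id, val) rows."""
    return {row[0]: row[1] for row in rows}


def profile_call(func, *args):
    """Run func under cProfile; return (result, elapsed_seconds, stats)."""
    profiler = cProfile.Profile()
    start = time.perf_counter()
    result = profiler.runcall(func, *args)
    elapsed = time.perf_counter() - start
    return result, elapsed, pstats.Stats(profiler)


# Stand-in data source (assumption): SQLite instead of the DB2 table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, val1 TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    ((i, f"v{i}") for i in range(10000)),
)

cursor = conn.execute("SELECT id, val1 FROM test")
pdic, elapsed, stats = profile_call(load_dict, cursor)

print(f"{len(pdic)} rows loaded in {elapsed:.3f}s")
# Show the ten most expensive calls by cumulative time.
stats.sort_stats("cumulative").print_stats(10)
```

Wrapping both the Session.execute loop and the ibm_db loop the same way should indicate whether the extra minutes go to the driver's fetch calls or to Python-side row handling.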

Fred Sobotka