views:

121

answers:

1

This loop checks if a record is in the sqlite database and builds a list of dictionaries for those records that are missing and then executes a multiple insert statement with the list. This works but it is very slow (at least i think it is slow) as it takes 5 minutes to loop over 3500 queries. I am a complete newbie in python, sqlite and sqlalchemy so I wonder if there is a faster way of doing this.

list_dict = []

session = Session()

for data in data_list:
    if session.query(Class_object).filter(Class_object.column_name_01 == data[2]).filter(Class_object.column_name_00 == an_id).count() == 0:
        list_dict.append({'column_name_00':a_id,
                          'column_name_01':data[2]})

conn = engine.connect()
conn.execute(prices.insert(),list_dict)
conn.close()
session.close()

edit: I moved session = Session() outside the loop. Did not make a difference.

SOLUTION:

thanks to mcabral answer I modified the code as:

existing_record_list = []
list_dict = []

conn = engine.connect()
s = select([prices.c.column_name_01], prices.c.column_name_00==a_id)
result = conn.execute(s) 
for row in result:       
    existing_record_list.append(row[0])

for data in raw_data['data']:
    if data[2] not in existing_record_list:
        list_dict.append({'column_name_00':a_id,
                          'column_name_01':data[2]}

conn = engine.connect()
conn.execute(prices.insert(),list_dict)
conn.close()

This now takes 6 seconds. That is some improvement!!

+3  A: 

3500 queries seems a big number,

Have you consider fetching all entities in one query? Then you will be iterating over a list in memory, and not querying the database for each item.

mcabral
Thanks. I attached to the question a solution inspired by your suggestion.
LtPinback