views:

336

answers:

2

I am writing a python script that will be doing some processing on text files. As part of that process, i need to import each line of the tab-separated file into a local MS SQL Server (2008) table. I am using pyodbc and I know how to do this. However, I have a question about the best way to execute it.

I will be looping through the file, creating a cursor.execute(myInsertSQL) for each line of the file. Does anyone see any problems waiting to commit the statements until all records have been looped (i.e. doing the commit() after the loop and not inside the loop after each individual execute)? The reason I ask is that some files will have upwards of 5000 lines. I didn't know if trying to "save them up" and committing all 5000 at once would cause problems.

I am fairly new to python, so I don't know all of these issues yet.

Thanks.

A: 

If I understand what you are doing, Python is not going to be a problem. Executing a statement inside a transaction does not create cumulative state in Python. It will do so only at the database server itself.

When you commit you will need to make sure the commit occurred, since having a large batch commit may conflict with intervening changes in the database. If the commit fails, you will have to re-run the batch again.

That's the only problem that I am aware of with large batches and Python/ODBC (and it's not even really a Python problem, since you would have that problem regardless.)

Now, if you were creating all the SQL in memory, and then looping through the memory-representation, that might make more sense. Still, 5000 lines of text on a modern machine is really not that big of a deal. If you start needing to process two orders of magnitude more, you might need to rethink your process.

Christopher
Thanks for the insight. I think this info, combined with the executemany idea above will be what I need to think about.
Bryan Lewis
A: 

Create a file and use BULK INSERT. It will be faster.

nosklo
I have not used BULK INSERT before, but it is well worth looking into. Since I have to have this script done tomorrow, I will take the first approach for now, and revisit BULK INSERT when I have more time. Thanks.
Bryan Lewis
BULK INSERT may require additional privileges on the database. I know I had to ask our sys admin for them.
tgray