I am working with a database of approximately a million rows, using Python to parse documents and populate a table with terms. The INSERT statements run fine, but the UPDATE statements become extremely time-consuming as the table grows.

It would be great if someone could explain this phenomenon and also tell me whether there is a faster way to do the updates.

Thanks, Arnav

+4  A: 

Sounds like you have an indexing problem. Whenever I hear about a problem that gets worse as the table grows, it makes me wonder whether you're doing a table scan every time you touch that table.

Check to see if you have a primary key and meaningful indexes on that table. Look at the WHERE clause you have on that UPDATE and make sure there's an index on those columns to make finding that record as fast as possible.
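For example, if the UPDATE finds its row by the term text, an index on that column lets the database seek straight to the row instead of scanning the whole table. A minimal sketch, with placeholder table and column names (terms, term, frequency) that you would adjust to your schema:

    -- Index the column(s) used in the UPDATE's WHERE clause.
    -- Table and column names here are placeholders.
    CREATE INDEX idx_terms_term ON terms (term);

    -- The UPDATE can then locate the row via the index rather than a table scan:
    UPDATE terms SET frequency = frequency + 1 WHERE term = 'database';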

UPDATE: Write a SELECT query using the same WHERE clause as your UPDATE and ask the database engine for its EXPLAIN PLAN. If you see a TABLE SCAN, you'll know what to do.
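A sketch of that check, again with placeholder names; the exact syntax varies by database (MySQL and PostgreSQL accept EXPLAIN directly, Oracle uses EXPLAIN PLAN FOR):

    -- Same WHERE clause as the slow UPDATE, but as a SELECT:
    EXPLAIN SELECT * FROM terms WHERE term = 'database';
    -- A full/sequential table scan in the output means the WHERE clause
    -- is not using an index.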

duffymo
Also check performance when SELECTing the same data that you are UPDATEing (with the same WHERE clauses, etc.)
streetpc
Even better: begin; explain analyze update .... ; rollback;
Milen A. Radev
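Milen's suggestion is PostgreSQL-flavoured: EXPLAIN ANALYZE actually executes the UPDATE and reports the real plan and timings, and the surrounding transaction lets you ROLLBACK so the data is left untouched. A sketch, using the same placeholder names as above:

    BEGIN;
    -- Executes the UPDATE and prints the measured plan and timing.
    EXPLAIN ANALYZE UPDATE terms SET frequency = frequency + 1 WHERE term = 'database';
    -- Undo the change so nothing is actually modified.
    ROLLBACK;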