I have a MyISAM table with more than 10^7 rows. When adding data to it, I have to update ~10 rows at the end. Is it faster to delete them and then insert the new ones, or is it faster to update those rows? Data that should be updated is not part of the index. What about index/data fragmentation
It is faster to update. You can also use INSERT ON DUPLICATE KEY UPDATE
INSERT INTO table (a,b,c) VALUES (1,2,3)
ON DUPLICATE KEY UPDATE c=c+1;
For more details read update documentation
Logically DELETE+ADD = 2 actions, UPDATE = 1 action. Also deleting and adding new changes records IDs on auto_increment, so if those records have relationships that would be broken, or would need updates too. I'd go for UPDATE.
UPDATE
is by far much faster.
When you UPDATE
, the table records are just being rewritten with new data.
When you DELETE
, the indexes should be updated (remember, you delete the whole row, not only the columns you need to modify) and datablocks may be moved (if you hit the PCTFREE
limit)
And all this must be done again on INSERT
.
That's why you should always use
INSERT ... ON DUPLICATE KEY UPDATE
instead of REPLACE
.
The former one is an UPDATE
operation in case of a key violation, while the latter one is DELETE
/ INSERT
.
using an update where Column='something' should use an index as long as the search criteria is in the index (whether it's a seek or scan is a completely different issue).
if you are doing these updates a lot but dont' have an index on the criteria column, i would recommend creating an index on the column that you are using. that should help speed things up.
Rather than deleting or updating data for the sake of performance, I would consider partitioning.
http://dev.mysql.com/doc/refman/5.1/en/partitioning-range.html
This will allow you to retain the data historically and not degrade performance.