Hi,
I am consuming a high-rate data stream and storing the data in a MySQL database. For each newly arriving item I do the following:
- (1) Parse the incoming item.
- (2) Execute several "INSERT ... ON DUPLICATE KEY UPDATE" statements.
I use INSERT ... ON DUPLICATE KEY UPDATE to eliminate one additional round-trip to the database (the SELECT that would otherwise be needed to decide between a plain INSERT and an UPDATE).
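For reference, the per-item code currently looks roughly like this (the connection parameters, table, and column names are made up for illustration):

```
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('DBI:mysql:database=mydb;host=localhost',
                       'user', 'password', { RaiseError => 1 })
    or die $DBI::errstr;

# One prepared statement, executed once per arriving item; the unique
# key on `id` decides whether the row is inserted or updated.
my $sth = $dbh->prepare(q{
    INSERT INTO items (id, payload, hits)
    VALUES (?, ?, 1)
    ON DUPLICATE KEY UPDATE payload = VALUES(payload), hits = hits + 1
});

sub store_item {
    my ($id, $payload) = @_;
    $sth->execute($id, $payload);
}
```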
While trying to improve the overall performance, I have considered doing bulk updates in the following way:
- (1) Parse the incoming item.
- (2) Generate the "INSERT ... ON DUPLICATE KEY UPDATE" statements and append them to a file.
- (3) Periodically flush the SQL statements in the file to the database (rough sketch below).
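Here is a rough sketch of what I mean, reusing the made-up `items` table from above (the batch file path and the way I split statements back out of the file are arbitrary choices of mine):

```
use strict;
use warnings;
use DBI;

my $batch_file = '/tmp/pending_statements.sql';

# Step (2): render the statement with properly quoted values and
# append it to the batch file, one statement per ";\n" record.
sub queue_item {
    my ($dbh, $id, $payload) = @_;
    open my $fh, '>>', $batch_file or die "open: $!";
    printf {$fh}
        "INSERT INTO items (id, payload, hits) VALUES (%s, %s, 1) "
      . "ON DUPLICATE KEY UPDATE payload = VALUES(payload), hits = hits + 1;\n",
        $dbh->quote($id), $dbh->quote($payload);
    close $fh;
}

# Step (3): replay the accumulated statements, then empty the file.
# Note the naive splitting on ";\n" breaks if a quoted value happens
# to contain that sequence.
sub flush_batch {
    my ($dbh) = @_;
    return unless -s $batch_file;
    open my $fh, '<', $batch_file or die "open: $!";
    local $/ = ";\n";                 # one statement per record
    while (my $stmt = <$fh>) {
        chomp $stmt;                  # strip the trailing ";\n"
        $dbh->do($stmt) if $stmt =~ /\S/;
    }
    close $fh;
    truncate $batch_file, 0 or die "truncate: $!";
}
```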
Two questions:
- (1) Will this have a positive impact on the database load?
- (2) How should I flush the statements to the database so that the indices are only rebuilt after the complete flush? (Using transactions? A rough sketch of what I mean follows.)
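Regarding (2), this is the kind of thing I had in mind, although I am not sure a transaction actually defers any index maintenance:

```
# Wrap the flush from the sketch above in a single transaction, so the
# whole batch either applies or rolls back. Whether this also delays
# index rebuilds until commit is exactly what I am unsure about.
sub flush_batch_in_txn {
    my ($dbh) = @_;
    $dbh->begin_work;                 # AutoCommit off until commit/rollback
    eval {
        flush_batch($dbh);            # replay the batch file
        $dbh->commit;
    };
    if ($@) {
        warn "flush failed, rolling back: $@";
        $dbh->rollback;
    }
}
```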
UPDATE: I am using Perl DBI + MySQL MyISAM.
Thanks in advance for any comments.