I am often writing datascrubs that update millions of rows of data. The data resides in a 24x7x365 OLTP MySQL database using InnoDB. The updates may scrub every row of the table (in which case InnoDB ends up holding a lock on every row, effectively locking the whole table) or may just be scrubbing 10% of the rows in a table (which could still be in the millions).

To avoid creating massive transaction sizes and minimize contention I usually end up trying to break up my one massive UPDATE statement into a series of smaller UPDATE transactions. So I end up writing a looping construct which restricts my UPDATE's WHERE clause like this:

(warning: this is just pseudo-code to get the point across)

@batch_size = 10000;
@max_primary_key_value = select max(pk) from table1;

for (int i = 0; i <= @max_primary_key_value; i = i + @batch_size)
{
    start transaction;

    update IGNORE table1
    set col2 = "flag set"
    where col2 = "flag not set"
    and pk >= i                    -- inclusive lower bound so batch-boundary rows aren't skipped
    and pk < i + @batch_size;      -- each batch covers [i, i + @batch_size)

    commit;
}

This approach just plain sucks for so many reasons.

I would like to issue an UPDATE statement without the database trying to group all of the records being updated into a single transaction unit. I don't want the UPDATE to succeed or fail as a single unit of work. If half the rows fail to update... no problem, just let me know. Essentially, each row is its own unit of work, but batching or cursoring is the only way I can figure out how to represent that to the database engine.

I looked at setting isolation levels for my session, but that doesn't appear to help me in this specific case.
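
(For reference, setting the session isolation level looks like this; it changes locking and read visibility, but an UPDATE statement still commits or rolls back as one atomic unit regardless:)

-- READ COMMITTED reduces InnoDB gap locking, but it does not change
-- statement atomicity: the whole UPDATE still succeeds or fails as
-- a single unit of work.
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;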

Any other ideas out there?

+1  A: 

Maybe not the answer you are looking for, but you could simplify your code a bit by using LIMIT in the update.

Pseudo-code:

do {
  update table1 set col2 = 'flag set' where col2 = 'flag not set' LIMIT 10000
} while (ROW_COUNT() > 0)
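
(A runnable MySQL version of that loop as a stored procedure; batch_flag_update is just an illustrative name, and table1/col2 are the names from the question:)

DELIMITER //
CREATE PROCEDURE batch_flag_update()
BEGIN
    DECLARE rows_changed INT DEFAULT 1;
    -- With autocommit on, each UPDATE commits as its own transaction.
    WHILE rows_changed > 0 DO
        UPDATE table1
        SET col2 = 'flag set'
        WHERE col2 = 'flag not set'
        LIMIT 10000;
        SET rows_changed = ROW_COUNT();  -- rows changed by the UPDATE above
    END WHILE;
END //
DELIMITER ;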
Eric Petroelje
Interesting approach Eric. I like the idea, but it only works for updates whose SET clause changes a column tested in the WHERE clause (such as the example I gave), so each pass shrinks the set of matching rows. However, I have the same problem with large updates that don't fit this pattern (e.g. set col1="foo" where col2="bar").
Matthew Quinlan
@Matthew - if that's the case, you could always modify your query to be something like "set col1='foo' where col2='bar' and col1 != 'foo'" which would avoid the issue of a batch of non-modifying updates kicking you out of the loop.
Eric Petroelje
A: 

Yes Eric... you're right (for simple cases that don't include where clauses like "where not in"). I've written a small procedure that allows me to provide an SQL update statement and a limit number as parameters.

create procedure mass_update (IN updatestmt TEXT, IN batchsiz INT)
BEGIN
-- PURPOSE: break down large update statements into batches to limit transaction size and reduce contention
-- LIMITATIONS: only works with UPDATEs that would give "0 rows affected" when executed TWICE!

SET @sql = CONCAT( updatestmt," LIMIT ", batchsiz );
-- had to use CONCAT because "PREPARE stmt FROM" cannot accept dynamic LIMIT parameter
-- reference: http://forums.mysql.com/read.php?98,75640,75640#msg-75640
PREPARE stmt FROM @sql;

select @sql; -- display SQL to screen (MySQL requires a space after "--")
SET @cumrowcount=0;
SET @batchnum=0;
SET @now := now(); -- @now is a STRING variable... not a datetime

    increment: repeat
        SET @batchnum=@batchnum+1;
        EXECUTE stmt;
        set @rowcount = ROW_COUNT();
        set @cumrowcount = @cumrowcount + @rowcount;
        select @batchnum as "Iteration",
               @cumrowcount as "Cumulative Rows",
               TIMESTAMPDIFF(SECOND,STR_TO_DATE(@now,"%Y-%m-%d %H:%i:%s"),now()) as "Cumulative Seconds",
               now() as "Timestamp";
        until @rowcount <= 0
    end repeat increment;

    DEALLOCATE PREPARE stmt;  -- REQUIRED
END
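
A call looks like this (using the example UPDATE from the question; the statement is passed without its own LIMIT clause, since the procedure appends one):

call mass_update("update IGNORE table1 set col2 = 'flag set' where col2 = 'flag not set'", 10000);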

This seems to work fairly well and I can run it with any old UPDATE statement which adheres to the "running twice results in 0 rows affected" rule.

Thanks for the idea Eric!

Matthew Quinlan