The typical documentation on locking in InnoDB is far too confusing. I think it would be of great value to have a "dummies guide to InnoDB locking".

I will start, and I will gather all responses as a wiki:

  • The column needs to be indexed before row-level locking applies.
    • EXAMPLE: DELETE FROM t WHERE column1 = 10; will lock up the whole table unless column1 is indexed (see the sketch after this list).
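
A minimal sketch, with a hypothetical table t. In 5.1 EXPLAIN works only for SELECT, so the usual trick for checking the plan of a DELETE or UPDATE is to EXPLAIN an equivalent SELECT:

    -- Without an index on column1 this DELETE scans the whole table,
    -- and InnoDB locks every row it scans:
    DELETE FROM t WHERE column1 = 10;

    -- With an index, only the matching index range is locked:
    CREATE INDEX idx_column1 ON t (column1);

    -- Check the plan via an equivalent SELECT; "key: idx_column1"
    -- means the index is used, "type: ALL" means a full scan.
    EXPLAIN SELECT * FROM t WHERE column1 = 10;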

Here are my notes from working with MySQL support on a recent, strange locking issue (version 5.1.37):

All rows and index entries traversed to get to the rows being changed will be locked. It's covered at:

http://dev.mysql.com/doc/refman/5.1/en/innodb-locks-set.html

"A locking read, an UPDATE, or a DELETE generally set record locks on every index record that is scanned in the processing of the SQL statement. It does not matter whether there are WHERE conditions in the statement that would exclude the row. InnoDB does not remember the exact WHERE condition, but only knows which index ranges were scanned. ... If you have no indexes suitable for your statement and MySQL must scan the entire table to process the statement, every row of the table becomes locked, which in turn blocks all inserts by other users to the table."

That is a MAJOR headache if true.

It is. A workaround that is often helpful is to do:

UPDATE whichevertable SET whatever = something WHERE primarykey IN (SELECT primarykey FROM whichevertable WHERE constraints ORDER BY primarykey);

The inner select doesn't need to take locks, and the update then has less work to do. The ORDER BY clause ensures that the updates are applied in primary key order, matching InnoDB's physical row order, which is the fastest way to do it.
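
One caveat worth flagging: MySQL rejects a subquery that selects from the table being updated (error 1093, "You can't specify target table for update in FROM clause"), so in practice the inner select usually needs wrapping in a derived table. A concrete sketch, with hypothetical table and column names:

    -- "orders", "status", "order_id" and "created" are made-up names;
    -- the extra derived table (AS picked) works around MySQL's
    -- error 1093 on subqueries that read the update's target table.
    UPDATE orders
    SET status = 'archived'
    WHERE order_id IN (
        SELECT order_id FROM (
            SELECT order_id
            FROM orders
            WHERE created < '2009-01-01'
            ORDER BY order_id
        ) AS picked
    );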

Where large numbers of rows are involved, as in your case, it can be better to store the select result in a temporary table with a flag column added. Then select from the temporary table where the flag is not set to get each batch. Run the updates with a limit of, say, 1000 or 10000, and set the flag for each batch after its update. The limits keep the amount of locking at a tolerable level, while the select work only has to be done once. Commit after each batch to release the locks.
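
A rough sketch of that batching pattern, reusing the hypothetical orders table from above (the flag values and batch size are illustrative):

    -- Do the expensive select once, into a temporary table with a
    -- flag column: 0 = pending, 1 = current batch, 2 = finished.
    CREATE TEMPORARY TABLE todo (
        order_id INT NOT NULL PRIMARY KEY,
        done     TINYINT NOT NULL DEFAULT 0
    );
    INSERT INTO todo (order_id)
        SELECT order_id FROM orders WHERE created < '2009-01-01';

    -- Repeat this block until it claims no more rows. The LIMIT keeps
    -- each round of locking short; the COMMIT releases the locks
    -- (assumes autocommit is off).
    UPDATE todo SET done = 1
    WHERE done = 0
    ORDER BY order_id
    LIMIT 1000;

    UPDATE orders
    SET status = 'archived'
    WHERE order_id IN (SELECT order_id FROM todo WHERE done = 1);

    UPDATE todo SET done = 2 WHERE done = 1;
    COMMIT;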

You can also speed this work up by doing a SELECT SUM() of an unindexed column before each batch of updates. This loads the data pages into the buffer pool without taking locks. The locking then lasts for a shorter time because there won't be any disk reads.

This isn't always practical but when it is it can be very helpful. If you can't do it in batches you can at least try the select first to preload the data, if it's small enough to fit into the buffer pool.
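
A sketch of that warm-up read, again with hypothetical names; any unindexed column works, since the point is just to force InnoDB to read the full row pages:

    -- A plain SELECT is a consistent non-locking read in InnoDB; the
    -- SUM result is thrown away, it exists only to pull the relevant
    -- pages into the buffer pool before the locking UPDATE runs.
    SELECT SUM(some_unindexed_column)
    FROM orders
    WHERE created < '2009-01-01';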

If possible, use the READ COMMITTED transaction isolation level. See:

http://dev.mysql.com/doc/refman/5.1/en/set-transaction.html

Getting that reduced locking requires row-based binary logging (rather than the default statement-based binary logging).
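
As a sketch, assuming a 5.1 server and a connection with the SUPER privilege (binlog_format is dynamic in 5.1):

    -- For the session doing the bulk update:
    SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
    SET SESSION binlog_format = 'ROW';

    -- Or server-wide in my.cnf:
    -- [mysqld]
    -- transaction-isolation = READ-COMMITTED
    -- binlog-format = ROW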

Two known issues:

  1. Subqueries can sometimes be less than ideally optimised. In this case it was an undesirable dependent subquery: the suggestion I made to use a subquery turned out to be unhelpful compared to the alternative because of that.

  2. Deletes and updates do not have the same range of query plans as SELECT statements, so it is sometimes hard to optimise them properly without measuring the results to work out exactly what they are doing.

Both of these are gradually improving. This bug is one example where we've just improved the optimisations available for an UPDATE, though the changes are significant and still going through QA to be sure they don't have any serious adverse effects:

http://bugs.mysql.com/bug.php?id=36569

Eric J.