We have a (currently InnoDB) table which contains roughly 500,000 rows. This represents a queue of tasks to run. It is stored in a MySQL database.

On a continual basis, at least once per second and sometimes more frequently, we select data from it and subsequently update some rows. Once per day, we prune old rows from the table.

We started getting deadlocks on the table, which brought our task processing to a standstill. The deadlocks occurred during the nightly prune run: the combination of DELETE, SELECT, and UPDATE statements meant that essentially nothing productive could happen. Unfortunately, I do not have the output of SHOW ENGINE INNODB STATUS.

I'd like to know the best option for dealing with this. Note that our code detects deadlocks and reissues the query. Also, we long ago discovered that deleting all matching rows at once was too taxing on a database table that saw a lot of activity, so we LIMIT our deletes to 10,000 rows at a time and keep on reissuing the query until all necessary rows have been pruned.
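For concreteness, the chunked prune described above can be sketched as follows; the table name, columns, and retention condition are hypothetical, since the actual schema isn't shown:

```sql
-- Hypothetical schema: task_queue(id, status, finished_at, ...).
-- Delete at most 10,000 old rows per statement; the application
-- reissues this until it affects zero rows.
DELETE FROM task_queue
WHERE  status = 'done'
  AND  finished_at < NOW() - INTERVAL 7 DAY
LIMIT  10000;
-- ROW_COUNT() = 0 after this statement means the prune is complete.
```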

I see the following options, and would like opinions on which are the best, or suggestions for other options:

  1. DELETE fewer rows at a time
  2. Use exponential backoff on our DELETEs, though I am concerned that this will not help given our specific workload
  3. LOCK TABLES as per MySQL documentation. We could probably accept blocking the SELECT and UPDATE statements for the duration of the deletes.
  4. Switch to the MyISAM table type. We chose InnoDB because we were initially using transactions on this table; that is no longer the case. I'm not familiar enough with the specifics to know if this is a viable solution.
  5. Perhaps use UPDATE LOW_PRIORITY. It may be that the DELETEs do not affect the SELECTs, only the UPDATEs, and this may be sufficient.
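Combining options 1 and 2, a stored-procedure sketch of a smaller-batch prune with a growing pause between batches might look like this (the table name, batch size, and retention window are all hypothetical):

```sql
-- Sketch only: smaller DELETE batches with an increasing pause
-- between them, so concurrent SELECTs/UPDATEs get through.
DELIMITER //
CREATE PROCEDURE prune_task_queue()
BEGIN
  DECLARE batch_rows INT DEFAULT 1;
  DECLARE pause_secs INT DEFAULT 1;
  WHILE batch_rows > 0 DO
    DELETE FROM task_queue
    WHERE  finished_at < NOW() - INTERVAL 7 DAY
    LIMIT  1000;                      -- option 1: fewer rows per batch
    SET batch_rows = ROW_COUNT();
    DO SLEEP(pause_secs);             -- option 2: back off between batches
    SET pause_secs = LEAST(pause_secs * 2, 60);
  END WHILE;
END//
DELIMITER ;
```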
A: 

Make sure that your transaction isolation level is READ COMMITTED and not REPEATABLE READ. Note that REPEATABLE READ is InnoDB's default, which is what we found on our server.

You can check by running the following:

SHOW VARIABLES LIKE 'tx%';

In order to set this, in your my.cnf file enter the following line:

tx_isolation=READ-COMMITTED
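The same setting can also be applied without a server restart; note that SET GLOBAL affects only connections opened afterwards:

```sql
SET GLOBAL tx_isolation = 'READ-COMMITTED';   -- new connections only
SET SESSION tx_isolation = 'READ-COMMITTED';  -- current connection
```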
David Rabinowitz
+2  A: 

When performing DML operations, InnoDB locks every row it scans, not just the rows that match.

Consider this table layout:

DROP TABLE t_tran;

CREATE TABLE t_tran (id INT NOT NULL PRIMARY KEY, data INT NOT NULL, KEY ix_tran_data (data)) Engine=InnoDB;

INSERT
INTO    t_tran
VALUES
(1, 1),
(2, 2),
(3, 3),
(4, 4),
(5, 5),
(6, 6),
(7, 7),
(8, 8);

START TRANSACTION;

DELETE
FROM    t_tran
WHERE   data = 2
        AND id <= 5;

In this case, MySQL selects the RANGE access path on id, which it considers cheaper than REF access on data.

In a concurrent transaction, you will be able to delete or update rows 6, 7, 8 but not rows 1 to 5 since they are locked (despite the fact that only row 2 was affected).

If you remove id <= 5 from the condition above, MySQL will scan ix_tran_data instead, and you will be able to delete any row except row 3 (the next index record after data = 2, held by InnoDB's next-key locking).

Unfortunately, you cannot control MySQL access paths in DML operations.

Best you can do is to index your conditions properly and hope that MySQL will pick these indexes.
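One workaround, given that access paths can't be hinted in DML, is to find the primary keys with a SELECT, where index hints do apply, and then delete by primary key. A sketch against the example table:

```sql
-- Sketch: pin the access path in a SELECT, then delete by PRIMARY KEY,
-- so the DELETE locks only the rows it actually touches.
SELECT id
FROM   t_tran FORCE INDEX (ix_tran_data)
WHERE  data = 2
  AND  id <= 5;

-- The application then issues, for each id returned:
DELETE FROM t_tran WHERE id = 2;
```

Note the SELECT itself takes no row locks at READ COMMITTED outside a locking read, so there is a small race between selecting the ids and deleting them; for a task queue prune that is usually acceptable.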

Quassnoi
Thank you. My sysadmin and I agree that this is probably the best fix. We've also reduced the amount of deleting per batch and added exponential backoff. We'll see if that takes care of the problem.
ChrisInEdmonton