views:

499

answers:

14

I have a query that is returning in vastly different amounts of time between 2 datasets. For one set (database A) it returns in a few seconds, for the other (database B)....well I haven't waited long enough yet, but over 10 minutes. I have dumped both of these databases to my local machine where I can reproduce the issue running MySQL 5.1.37.

Curiously, database B is smaller than database A.

A stripped down version of the query that reproduces the problem is:

SELECT * FROM po_shipment ps 
JOIN po_shipment_item psi USING (ship_id) 
JOIN po_alloc pa ON ps.ship_id = pa.ship_id AND pa.UID_items = psi.UID_items 
JOIN po_header ph ON pa.hdr_id = ph.hdr_id 
LEFT JOIN EVENT_TABLE ev0 ON ev0.TABLE_ID1 = ps.ship_id AND ev0.EVENT_TYPE = 'MAS0' 
LEFT JOIN EVENT_TABLE ev1 ON ev1.TABLE_ID1 = ps.ship_id AND ev1.EVENT_TYPE = 'MAS1' 
LEFT JOIN EVENT_TABLE ev2 ON ev2.TABLE_ID1 = ps.ship_id AND ev2.EVENT_TYPE = 'MAS2' 
LEFT JOIN EVENT_TABLE ev3 ON ev3.TABLE_ID1 = ps.ship_id AND ev3.EVENT_TYPE = 'MAS3' 
LEFT JOIN EVENT_TABLE ev4 ON ev4.TABLE_ID1 = ps.ship_id AND ev4.EVENT_TYPE = 'MAS4' 
LEFT JOIN EVENT_TABLE ev5 ON ev5.TABLE_ID1 = ps.ship_id AND ev5.EVENT_TYPE = 'MAS5' 
WHERE ps.eta >= '2010-03-22'
GROUP BY ps.ship_id
LIMIT 100;

The EXPLAIN query plan for the first database (A) that returns in ~2 seconds is:

+----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+------------------------------+------+----------------------------------------------+
| id | select_type | table | type   | possible_keys                                                                                                                          | key                              | key_len | ref                          | rows | Extra                                        |
+----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+------------------------------+------+----------------------------------------------+
|  1 | SIMPLE      | ps    | range  | PRIMARY,IX_ETA_DATE                                                                                                                    | IX_ETA_DATE                      | 4       | NULL                         |  174 | Using where; Using temporary; Using filesort | 
|  1 | SIMPLE      | ev0   | ref    | IX_EVENT_ID_EVENT_TYPE                                                                                                                 | IX_EVENT_ID_EVENT_TYPE           | 36      | UNIVIS_PROD.ps.ship_id,const |    1 |                                              | 
|  1 | SIMPLE      | ev1   | ref    | IX_EVENT_ID_EVENT_TYPE                                                                                                                 | IX_EVENT_ID_EVENT_TYPE           | 36      | UNIVIS_PROD.ps.ship_id,const |    1 |                                              | 
|  1 | SIMPLE      | ev2   | ref    | IX_EVENT_ID_EVENT_TYPE                                                                                                                 | IX_EVENT_ID_EVENT_TYPE           | 36      | UNIVIS_PROD.ps.ship_id,const |    1 |                                              | 
|  1 | SIMPLE      | ev3   | ref    | IX_EVENT_ID_EVENT_TYPE                                                                                                                 | IX_EVENT_ID_EVENT_TYPE           | 36      | UNIVIS_PROD.ps.ship_id,const |    1 |                                              | 
|  1 | SIMPLE      | ev4   | ref    | IX_EVENT_ID_EVENT_TYPE                                                                                                                 | IX_EVENT_ID_EVENT_TYPE           | 36      | UNIVIS_PROD.ps.ship_id,const |    1 |                                              | 
|  1 | SIMPLE      | ev5   | ref    | IX_EVENT_ID_EVENT_TYPE                                                                                                                 | IX_EVENT_ID_EVENT_TYPE           | 36      | UNIVIS_PROD.ps.ship_id,const |    1 |                                              | 
|  1 | SIMPLE      | psi   | ref    | PRIMARY,IX_po_shipment_item_po_shipment1,FK_po_shipment_item_po_shipment1                                                              | IX_po_shipment_item_po_shipment1 | 4       | UNIVIS_PROD.ps.ship_id       |    1 |                                              | 
|  1 | SIMPLE      | pa    | ref    | IX_po_alloc_po_shipment_item2,IX_po_alloc_po_details_old,FK_po_alloc_po_shipment1,FK_po_alloc_po_shipment_item1,FK_po_alloc_po_header1 | FK_po_alloc_po_shipment1         | 4       | UNIVIS_PROD.psi.ship_id      |    5 | Using where                                  | 
|  1 | SIMPLE      | ph    | eq_ref | PRIMARY,IX_HDR_ID                                                                                                                      | PRIMARY                          | 4       | UNIVIS_PROD.pa.hdr_id        |    1 |                                              | 
+----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+------------------------------+------+----------------------------------------------+

The EXPLAIN query plan for the second database (B) that returns in >600 seconds is:

+----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+--------------------------------+------+----------------------------------------------+
| id | select_type | table | type   | possible_keys                                                                                                                          | key                              | key_len | ref                            | rows | Extra                                        |
+----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+--------------------------------+------+----------------------------------------------+
|  1 | SIMPLE      | ps    | range  | PRIMARY,IX_ETA_DATE                                                                                                                    | IX_ETA_DATE                      | 4       | NULL                           |   38 | Using where; Using temporary; Using filesort | 
|  1 | SIMPLE      | psi   | ref    | PRIMARY,IX_po_shipment_item_po_shipment1,FK_po_shipment_item_po_shipment1                                                              | IX_po_shipment_item_po_shipment1 | 4       | UNIVIS_DEV01.ps.ship_id        |    1 |                                              | 
|  1 | SIMPLE      | ev0   | ref    | IX_EVENT_ID_EVENT_TYPE                                                                                                                 | IX_EVENT_ID_EVENT_TYPE           | 36      | UNIVIS_DEV01.psi.ship_id,const |    1 |                                              | 
|  1 | SIMPLE      | ev1   | ref    | IX_EVENT_ID_EVENT_TYPE                                                                                                                 | IX_EVENT_ID_EVENT_TYPE           | 36      | UNIVIS_DEV01.psi.ship_id,const |    1 |                                              | 
|  1 | SIMPLE      | ev2   | ref    | IX_EVENT_ID_EVENT_TYPE                                                                                                                 | IX_EVENT_ID_EVENT_TYPE           | 36      | UNIVIS_DEV01.ps.ship_id,const  |    1 |                                              | 
|  1 | SIMPLE      | ev3   | ref    | IX_EVENT_ID_EVENT_TYPE                                                                                                                 | IX_EVENT_ID_EVENT_TYPE           | 36      | UNIVIS_DEV01.psi.ship_id,const |    1 |                                              | 
|  1 | SIMPLE      | ev4   | ref    | IX_EVENT_ID_EVENT_TYPE                                                                                                                 | IX_EVENT_ID_EVENT_TYPE           | 36      | UNIVIS_DEV01.psi.ship_id,const |    1 |                                              | 
|  1 | SIMPLE      | ev5   | ref    | IX_EVENT_ID_EVENT_TYPE                                                                                                                 | IX_EVENT_ID_EVENT_TYPE           | 36      | UNIVIS_DEV01.ps.ship_id,const  |    1 |                                              | 
|  1 | SIMPLE      | pa    | ref    | IX_po_alloc_po_shipment_item2,IX_po_alloc_po_details_old,FK_po_alloc_po_shipment1,FK_po_alloc_po_shipment_item1,FK_po_alloc_po_header1 | IX_po_alloc_po_shipment_item2    | 4       | UNIVIS_DEV01.ps.ship_id        |    4 | Using where                                  | 
|  1 | SIMPLE      | ph    | eq_ref | PRIMARY,IX_HDR_ID                                                                                                                      | PRIMARY                          | 4       | UNIVIS_DEV01.pa.hdr_id         |    1 |                                              | 
+----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+--------------------------------+------+----------------------------------------------+

When database B is running I can look at the MySQL Administrator and the state remains at "Copying to tmp table" indefinitely. Database A also has this state but for only a second or so.

There are no differences in the table structure, indexes, keys etc between these databases (I have done show create tables and diff'd them).

The sizes of the tables are:

database A:
po_shipment 1776
po_shipment_item 1945
po_alloc 36298
po_header 71642
EVENT_TABLE 1608

database B:
po_shipment 463
po_shipment_item 470
po_alloc 3291
po_header 56149
EVENT_TABLE 1089

Some points to note:

  • Removing the WHERE clause makes the query return < 1 sec.
  • Removing the GROUP BY makes the query return < 1 sec.
  • Removing ev5, ev4, ev3 etc makes the query get faster for each one removed.

UPDATE after AJ's answer: - The size of the ship_id is significantly larger on database B (max value = 800002752) than database A (max value = 3489). Given that these are InnoDB tables would changing any buffer help with handling keys of this size? Further Update to this: I reduced the size of the keys and re-ANALYZEd but still no change in performance.

UPDATE desc of EVENT_TABLE:

Note that it is identical in both databases

+--------------------+--------------+------+-----+---------+----------------+
| Field              | Type         | Null | Key | Default | Extra          |
+--------------------+--------------+------+-----+---------+----------------+
| EVENT_TABLE_ID     | bigint(20)   | NO   | PRI | NULL    | auto_increment | 
| EVENT_TYPE         | varchar(10)  | NO   |     | NULL    |                | 
| TABLE_ID1          | int(11)      | NO   | MUL | NULL    |                | 
| TABLE_ID2          | int(11)      | YES  |     | NULL    |                | 
| TABLE_ID3          | int(11)      | YES  |     | NULL    |                | 
| TABLE_ID4          | int(11)      | YES  |     | NULL    |                | 
| EVENT_CREATED_DATE | datetime     | NO   |     | NULL    |                | 
| MESSAGE_REF        | varchar(100) | YES  |     | NULL    |                | 
+--------------------+--------------+------+-----+---------+----------------+

And for good measure the SHOW CREATE TABLE EVENT_TABLE:

The only thing to differ in this between databases is the auto increment value

| EVENT_TABLE | CREATE TABLE `EVENT_TABLE` (
  `EVENT_TABLE_ID` bigint(20) NOT NULL AUTO_INCREMENT,
  `EVENT_TYPE` varchar(10) NOT NULL,
  `TABLE_ID1` int(11) NOT NULL,
  `TABLE_ID2` int(11) DEFAULT NULL,
  `TABLE_ID3` int(11) DEFAULT NULL,
  `TABLE_ID4` int(11) DEFAULT NULL,
  `EVENT_CREATED_DATE` datetime NOT NULL,
  `MESSAGE_REF` varchar(100) DEFAULT NULL,
  PRIMARY KEY (`EVENT_TABLE_ID`),
  KEY `IX_EVENT_ID_EVENT_TYPE` (`TABLE_ID1`,`EVENT_TYPE`)
) ENGINE=InnoDB AUTO_INCREMENT=1925 DEFAULT CHARSET=utf8 | 

Can anyone suggest how to resolve this issue? What have I missed?

UPDATE after question from Michael Holzmann Here is the new Query plans based on his updated STRAIGHT_JOIN query. Note that database B has "Using temporary; Using filesort" whereas now database A doesn't. Could this be due to the long keys or something similar?

database A

+----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+------------------------------+------+-------------+
| id | select_type | table | type   | possible_keys                                                                                                                          | key                              | key_len | ref                          | rows | Extra       |
+----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+------------------------------+------+-------------+
|  1 | SIMPLE      | ps    | index  | PRIMARY,IX_ETA_DATE                                                                                                                    | PRIMARY                          | 4       | NULL                         |  168 | Using where | 
|  1 | SIMPLE      | ev0   | ref    | IX_EVENT_ID_EVENT_TYPE                                                                                                                 | IX_EVENT_ID_EVENT_TYPE           | 36      | UNIVIS_PROD.ps.ship_id,const |    1 |             | 
|  1 | SIMPLE      | ev1   | ref    | IX_EVENT_ID_EVENT_TYPE                                                                                                                 | IX_EVENT_ID_EVENT_TYPE           | 36      | UNIVIS_PROD.ps.ship_id,const |    1 |             | 
|  1 | SIMPLE      | ev2   | ref    | IX_EVENT_ID_EVENT_TYPE                                                                                                                 | IX_EVENT_ID_EVENT_TYPE           | 36      | UNIVIS_PROD.ps.ship_id,const |    1 |             | 
|  1 | SIMPLE      | ev3   | ref    | IX_EVENT_ID_EVENT_TYPE                                                                                                                 | IX_EVENT_ID_EVENT_TYPE           | 36      | UNIVIS_PROD.ps.ship_id,const |    1 |             | 
|  1 | SIMPLE      | ev4   | ref    | IX_EVENT_ID_EVENT_TYPE                                                                                                                 | IX_EVENT_ID_EVENT_TYPE           | 36      | UNIVIS_PROD.ps.ship_id,const |    1 |             | 
|  1 | SIMPLE      | ev5   | ref    | IX_EVENT_ID_EVENT_TYPE                                                                                                                 | IX_EVENT_ID_EVENT_TYPE           | 36      | UNIVIS_PROD.ps.ship_id,const |    1 |             | 
|  1 | SIMPLE      | psi   | ref    | PRIMARY,IX_po_shipment_item_po_shipment1,FK_po_shipment_item_po_shipment1                                                              | IX_po_shipment_item_po_shipment1 | 4       | UNIVIS_PROD.ps.ship_id       |    1 |             | 
|  1 | SIMPLE      | pa    | ref    | IX_po_alloc_po_shipment_item2,IX_po_alloc_po_details_old,FK_po_alloc_po_shipment1,FK_po_alloc_po_shipment_item1,FK_po_alloc_po_header1 | FK_po_alloc_po_shipment_item1    | 8       | UNIVIS_PROD.psi.UID_items    |    6 | Using where | 
|  1 | SIMPLE      | ph    | eq_ref | PRIMARY,IX_HDR_ID                                                                                                                      | PRIMARY                          | 4       | UNIVIS_PROD.pa.hdr_id        |    1 |             | 
+----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+------------------------------+------+-------------+

database B

+----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+-------------------------------+------+----------------------------------------------+
| id | select_type | table | type   | possible_keys                                                                                                                          | key                              | key_len | ref                           | rows | Extra                                        |
+----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+-------------------------------+------+----------------------------------------------+
|  1 | SIMPLE      | ps    | range  | PRIMARY,IX_ETA_DATE                                                                                                                    | IX_ETA_DATE                      | 4       | NULL                          |   38 | Using where; Using temporary; Using filesort | 
|  1 | SIMPLE      | ev0   | ref    | IX_EVENT_ID_EVENT_TYPE                                                                                                                 | IX_EVENT_ID_EVENT_TYPE           | 36      | UNIVIS_DEV01.ps.ship_id,const |    1 |                                              | 
|  1 | SIMPLE      | ev1   | ref    | IX_EVENT_ID_EVENT_TYPE                                                                                                                 | IX_EVENT_ID_EVENT_TYPE           | 36      | UNIVIS_DEV01.ps.ship_id,const |    1 |                                              | 
|  1 | SIMPLE      | ev2   | ref    | IX_EVENT_ID_EVENT_TYPE                                                                                                                 | IX_EVENT_ID_EVENT_TYPE           | 36      | UNIVIS_DEV01.ps.ship_id,const |    1 |                                              | 
|  1 | SIMPLE      | ev3   | ref    | IX_EVENT_ID_EVENT_TYPE                                                                                                                 | IX_EVENT_ID_EVENT_TYPE           | 36      | UNIVIS_DEV01.ps.ship_id,const |    1 |                                              | 
|  1 | SIMPLE      | ev4   | ref    | IX_EVENT_ID_EVENT_TYPE                                                                                                                 | IX_EVENT_ID_EVENT_TYPE           | 36      | UNIVIS_DEV01.ps.ship_id,const |    1 |                                              | 
|  1 | SIMPLE      | ev5   | ref    | IX_EVENT_ID_EVENT_TYPE                                                                                                                 | IX_EVENT_ID_EVENT_TYPE           | 36      | UNIVIS_DEV01.ps.ship_id,const |    1 |                                              | 
|  1 | SIMPLE      | psi   | ref    | PRIMARY,IX_po_shipment_item_po_shipment1,FK_po_shipment_item_po_shipment1                                                              | IX_po_shipment_item_po_shipment1 | 4       | UNIVIS_DEV01.ps.ship_id       |    1 |                                              | 
|  1 | SIMPLE      | pa    | ref    | IX_po_alloc_po_shipment_item2,IX_po_alloc_po_details_old,FK_po_alloc_po_shipment1,FK_po_alloc_po_shipment_item1,FK_po_alloc_po_header1 | IX_po_alloc_po_shipment_item2    | 4       | UNIVIS_DEV01.ps.ship_id       |    3 | Using where                                  | 
|  1 | SIMPLE      | ph    | eq_ref | PRIMARY,IX_HDR_ID                                                                                                                      | PRIMARY                          | 4       | UNIVIS_DEV01.pa.hdr_id        |    1 |                                              | 
+----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+-------------------------------+------+----------------------------------------------+

UPDATE It's definitely data related. I dumped the data from database A and loaded it into database B using:

SELECT * from <table> into outfile <file>

and

LOAD DATA INFILE <file> into table <table>

Then the database B query runs quickly - ie. as fast as database A. Any ideas on how to diagnose what could be wrong with the data??

UPDATE @newtover: From database A:

+-----------------+---------------------+
| eta_selectivity | ship_id_selectivity |
+-----------------+---------------------+
|          0.0693 |              1.0000 | 
+-----------------+---------------------+
1 row in set (0.02 sec)

From database B (the bad one)

+-----------------+---------------------+
| eta_selectivity | ship_id_selectivity |
+-----------------+---------------------+
|          0.1814 |              1.0000 | 
+-----------------+---------------------+
1 row in set (0.02 sec)

And the show create for po_shipment:

| po_shipment | CREATE TABLE `po_shipment` (
  `ship_id` int(11) NOT NULL DEFAULT '0',
  `ship_type` varchar(16) DEFAULT NULL,
  `foreign_agent` varchar(16) DEFAULT NULL,
  `agent_ref` varchar(16) DEFAULT NULL,
  `exporter_code` varchar(30) DEFAULT NULL,
  `importer_code` varchar(30) DEFAULT NULL,
  `carrier_code` varchar(30) DEFAULT NULL,
  `exporter_name` varchar(50) DEFAULT NULL,
  `importer_name` varchar(50) DEFAULT NULL,
  `carrier_name` varchar(50) DEFAULT NULL,
  `receipt` varchar(30) DEFAULT NULL,
  `pol_aol` varchar(50) DEFAULT NULL,
  `pod_aod` varchar(30) DEFAULT NULL,
  `final_dest` varchar(50) DEFAULT NULL,
  `vessel_flno` varchar(30) DEFAULT NULL,
  `ets` date DEFAULT NULL,
  `eta` date DEFAULT NULL,
  `pieces` int(11) DEFAULT '0',
  `weight` decimal(17,2) DEFAULT '0.00',
  `volume` decimal(17,2) DEFAULT '0.00',
  `marks` varchar(500) DEFAULT NULL,
  `goods_desc` varchar(500) DEFAULT NULL,
  `ship_terms` varchar(16) DEFAULT NULL,
  `ship_terms_desc` varchar(50) DEFAULT NULL,
  `house_hawb` varchar(30) DEFAULT NULL,
  `ocean_mawb` varchar(30) DEFAULT NULL,
  `booking_date` date DEFAULT NULL,
  `expected_cargo` date DEFAULT NULL,
  `mfrt_jobdisp` varchar(30) DEFAULT NULL,
  `ship_complete` date DEFAULT NULL,
  `user_id` varchar(30) DEFAULT NULL,
  `receipt_desc` varchar(60) DEFAULT NULL,
  `fin_dest_desc` varchar(60) DEFAULT NULL,
  `pol_aol_desc` varchar(60) DEFAULT NULL,
  `pod_aod_desc` varchar(60) DEFAULT NULL,
  `exporter_ref` varchar(26) DEFAULT NULL,
  `carrier_ref` varchar(26) DEFAULT NULL,
  `terms_conds` date DEFAULT NULL,
  `last_amended` date DEFAULT NULL,
  `user_amended` varchar(30) DEFAULT NULL,
  `package_type` varchar(24) DEFAULT NULL,
  `ext_cancelled` tinyint(1) NOT NULL DEFAULT '0',
  `ext_goh` tinyint(1) NOT NULL DEFAULT '0',
  `ext_arrival_date` date DEFAULT NULL,
  `ext_booking_ref` varchar(255) DEFAULT NULL,
  `ext_dc_booked_delivery_date` date DEFAULT NULL,
  `ext_dc_booked_delivery_time` varchar(10) DEFAULT NULL,
  `ext_comments` text,
  `deleted` tinyint(1) NOT NULL DEFAULT '0',
  `last_amended_time` int(10) DEFAULT NULL,
  `last_amended_uni` varchar(30) DEFAULT NULL,
  PRIMARY KEY (`ship_id`),
  KEY `IX_ETA_DATE` (`eta`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 | 

UPDATE @chris_I If I strip the query down by removing all other joins aside from EVENT_TABLE I get the same performance (ie. crappy)

SELECT * FROM po_shipment ps 
LEFT JOIN EVENT_TABLE ev0 ON ev0.TABLE_ID1 = ps.ship_id AND ev0.EVENT_TYPE = 'MAS0' 
LEFT JOIN EVENT_TABLE ev1 ON ev1.TABLE_ID1 = ps.ship_id AND ev1.EVENT_TYPE = 'MAS1' 
LEFT JOIN EVENT_TABLE ev2 ON ev2.TABLE_ID1 = ps.ship_id AND ev2.EVENT_TYPE = 'MAS2' 
LEFT JOIN EVENT_TABLE ev3 ON ev3.TABLE_ID1 = ps.ship_id AND ev3.EVENT_TYPE = 'MAS3' 
LEFT JOIN EVENT_TABLE ev4 ON ev4.TABLE_ID1 = ps.ship_id AND ev4.EVENT_TYPE = 'MAS4' 
LEFT JOIN EVENT_TABLE ev5 ON ev5.TABLE_ID1 = ps.ship_id AND ev5.EVENT_TYPE = 'MAS5' 
WHERE ps.eta >= '2010-03-22'
GROUP BY ps.ship_id
LIMIT 100;

UPDATE @Marcus Adams: Query for plans you have asked for with inner joins removed:

SELECT * FROM po_shipment ps 
LEFT JOIN EVENT_TABLE ev0 ON ev0.TABLE_ID1 = ps.ship_id AND ev0.EVENT_TYPE = 'MAS0' 
LEFT JOIN EVENT_TABLE ev1 ON ev1.TABLE_ID1 = ps.ship_id AND ev1.EVENT_TYPE = 'MAS1' 
LEFT JOIN EVENT_TABLE ev2 ON ev2.TABLE_ID1 = ps.ship_id AND ev2.EVENT_TYPE = 'MAS2' 
LEFT JOIN EVENT_TABLE ev3 ON ev3.TABLE_ID1 = ps.ship_id AND ev3.EVENT_TYPE = 'MAS3' 
LEFT JOIN EVENT_TABLE ev4 ON ev4.TABLE_ID1 = ps.ship_id AND ev4.EVENT_TYPE = 'MAS4' 
LEFT JOIN EVENT_TABLE ev5 ON ev5.TABLE_ID1 = ps.ship_id AND ev5.EVENT_TYPE = 'MAS5' 
WHERE ps.eta >= '2010-03-22'
GROUP BY ps.ship_id
LIMIT 100;

Query Plan from database A (responds in 0.35s)

+----+-------------+-------+-------+------------------------+------------------------+---------+------------------------------+------+----------------------------------------------+
| id | select_type | table | type  | possible_keys          | key                    | key_len | ref                          | rows | Extra                                        |
+----+-------------+-------+-------+------------------------+------------------------+---------+------------------------------+------+----------------------------------------------+
|  1 | SIMPLE      | ps    | range | IX_ETA_DATE            | IX_ETA_DATE            | 4       | NULL                         |  174 | Using where; Using temporary; Using filesort | 
|  1 | SIMPLE      | ev0   | ref   | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36      | UNIVIS_PROD.ps.ship_id,const |    1 |                                              | 
|  1 | SIMPLE      | ev1   | ref   | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36      | UNIVIS_PROD.ps.ship_id,const |    1 |                                              | 
|  1 | SIMPLE      | ev2   | ref   | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36      | UNIVIS_PROD.ps.ship_id,const |    1 |                                              | 
|  1 | SIMPLE      | ev3   | ref   | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36      | UNIVIS_PROD.ps.ship_id,const |    1 |                                              | 
|  1 | SIMPLE      | ev4   | ref   | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36      | UNIVIS_PROD.ps.ship_id,const |    1 |                                              | 
|  1 | SIMPLE      | ev5   | ref   | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36      | UNIVIS_PROD.ps.ship_id,const |    1 |                                              | 
+----+-------------+-------+-------+------------------------+------------------------+---------+------------------------------+------+----------------------------------------------

Query Plan from database B (doesn't respond in time it takes to make a cup of tea)

    +----+-------------+-------+-------+------------------------+------------------------+---------+-------------------------------+------+----------------------------------------------+
| id | select_type | table | type  | possible_keys          | key                    | key_len | ref                           | rows | Extra                                        |
+----+-------------+-------+-------+------------------------+------------------------+---------+-------------------------------+------+----------------------------------------------+
|  1 | SIMPLE      | ps    | range | IX_ETA_DATE            | IX_ETA_DATE            | 4       | NULL                          |   38 | Using where; Using temporary; Using filesort | 
|  1 | SIMPLE      | ev0   | ref   | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36    
+1  A: 

Update your statistics. one time I have a similar issue, and this works for me.

MaK
If you mean ANALYZE TABLE then I've had that before too and I've done that for all of the tables involved with the query on both databases but no change.
Damo
+2  A: 

Haven't touched MySQL for a while but I am guessing the problem has to do with either

  1. Have you checked you key/join field lengths (actual data) maybe causing the sort_buffer to page to disk joining on large (size) keys? (this smells like a data issue...)
  2. The server settings, basically it is writting to an in memory temp table. I had something similar a few years back. Have you enlarged the key_buffer_size, table_cache, read_rnd_buffer_size, sort_buffer, read_buffer_size to see if that helps?
AJ
+1 - you got me thinking about this. Updated question with some interesting info on the keys
Damo
smells like data issue to me too. Tried reducing the size of the keys and also increasing buffers but nothing has worked so far.
Damo
+1  A: 

Since it's InnoDB it looks like it's uite possibly a locking problem. What else is going on at the same time?

le dorfier
Nothing at all. Both databases have been pulled back to my dev machine.
Damo
+4  A: 

Try adding STRAIGHT_JOIN to the query to see if the execution plan is the issue. The optimizer is choosing a different execution plan for each database and this might be causing the issue.

SELECT STRAIGHT_JOIN * FROM po_shipment ps 
LEFT JOIN EVENT_TABLE ev0 ON ev0.TABLE_ID1 = ps.ship_id AND ev0.EVENT_TYPE = 'MAS0' 
LEFT JOIN EVENT_TABLE ev1 ON ev1.TABLE_ID1 = ps.ship_id AND ev1.EVENT_TYPE = 'MAS1' 
LEFT JOIN EVENT_TABLE ev2 ON ev2.TABLE_ID1 = ps.ship_id AND ev2.EVENT_TYPE = 'MAS2' 
LEFT JOIN EVENT_TABLE ev3 ON ev3.TABLE_ID1 = ps.ship_id AND ev3.EVENT_TYPE = 'MAS3' 
LEFT JOIN EVENT_TABLE ev4 ON ev4.TABLE_ID1 = ps.ship_id AND ev4.EVENT_TYPE = 'MAS4' 
LEFT JOIN EVENT_TABLE ev5 ON ev5.TABLE_ID1 = ps.ship_id AND ev5.EVENT_TYPE = 'MAS5' 
JOIN po_shipment_item psi USING (ship_id) 
JOIN po_alloc pa ON ps.ship_id = pa.ship_id AND pa.UID_items = psi.UID_items 
JOIN po_header ph ON pa.hdr_id = ph.hdr_id
WHERE ps.eta >= '2010-03-22'
GROUP BY ps.ship_id
LIMIT 100;

UPDATE
1. The new execution plan for database A doesn't need a filesort or temporary table because it's using the primary key. I would start adding USE INDEX into the query to see if you can speed up the query on database B. Is ship_id the primary key of po_shipment? if so, you need to figure out what costs more the group by or filtering of the dates.

SELECT STRAIGHT_JOIN * FROM po_shipment ps USE INDEX( PRIMARY )
LEFT JOIN EVENT_TABLE ev0 ON ev0.TABLE_ID1 = ps.ship_id AND ev0.EVENT_TYPE = 'MAS0' 
LEFT JOIN EVENT_TABLE ev1 ON ev1.TABLE_ID1 = ps.ship_id AND ev1.EVENT_TYPE = 'MAS1' 
LEFT JOIN EVENT_TABLE ev2 ON ev2.TABLE_ID1 = ps.ship_id AND ev2.EVENT_TYPE = 'MAS2' 
LEFT JOIN EVENT_TABLE ev3 ON ev3.TABLE_ID1 = ps.ship_id AND ev3.EVENT_TYPE = 'MAS3' 
LEFT JOIN EVENT_TABLE ev4 ON ev4.TABLE_ID1 = ps.ship_id AND ev4.EVENT_TYPE = 'MAS4' 
LEFT JOIN EVENT_TABLE ev5 ON ev5.TABLE_ID1 = ps.ship_id AND ev5.EVENT_TYPE = 'MAS5' 
JOIN po_shipment_item psi USING (ship_id) 
JOIN po_alloc pa ON ps.ship_id = pa.ship_id AND pa.UID_items = psi.UID_items 
JOIN po_header ph ON pa.hdr_id = ph.hdr_id
WHERE ps.eta >= '2010-03-22'
GROUP BY ps.ship_id
LIMIT 100;

If that doesn't help try suggesting more of the indexes used in the execution plan of database A for database B.

mholzmann
Thanks, interesting option. Hasn't changed the end result. Database B is still hanging. The query plans have changed though - both still different. I've updated the question.
Damo
The reason it dropped the filesort and temporary table on database A is because it's using the primary key for ps. Try adding USE INDEX, i'll add queries in a minute.
mholzmann
+2  A: 

If it's a data problem, I can't tell you what the exact problem is, but here's my favorite strategy to solve this kind of problem:

Try to remove half of your joins. Repeat recursively, until the query runs fast. Then add half of the joins you removed in the last step... (This strategy will require much fewer steps than removing and adding join by join.)

Once you've found out the "bad" join, you can try restricting its values with an additional "where" clause, until the query runs fast again... In each step, always try to reduce the problem in half.

Note: It can very well be the case, that you get a lot more records for the intermediate results of your joins, even if the total amount of data is smaller in database B.

Chris Lercher
I've done this. See the "Points to Note" in my question. Removing joins on EVENT_TABLE cause it to speed up with each one removed.
Damo
@Damo: If *every* additional EVENT_TABLE join causes additional time, then the main problem is most likely located in one of the other joins! The EVENT_TABLEs then just multiply the time it takes.
Chris Lercher
@Chris_I I've updated the question. I can remove all of the other joins leaving only the EVENT_TABLEs and I get the same effect.
Damo
@Damo: How large is po_shipment itself (in database A and B)?
Chris Lercher
@Damo: I see, you already answered this: 1776 vs 463. That's not a very big difference. Maybe it's still enough to exhaust some temporary space, if the rows are quite large? I've seen (too) much of this kind of behavior with MaxDB - way fewer problems with Postgres by the way.
Chris Lercher
@Chris_I the rows aren't particularly large as you can see from the SHOW CREATE TABLE. I can't help thinking that this is something that can be fixed by a variable change to allow for the obviously wonky data.
Damo
@Damo: I personally would still try deleting *half* of the data in (a copy of) po_shipment etc. This kind of exponential zooming in to the problem has solved basically every data related database problem for me so far, because it always takes only a few steps to find the responsible rows (e.g. even for 4 billion rows, it takes only about 32 steps). HTH.
Chris Lercher
+1  A: 

Since it seems like some issue with the data, it would probably be helpful to figure out what data is causing the problem. Make a third database C and insert half of the data from database B twice(so you have the same number of rows). If database C is slower, than the bad data is there, otherwise it is in the other half. Repeat with smaller and smaller chunk sizes to help find the problem data.

Even though database B is smaller that database A, the tables 'po_header' and 'EVENT_TABLE' are not proportionally smaller. That may have something to do with the speed difference.

Echo
+1 Good approach
Damo
A: 

I think you're going at this the wrong way. When you do a LEFT JOIN, you're getting back all of the records in EVENT_TABLE regardless of a match to po_shipment.

And you've run a LEFT JOIN to EVENT_TABLE six times. You're getting back (6 * (SELECT count(*) FROM EVENT_TABLE) ) records with every query. Since db A has less records than db B, of course the query runs faster on A.

I think something like this would perform worlds better:

DECLARE @TEMP_EVENT_TYPES table 
(
    EVENT_TYPE varchar(10) PRIMARY KEY
)
;
/*
INSERT VALUES 'MAS0', 'MAS1', 'MAS2', 'MAS3', 'MAS4', 'MAS5'
*/

;
SELECT * FROM po_shipment ps 
JOIN po_shipment_item psi USING (ship_id) 
JOIN po_alloc pa ON ps.ship_id = pa.ship_id AND pa.UID_items = psi.UID_items 
JOIN po_header ph ON pa.hdr_id = ph.hdr_id 
INNER JOIN 
    EVENT_TABLE et ON et.TABLE_ID1 = ps.ship_id
INNER JOIN 
    @TEMP_EVENT_TYPES tet ON tet.event_type = et.EVENT_TYPE
WHERE ps.eta >= '2010-03-22'
GROUP BY ps.ship_id
iKnowKungFoo
Wouldn't that only be true when LEFT JOINing from EVENT_TABLE not to EVENT_TABLE. The way he is doing it, LEFT JOINing to EVENT_TABLE, just means that if the join fails then he still gets a result for that row in po_shipment.
Echo
@Echo: exactly. I still need the row even if it doesn't have that EVENT_TYPE.
Damo
+1  A: 

Have to tried doing "CHECK TABLE" and/or "OPTIMIZE TABLE"?

I had a similar situation where the problem was that the key indexes where not updated properly and doing CHECK TABLE fixed them. A long shot, but worth the try.

Nir Levy
Thanks. Ran both CHECK TABLE and OPTIMIZE on all tables involved in the query. No effect.
Damo
+1  A: 

To begin with make sure that you have created indexes on the appropriate fields. I am sure you've done that.

Next, try using index hints (USE INDEX) to force the database to use the indexes properly.

I had a similar problem in which I assumed that the indexes were setup and used by mysql properly, but they were not. I was able to work around it using index hints.

Salman A
+1  A: 

What is the selectivity of data in po_shipment.eta and po_shipment.ship_id. Could you post the results of the following query from both databases:

SELECT
    count(distinct eta)/count(*) as eta_selectivity,
    count(distinct ship_id)/count(*) as ship_id_selectivity
FROM po_shipment;

Usually the more selective are the field data (closer to 1), the better index is working. The cause might also be very uneven distribution of data in po_shipment.eta (when you use '2099-01-01' or the like for as a special value) if the optimizer lacks necessary statistics.

By the way, could you provide SHOW CREATE TABLE po_shipment? Indices on the table could shed some light.

UPD: When the selectivity on the field is so low as for eta field, the index is basically useless. Even worse, it might confuse the optimizer on selects and slow down inserts and updates of the data.

The first suggestion is to try removing the index on the eta field and measure the results. It might be that for the A database optimizer does not even try to use the index because of the the very low selectivity and for the B database it gives the index a try.

The second thing that bothers me is why do you group by ship_id in the first place? When there is a need in an implicit temporary table and among the fields there is a TEXT (as in your case) or BLOB one, MySQL will always use on disk temporary tables for sorting (which is implicitly required in GROUP BY). In your case, ship_id is the clustered primary key and the results will anyway be sorted by ship_id. What you need is to extract a base query (already grouped by ship_id if there can be several correscpondences) and JOIN po_shipment with the base query applying your range condition and not using the group by at the top.

And the third. Do you realy need all the fields when you use * at the top? Having joined 10 tables, you receive a lot of fieds. I hardly believe you need al of them. Even exclusion of the TEXT field from the results might boost the query performance.

newtover
+1 interesting info. I've added the data requested.
Damo
It's a stripped down query to illustrate the problem. The "real" query is actually a Hibernate query with many more joins and a set of specific fields rather than *. Having said that I can probably drop the GROUP BY. I'll check the results.
Damo
+1  A: 

I've experienced the same when using range selectors less than < and greater than > too.

Experiment: If the range is not too big, have you tried tried to expand the range into and an IN (...,...,...) statement instad ?

E.g

SELECT * FROM po_shipment ps USE INDEX (IX_ETA_DATE)
JOIN po_shipment_item psi USING (ship_id) 
JOIN po_alloc pa ON ps.ship_id = pa.ship_id AND pa.UID_items = psi.UID_items 
JOIN po_header ph ON pa.hdr_id = ph.hdr_id 
LEFT JOIN EVENT_TABLE ev0 ON ev0.TABLE_ID1 = ps.ship_id AND ev0.EVENT_TYPE = 'MAS0' 
LEFT JOIN EVENT_TABLE ev1 ON ev1.TABLE_ID1 = ps.ship_id AND ev1.EVENT_TYPE = 'MAS1' 
LEFT JOIN EVENT_TABLE ev2 ON ev2.TABLE_ID1 = ps.ship_id AND ev2.EVENT_TYPE = 'MAS2' 
LEFT JOIN EVENT_TABLE ev3 ON ev3.TABLE_ID1 = ps.ship_id AND ev3.EVENT_TYPE = 'MAS3' 
LEFT JOIN EVENT_TABLE ev4 ON ev4.TABLE_ID1 = ps.ship_id AND ev4.EVENT_TYPE = 'MAS4' 
LEFT JOIN EVENT_TABLE ev5 ON ev5.TABLE_ID1 = ps.ship_id AND ev5.EVENT_TYPE = 'MAS5' 
WHERE ps.eta IN ('2010-03-22','2010-03-21','2010-03-20',...)
GROUP BY ps.ship_id
LIMIT 100;

Edit: Added USE INDEX() hint as Salman A suggested. Seems like mysql see's the possible index, but choses not to use it... worth testing.

PHP_Jedi
If I add an IN like this then it responds very quickly. However if I tighten the range condition it responds even quicker. For example ps.eta IN ('2010-03-19', '2010-03-20', '2010-03-21') responds in 0.14s whereas ps.eta >= '2010-03-19' and ps.eta <= '2010-03-21' responds in 0.06s. Unfortunately I need a bigger date range.
Damo
+1  A: 

I think it may be the manual sort (filesort) for the GROUP BY clause that is causing the noticeable deficit.

Try using the SQL_BIG_RESULT hint to see if MySQL will improve it's method for handling the GROUP BY.

SELECT SQL_BIG_RESULT * FROM ...
Marcus Adams
This is very interesting. Previously when I look at the state of the connection it was always "Copying to tmp table" now with this hint it is "Sending Data". No increase in performance though. Just stays at that state for ages.
Damo
@Damo - Sending data - doesn't that mean that the query is done? How much data would the result approx. consist of. 100KB, 1MB more ? What net connection do you use, or is it localhost? Just thought since you selects everything (*) from alot of tables. E.x export a slow result to .TXT or .CSV and check filesize.
PHP_Jedi
It's localhost and no that much data (limit 100). No reason why it should be sending the data for so long.
Damo
A: 

It sure looks like you're modeling a state machine for po_shipment w/ event records in event_table for each state transition.

What is the business logic you're trying to express with this query?

You should be able to add a state field to your po_shipment to accomplish this, which summarizes and denormalizes the series of events in the event_table.

Building the logic to verify all 6 events have taken place for a given record shouldn't be in the database, it should be in your model and persisted in the database (classic state machine pattern for a model).

Then it's just select from po_shipment where state = 'MAS5' and eta > date;

Winfield
Certainly not a state machine. It's just a Shipment with some configurable Events which are done in sequence. In the UI you see the Shipment data and then a history of the Events.
Damo
If you can't change the structure of the query completely, I bet you can boil this down to a single join on the event_table. Join once with the condition:... event_type IN ('MAS0', 'MAS1', ... 'MAS5')
Winfield
A: 

Out of curiosity - does the original data in B have many NULLs?

Both databases have lots of NULLs. Especially in the joined EVENT_TABLEs.
Damo