views:

85

answers:

3

I need to compare integers in a mysql table. Pretty simple, but this table is fairly large... so queries take a long time. No problem, I can use an index. According to MySQL documentation, I should be able to use an index for comparison operators: "A B-tree index can be used for column comparisons in expressions that use the =, >, >=, <, <=, or BETWEEN"

However, when I try this it has no effect on performance and the index is not used according to explain :(

SELECT * FROM Node n WHERE n.X < 800000

That results in extremely poor performance and calling explain shows our "Rectangle_Index" as being of the possible_keys but NULL key was actually used... Here's are create table statement:

CREATE TABLE `Visual_Node` (
  `Id` bigint(20) NOT NULL AUTO_INCREMENT,
  `X` bigint(20) NOT NULL,
  `Y` bigint(20) NOT NULL,
  `X_plus_Width` bigint(20) DEFAULT NULL,
  `Y_plus_Height` bigint(20) DEFAULT NULL,
  PRIMARY KEY (`Id`),
  KEY `Rectangle_Index` (`X`,`X_plus_Width`,`Y`,`Y_plus_Height`)
) ENGINE=InnoDB AUTO_INCREMENT=4340743 DEFAULT CHARSET=latin1

Can anyone help this query? The actual query I want to run is the following:

SELECT * FROM Node n WHERE 800000 BETWEEN n.X and n.X_plus_Width AND 1234567 BETWEEN n.Y and n.Y_plus_Height

Update (asked in one of the answers below) Heres the output of the explain for the basic query: altering the table structure is very difficult for me. Here's the output of my explain:

mysql> explain select * from Node n where n.X < 800000;     
+----+-------------+-------+------+-----------------+------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys   | key  | key_len | ref  | rows   | Extra       |
+----+-------------+-------+------+-----------------+------+---------+------+--------+-------------+
|  1 | SIMPLE      | n     | ALL  | Rectangle_Index | NULL | NULL    | NULL | 173952 | Using where | 
+----+-------------+-------+------+-----------------+------+---------+------+--------+-------------+
1 row in set (0.02 sec)
A: 

Have you tried changing the index to:

CREATE TABLE `Visual_Node` (
  `Id` bigint(20) NOT NULL AUTO_INCREMENT,
  `X` bigint(20) NOT NULL,
  `Y` bigint(20) NOT NULL,
  `X_plus_Width` bigint(20) DEFAULT NULL,
  `Y_plus_Height` bigint(20) DEFAULT NULL,
  PRIMARY KEY (`Id`),
  KEY `X_Index` (`X`),
  KEY `Y_Index` (`Y`),
  KEY `X_Width_Index` (`X_plus_Width`),
  KEY `Y_Height_Index` (`Y_plus_Height`)
) ENGINE=InnoDB AUTO_INCREMENT=4340743 DEFAULT CHARSET=latin1

Judging by your AI value, you'll probably want to test this with a smaller set of data.

Cez
How would making the keys all separate help?
at
@at Because then you are free to use them as you wish in your query and you will hit the index. My previous revision had split the single compound index into 2, which should also improve your query performance, but may be less flexible
Cez
+2  A: 

Have you checked the details of multiple-column indexes - specifically, the part about how the optimizer is (or is not) able to use them. Here's a quote from this page:

If the table has a multiple-column index, any leftmost prefix of the index can be used by the optimizer to find rows. For example, if you have a three-column index on (col1, col2, col3), you have indexed search capabilities on (col1), (col1, col2), and (col1, col2, col3).

Perhaps you could try creating multiple single-column indexes, rather than one multiple-column index?

EDIT 1:

I put together a simple test on my copy of MySQL (version 5.0.51a-24+lenny3). It shows that when using both your proper query, and your test query, your Rectangle_Index is being used. However, when using the proper query, the key_len is 8, suggesting that not all the parts of the multi-column index are being used. Perhaps the output from your version of MySQL differs in this respect.

As you'll see from the output below, even when additional indexes are added, the Rectangle_Index index is still chosen in all cases, except only the Y column is referenced in the query:

CREATE TABLE `Visual_Node` (
    `Id` bigint(20) NOT NULL AUTO_INCREMENT,
    `X` bigint(20) NOT NULL,
    `Y` bigint(20) NOT NULL,
    `X_plus_Width` bigint(20) DEFAULT NULL,
    `Y_plus_Height` bigint(20) DEFAULT NULL,
    PRIMARY KEY (`Id`),
    KEY `Rectangle_Index` (`X`,`X_plus_Width`,`Y`,`Y_plus_Height`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1;

INSERT INTO `Visual_Node` VALUES
      (1,   100000,  1000000,  3000000,  3000000),
      (2,   200000,  2000000,  4000000,  4000000),
      (3,   300000,  3000000,  5000000,  5000000),
      (4,   400000,  4000000,  6000000,  6000000),
      (5,   500000,  5000000,  7000000,  7000000),
      (6,   600000,  6000000,  8000000,  8000000),
      (7,   700000,  7000000,  9000000,  9000000),
      (8,   800000,  8000000, 10000000, 10000000),
      (9,   900000,  9000000, 11000000, 11000000),
      (10, 1000000, 10000000, 12000000, 12000000);

EXPLAIN SELECT * FROM Visual_Node n WHERE n.X < 800000;
+----+-------------+-------+-------+-----------------+-----------------+---------+------+------+--------------------------+
| id | select_type | table | type  | possible_keys   | key             | key_len | ref  | rows | Extra                    |
+----+-------------+-------+-------+-----------------+-----------------+---------+------+------+--------------------------+
|  1 | SIMPLE      | n     | range | Rectangle_Index | Rectangle_Index | 8       | NULL |    5 | Using where; Using index |
+----+-------------+-------+-------+-----------------+-----------------+---------+------+------+--------------------------+

EXPLAIN SELECT * FROM Visual_Node n WHERE n.Y < 800000;
+----+-------------+-------+-------+---------------+-----------------+---------+------+------+--------------------------+
| id | select_type | table | type  | possible_keys | key             | key_len | ref  | rows | Extra                    |
+----+-------------+-------+-------+---------------+-----------------+---------+------+------+--------------------------+
|  1 | SIMPLE      | n     | index | NULL          | Rectangle_Index | 34      | NULL |   10 | Using where; Using index |
+----+-------------+-------+-------+---------------+-----------------+---------+------+------+--------------------------+

EXPLAIN SELECT * FROM Visual_Node n
      WHERE 800000 BETWEEN n.X and n.X_plus_Width
      AND 1234567 BETWEEN n.Y and n.Y_plus_Height;
+----+-------------+-------+-------+-----------------+-----------------+---------+------+------+--------------------------+
| id | select_type | table | type  | possible_keys   | key             | key_len | ref  | rows | Extra                    |
+----+-------------+-------+-------+-----------------+-----------------+---------+------+------+--------------------------+
|  1 | SIMPLE      | n     | range | Rectangle_Index | Rectangle_Index | 8       | NULL |    5 | Using where; Using index |
+----+-------------+-------+-------+-----------------+-----------------+---------+------+------+--------------------------+

ALTER TABLE `Visual_Node` ADD INDEX `X_Index` (`X`,`X_plus_Width`);
ALTER TABLE `Visual_Node` ADD INDEX `Y_Index` (`Y`,`Y_plus_Height`);

EXPLAIN SELECT * FROM Visual_Node n WHERE n.X < 800000;
+----+-------------+-------+-------+-------------------------+-----------------+---------+------+------+--------------------------+
| id | select_type | table | type  | possible_keys           | key             | key_len | ref  | rows | Extra                    |
+----+-------------+-------+-------+-------------------------+-----------------+---------+------+------+--------------------------+
|  1 | SIMPLE      | n     | range | Rectangle_Index,X_Index | Rectangle_Index | 8       | NULL |    5 | Using where; Using index |
+----+-------------+-------+-------+-------------------------+-----------------+---------+------+------+--------------------------+

EXPLAIN SELECT * FROM Visual_Node n WHERE n.Y < 800000;
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
|  1 | SIMPLE      | n     | range | Y_Index       | Y_Index | 8       | NULL |    1 | Using where |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+

EXPLAIN SELECT * FROM Visual_Node n
      WHERE 800000 BETWEEN n.X and n.X_plus_Width
      AND 1234567 BETWEEN n.Y and n.Y_plus_Height;
+----+-------------+-------+-------+---------------------------------+-----------------+---------+------+------+--------------------------+
| id | select_type | table | type  | possible_keys                   | key             | key_len | ref  | rows | Extra                    |
+----+-------------+-------+-------+---------------------------------+-----------------+---------+------+------+--------------------------+
|  1 | SIMPLE      | n     | range | Rectangle_Index,X_Index,Y_Index | Rectangle_Index | 8       | NULL |    5 | Using where; Using index |
+----+-------------+-------+-------+---------------------------------+-----------------+---------+------+------+--------------------------+

ALTER TABLE `Visual_Node` ADD INDEX `X` (`X`,`X_plus_Width`);    
ALTER TABLE `Visual_Node` ADD INDEX `X_plus_Width` (`X_plus_Width`);    
ALTER TABLE `Visual_Node` ADD INDEX `Y` (`Y`);
ALTER TABLE `Visual_Node` ADD INDEX `Y_plus_Height` (`Y_plus_Height`);

EXPLAIN SELECT * FROM Visual_Node n WHERE n.X < 800000;
+----+-------------+-------+-------+---------------------------+-----------------+---------+------+------+--------------------------+
| id | select_type | table | type  | possible_keys             | key             | key_len | ref  | rows | Extra                    |
+----+-------------+-------+-------+---------------------------+-----------------+---------+------+------+--------------------------+
|  1 | SIMPLE      | n     | range | Rectangle_Index,X_Index,X | Rectangle_Index | 8       | NULL |    5 | Using where; Using index |
+----+-------------+-------+-------+---------------------------+-----------------+---------+------+------+--------------------------+

EXPLAIN SELECT * FROM Visual_Node n WHERE n.Y < 800000;
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
|  1 | SIMPLE      | n     | range | Y_Index,Y     | Y_Index | 8       | NULL |    1 | Using where |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+

EXPLAIN SELECT * FROM Visual_Node n
    WHERE 800000 BETWEEN n.X and n.X_plus_Width
    AND 1234567 BETWEEN n.Y and n.Y_plus_Height;
+----+-------------+-------+-------+----------------------------------------------------------------+-----------------+---------+------+------+--------------------------+
| id | select_type | table | type  | possible_keys                                                  | key             | key_len | ref  | rows | Extra                    |
+----+-------------+-------+-------+----------------------------------------------------------------+-----------------+---------+------+------+--------------------------+
|  1 | SIMPLE      | n     | range | Rectangle_Index,X_Index,Y_Index,X,X_plus_Width,Y,Y_plus_Height | Rectangle_Index | 8       | NULL |    5 | Using where; Using index |
+----+-------------+-------+-------+----------------------------------------------------------------+-----------------+---------+------+------+--------------------------+

Can you post the output from your EXPLAIN query?

What version of MySQL are you using?

EDIT 2:

The Spatial Extensions, as suggested by Naktibalda, are really cool. I'd not used these before, but if you are able to alter your table structure to use them, they may solve your problem.

Curious, I did a little research, and here's the result of my test scripts:

CREATE TABLE `Spatial_Node` (
  `Id` bigint(20) NOT NULL AUTO_INCREMENT,
  `Rectangle` POLYGON NOT NULL,
  PRIMARY KEY (`Id`),
  SPATIAL KEY `Rectangle` (`Rectangle`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

INSERT INTO `Spatial_Node` (`Rectangle`)
    SELECT Polygon(LineString(
        Point(X,            Y),
        Point(X_plus_Width, Y),
        Point(X_plus_Width, Y_plus_Height),
        Point(X,            Y_plus_Height),
        Point(X,            Y)
))
FROM Visual_Node;

SELECT AsText(`Rectangle`) FROM Spatial_Node
    WHERE MBRContains(Rectangle, Point(100001, 1000001));

+-----------------------------------------------------------------------------------------+
| AsText(`Rectangle`)                                                                     |
+-----------------------------------------------------------------------------------------+
| POLYGON((100000 1000000,3000000 1000000,3000000 3000000,100000 3000000,100000 1000000)) |
+-----------------------------------------------------------------------------------------+

EXPLAIN SELECT AsText(`Rectangle`) FROM Spatial_Node
    WHERE MBRContains(Rectangle, Point(100001, 1000001));

+----+-------------+--------------+-------+---------------+-----------+---------+------+------+-------------+
| id | select_type | table        | type  | possible_keys | key       | key_len | ref  | rows | Extra       |
+----+-------------+--------------+-------+---------------+-----------+---------+------+------+-------------+
|  1 | SIMPLE      | Spatial_Node | range | Rectangle     | Rectangle | 32      | NULL |    1 | Using where |
+----+-------------+--------------+-------+---------------+-----------+---------+------+------+-------------+

I have no idea how the speed will compare, but I've definitely learned something new and exciting today. Thanks Naktibalda :-)

Mike
+1 This explains what I was suggesting (and didn't explain)
Cez
@Cez: Most kind, thank you :-)
Mike
altering the table structure is very difficult for me. I'm adding the output of explain above in the question
at
+2  A: 

If you rewrite your query as

SELECT * 
FROM Node n 
WHERE 
  n.X <= 800000  AND 
  n.X_plus_Width >= 800000  AND 
  n.Y <= 1234567  AND
  n.Y_plus_Height >= 1234567

Mysql could use index for one column (it can't use index for more than 1 range condition, and you have 4 of them.

I suggest you to take a look at Spatial extensions

Naktibalda
+1 For the spatial extensions idea. I'd not seen them before (I've never had cause to use them), but they look really good. I've modified my answer to include a basic usage example.
Mike
+1 For getting to me to look at spatials again to find that they are no longer only MyISAM (5.0.16+)
Cez
I'll take a look at spatial extensions, thanks. Rewriting my query doesn't help as even my simple query doesn't use the index.
at