I'm executing the following query:

SELECT COUNT(*)
FROM table
WHERE field1='value' AND (field2 >= 1000 OR field3 >= 2000)

There is one index on field1 and a composite index on (field2, field3).

MySQL always selects the field1 index and then makes a join using the other two fields, which is quite bad because it needs to join 146,000 rows.

Suggestions on how to improve this? Thanks

(EDIT AFTER TRYING SOLUTION PROPOSED)

Based on the proposed solution, here is what I've seen in MySQL while experimenting with it.

SELECT COUNT(*) FROM (SELECT * FROM table WHERE columnA = value1 UNION SELECT * FROM table WHERE columnB = value2) AS unionTable;

is a lot slower than executing:

SELECT COUNT(*) FROM table WHERE (columnA = value1 AND columnB = value2) OR (columnA = value1 AND columnC = value3)

with two composite indexes:

index1 (columnA, columnB)
index2 (columnA, columnC)

Interestingly, when I ask MySQL to "explain" the query, it always picks index1 in both cases, and index2 is not used.

If I change the indexes to:

index1 (columnB, columnA)
index2 (columnC, columnA)

And the query to:

SELECT COUNT(*) FROM table WHERE (columnB = value2 AND columnA = value1) OR (columnC = value3 AND columnA = value1)

Then this is the fastest variant I've found in MySQL.

+1  A: 

The typical way to break up OR predicates is with UNION.

Note that your example doesn't fit well with your indexes. Even if you omitted field1 from the predicate, you'd have field2 >= 1000 OR field3 >= 2000, which can't use your composite index on (field2, field3). If you had indexes on (field1, field2) and (field1, field3), or on field2 and field3 separately, you would get a reasonably fast query.

SELECT COUNT(*) FROM
(SELECT * FROM table WHERE field1 = 'value' AND field2 >= 1000
UNION
SELECT * FROM table WHERE field1 = 'value' AND field3 >= 2000) T

Note that you have to provide an alias for the derived table, which is why the subquery is aliased as T.
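
To check that the UNION rewrite counts the same rows as the OR predicate, here is a minimal sketch on a toy table. It uses SQLite through Python's sqlite3 module only so that it runs anywhere; it is not MySQL and says nothing about MySQL's index choice. The table name t, the id column, and the sample rows are assumptions made for illustration. The sketch selects id instead of * so that UNION's duplicate removal only merges the same row matched by both branches, which assumes the table has a unique key.

```python
import sqlite3

# Toy stand-in for the table in the question (hypothetical name and data).
# `id` is an assumed primary key; the original table may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, field1 TEXT, "
    "field2 INTEGER, field3 INTEGER)"
)
rows = [
    ("value", 1500, 100),   # matches the field2 branch only
    ("value", 100, 2500),   # matches the field3 branch only
    ("value", 1500, 2500),  # matches both branches
    ("value", 100, 100),    # matches neither branch
    ("other", 1500, 2500),  # excluded by field1
]
conn.executemany(
    "INSERT INTO t (field1, field2, field3) VALUES (?, ?, ?)", rows
)

# Original form: a single query with an OR predicate.
or_count = conn.execute(
    "SELECT COUNT(*) FROM t "
    "WHERE field1 = 'value' AND (field2 >= 1000 OR field3 >= 2000)"
).fetchone()[0]

# Rewritten form: one SELECT per OR branch, combined with UNION.
union_count = conn.execute(
    "SELECT COUNT(*) FROM ("
    "  SELECT id FROM t WHERE field1 = 'value' AND field2 >= 1000"
    "  UNION"
    "  SELECT id FROM t WHERE field1 = 'value' AND field3 >= 2000"
    ") AS u"
).fetchone()[0]

print(or_count, union_count)  # 3 3
```

The row that satisfies both branches is counted once in each form, so the two counts agree.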

A real-world example. Column and table names have been anonymized!

mysql> SELECT COUNT(*) FROM table;
+----------+
| COUNT(*) |
+----------+
|  3059139 |
+----------+
1 row in set (0.00 sec)

mysql> SELECT COUNT(*) FROM table WHERE columnA = value1;
+----------+
| COUNT(*) |
+----------+
|     1068 |
+----------+
1 row in set (0.00 sec)

mysql> SELECT COUNT(*) FROM table WHERE columnB = value2;
+----------+
| COUNT(*) |
+----------+
|      947 |
+----------+
1 row in set (0.00 sec)

mysql> SELECT COUNT(*) FROM table WHERE columnA = value1 OR columnB = value2;
+----------+
| COUNT(*) |
+----------+
|     1616 |
+----------+
1 row in set (9.92 sec)

mysql> SELECT COUNT(*) FROM (SELECT * FROM table WHERE columnA = value1
UNION SELECT * FROM table WHERE columnB = value2) T;
+----------+
| COUNT(*) |
+----------+
|     1616 |
+----------+
1 row in set (0.17 sec)

mysql> SELECT COUNT(*) FROM (SELECT * FROM table WHERE columnA = value1
UNION ALL SELECT * FROM table WHERE columnB = value2) T;
+----------+
| COUNT(*) |
+----------+
|     2015 |
+----------+
1 row in set (0.12 sec)
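
The counts in this output are consistent with each other, which can be verified with a little arithmetic. The snippet below only copies the numbers printed above; the variable names are made up for illustration.

```python
# Counts copied from the mysql output above.
count_a = 1068          # rows where columnA = value1
count_b = 947           # rows where columnB = value2
count_union = 1616      # UNION (duplicates removed), same as the OR query
count_union_all = 2015  # UNION ALL (duplicates kept)

# UNION ALL concatenates both result sets, so its count is the plain sum.
assert count_a + count_b == count_union_all

# Rows matching both predicates appear twice under UNION ALL, once under UNION.
overlap = count_union_all - count_union
print(overlap)  # 399
```

So 399 rows satisfy both predicates, and UNION counts each of them once.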
David M
You have a bug: You're counting elements that meet both criteria (where `field2 >= 1000` *and* `field3 >= 2000`) twice. (Hint: use a temporary table ;))
soulmerge
UNION is UNION DISTINCT by default. Duplicate rows are removed as part of the UNION construct. If one really wanted to count them twice, one would use 'UNION ALL'. Did you even try the statement I suggested on some similar table of your own?
David M
+1  A: 

I'm new here, so I can't comment on other people's posts, but this is related to the posts by David M. and soulmerge.

The temporary table is not necessary. The UNION David M. suggested does not double count, because UNION implies DISTINCT (i.e. if a row appears in one half of the union, its duplicate in the other half is dropped). If you used UNION ALL, a row matching both conditions would be returned twice.

The default behavior for UNION is that duplicate rows are removed from the result. The optional DISTINCT keyword has no effect other than the default because it also specifies duplicate-row removal. With the optional ALL keyword, duplicate-row removal does not occur and the result includes all matching rows from all the SELECT statements.

http://dev.mysql.com/doc/refman/5.0/en/union.html
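
A minimal way to see the difference is to put the same one-row SELECT on both sides of the operator. This sketch runs the statements in SQLite via Python's sqlite3 module purely as a convenient stand-in; the default duplicate removal it shows matches the MySQL behavior quoted above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The same single-row SELECT appears on both sides of the set operator.
distinct_rows = conn.execute(
    "SELECT COUNT(*) FROM (SELECT 1 AS x UNION SELECT 1) AS u"
).fetchone()[0]
all_rows = conn.execute(
    "SELECT COUNT(*) FROM (SELECT 1 AS x UNION ALL SELECT 1) AS u"
).fetchone()[0]

print(distinct_rows, all_rows)  # 1 2
```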

Brandongk