tags:

views:

88

answers:

5

so, I have tried everything I can think of, and can't get this query to happen in less than 3 seconds on my local server. I know the problem has to do with the OR referencing both the owner_id and the person_id. if I run one or the other it happens instantly, but together with an or I can't seem to make it work - I looked into rewriting the code, but the way the app was designed it won't be easy. is there a way I can call an equivalent or that won't take so long? here is the sql:

SELECT event_types.name as event_type_name,event_types.id as id, count(events.id) as 
count,sum(events.estimated_duration) as time_sum FROM events,event_types 
WHERE event_types.id = events.event_type_id AND events.event_type_id != '4' 
AND ( events.status!='cancelled') 
AND events.event_type_id != 64 
AND ( events.owner_id = 161 OR events.person_id = 161 ) 
GROUP BY event_types.name 
ORDER BY event_types.name DESC;

Here's the Explain soup, although I'm guessing it's unnecessary cause there is probably a better way to structure that or that is obvious:

thanks so much! chris.

+----+-------------+-------------+-------+---------------------------------------------------------------------------------------------------------+-------------------------------+---------+-------------------------------------+------+----------------------------------------------+
| id | select_type | table       | type  | possible_keys                                                                                           | key                           | key_len | ref                                 | rows | Extra                                        |
+----+-------------+-------------+-------+---------------------------------------------------------------------------------------------------------+-------------------------------+--
|  1 | SIMPLE      | event_types | range | PRIMARY                                                                                                 | PRIMARY                       | 4       | NULL                                |   78 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | events      | ref   | index_events_on_status,index_events_on_event_type_id,index_events_on_person_id,index_events_on_owner_id | index_events_on_event_type_id | 5       | thenumber_production.event_types.id |  907 | Using where                                  |
+----+-------------+-------------+-------+---------------------------------------------------------------------------------------------------------+-------------------------------+---------+-------------------------------------+------+----------------------------------------------+
A: 

The query is going to be a problem. The normal solution to factoring out OR conditions (which are never going to be performant) is to use a UNION ALL ie:

SELECT *
FROM a
WHERE field1 = 1 OR field2 = 2

to:

SELECT *
FROM a
WHERE field1 = 1
UNION ALL
SELECT *
FROM a
WHERE field2 = 2

The works so long as you don't get (or don't mind) duplicates. If you have duplicates and need to factor them out you can use a UNION instead but that does an implicit DISTINCT aggregation so will be much less performant.

Also you want to sort the result of that, which again is going to be a problem.

This is one of those cases where your best approach is to change your data model so it lends itself to writing performant queries (ie so you don't need to use an OR at all).

One question: can the owner_id and person_id for an event ever be the same?

cletus
yes, they could conceivably be the same - although it's uncommon. it sound's like it's time to dig deeper, and start refactoring the data model. Out of curiousity, would the union all work with the GROUP BY?
Chris n
UNIONs are just concatenating query results basically so yes you could GROUP BY each piece (or even GROUP BY one and not the other). The columns simply have to match up in number and type.
cletus
A: 

How about using two subqueries, each of which does one of the two "or" paths, and union those subqueries to sum up the totals in a top-level select?

Something like: (and the syntax is not exact here, as I'm going from memory):

select (bname, id, sum(id), sum(time_sum) from ((select ... big query with owner_id) union (select ... big query with person_id))

Jon Watte
A: 

Maybe try with:

   SELECT event_types.name AS event_type_name, event_types.id AS id,
          COUNT(events.id) AS count, SUM(events.estimated_duration) AS time_sum
     FROM events
     JOIN event_types ON event_types.id = event.event_type_id
    WHERE events.event_type_id <> 4
      AND events.status <> 'cancelled'
      AND events.event_type_id <> 64 
      AND ( events.owner_id = 161 OR events.person_id = 161 ) 
 GROUP BY event_types.id 
 ORDER BY event_types.name DESC;

And here are some hints:

  • use <> instead of !=
  • use int values 4 for id's instead of string '4'
hsz
Why is `<>` different to `!=`?
Alex
Just to clarify, <> and != are semantically identical in mysql (and most other flavors of sql) so that will give you *no* performance benefit. However, changing `'4'` to just 4 will improve performance because otherwise mysql has to do a conversion on each comparison and even more importantly, it can't use an index when it has to do the value casting.
Rob Van Dam
A: 

This is what I would try.

I would make sure I have unique keys on both 'owner_id' and 'person_id':

alter table events add unique (person_id, id), add unique (owner_id, id);

You should also regularly run this:

analyze table events;

Give that a try.

Another option would be to try and move the bad part of the query into a join predicate.

SELECT event_types.name as event_type_name,event_types.id as id, count(events.id) as 
count,sum(events.estimated_duration) as time_sum FROM events,event_types
JOIN event_types on event_types.id = events.event_type_id
AND ( events.owner_id = 161 OR events.person_id = 161 )
WHERE events.event_type_id != '4' 
AND ( events.status!='cancelled') 
AND events.event_type_id != 64 
GROUP BY event_types.name 
ORDER BY event_types.name DESC;

Note that I haven't tried any of this, and it is all off the top of my head. But it is what I would try first.

Phil Wallach
A: 

try breaking it up in 2 queries with an UNION to join the resultset, one with events.owner_id = 161 and the other with events.person_id = 161.

the "using filesort, using temporary" is also a very bad sign, you should probably add an index on event_types.names.

ggiroux