views:

116

answers:

3

About the system: - There are tutors who create classes and packs - A tags based search approach is being followed.Tag relations are created when new tutors register and when tutors create packs (this makes tutors and packs searcheable). For details please check the section How tags work in this system? below.

Following is the concerned query

SELECT SUM(DISTINCT( t.tag LIKE "%Dictatorship%" )) AS key_1_total_matches,
       SUM(DISTINCT( t.tag LIKE "%democracy%" ))    AS key_2_total_matches,
       COUNT(DISTINCT( od.id_od ))                  AS tutor_popularity,
       CASE
         WHEN ( IF(( wc.id_wc > 0 ), ( wc.wc_api_status = 1
                                       AND wc.wc_type = 0
                                       AND wc.class_date > '2010-06-01 22:00:56'
                                       AND wccp.status = 1
                                       AND ( wccp.country_code = 'IE'
                                              OR wccp.country_code IN ( 'INT' )
                                           ) ), 0)
              ) THEN 1
         ELSE 0
       END                                          AS 'classes_published',
       CASE
         WHEN ( IF(( lp.id_lp > 0 ), ( lp.id_status = 1
                                       AND lp.published = 1
                                       AND lpcp.status = 1
                                       AND ( lpcp.country_code = 'IE'
                                              OR lpcp.country_code IN ( 'INT' )
                                           ) ), 0)
              ) THEN 1
         ELSE 0
       END                                          AS 'packs_published',
       td . *,
       u . *
FROM   tutor_details AS td
       JOIN users AS u
         ON u.id_user = td.id_user
       LEFT JOIN learning_packs_tag_relations AS lptagrels
         ON td.id_tutor = lptagrels.id_tutor
       LEFT JOIN learning_packs AS lp
         ON lptagrels.id_lp = lp.id_lp
       LEFT JOIN learning_packs_categories AS lpc
         ON lpc.id_lp_cat = lp.id_lp_cat
       LEFT JOIN learning_packs_categories AS lpcp
         ON lpcp.id_lp_cat = lpc.id_parent
       LEFT JOIN learning_pack_content AS lpct
         ON ( lp.id_lp = lpct.id_lp )
       LEFT JOIN webclasses_tag_relations AS wtagrels
         ON td.id_tutor = wtagrels.id_tutor
       LEFT JOIN webclasses AS wc
         ON wtagrels.id_wc = wc.id_wc
       LEFT JOIN learning_packs_categories AS wcc
         ON wcc.id_lp_cat = wc.id_wp_cat
       LEFT JOIN learning_packs_categories AS wccp
         ON wccp.id_lp_cat = wcc.id_parent
       LEFT JOIN order_details AS od
         ON td.id_tutor = od.id_author
       LEFT JOIN orders AS o
         ON od.id_order = o.id_order
       LEFT JOIN tutors_tag_relations AS ttagrels
         ON td.id_tutor = ttagrels.id_tutor
       JOIN tags AS t
         ON ( t.id_tag = ttagrels.id_tag )
             OR ( t.id_tag = lptagrels.id_tag )
             OR ( t.id_tag = wtagrels.id_tag )
WHERE  ( u.country = 'IE'
          OR u.country IN ( 'INT' ) )
       AND CASE
             WHEN ( ( t.id_tag = lptagrels.id_tag )
                    AND ( lp.id_lp > 0 ) ) THEN lp.id_status = 1
                                                AND lp.published = 1
                                                AND lpcp.status = 1
                                                AND ( lpcp.country_code = 'IE'
                                                       OR lpcp.country_code IN (
                                                          'INT'
                                                          ) )
             ELSE 1
           END
       AND CASE
             WHEN ( ( t.id_tag = wtagrels.id_tag )
                    AND ( wc.id_wc > 0 ) ) THEN wc.wc_api_status = 1
                                                AND wc.wc_type = 0
                                                AND
             wc.class_date > '2010-06-01 22:00:56'
                                                AND wccp.status = 1
                                                AND ( wccp.country_code = 'IE'
                                                       OR wccp.country_code IN (
                                                          'INT'
                                                          ) )
             ELSE 1
           END
       AND CASE
             WHEN ( od.id_od > 0 ) THEN od.id_author = td.id_tutor
                                        AND o.order_status = 'paid'
                                        AND CASE
             WHEN ( od.id_wc > 0 ) THEN od.can_attend_class = 1
             ELSE 1
                                            END
             ELSE 1
           END
       AND ( t.tag LIKE "%Dictatorship%"
              OR t.tag LIKE "%Democracy%" )
GROUP  BY td.id_tutor
HAVING key_1_total_matches = 1
       AND key_2_total_matches = 1
ORDER  BY tutor_popularity DESC,
          u.surname ASC,
          u.name ASC
LIMIT  0, 20  

The problem

The results returned by the above query are correct (AND logic working as per expectation), but the time taken by the query rises alarmingly for heavier data and for the current data I have it is like 25 seconds as against normal query timings of the order of 0.005 - 0.0002 seconds, which makes it totally unusable.

It is possible that some of the delay is being caused because all the possible fields have not yet been indexed. The tag field of tags table is indexed. Is there something faulty with the query? What can be the reason behind 20+ seconds of execution time?

How tags work in this system?

  • When a tutor registers, tags are entered and tag relations are created with respect to tutor's details like name, surname etc.
  • When a Tutors create packs, again tags are entered and tag relations are created with respect to pack's details like pack name, description etc.
  • tag relations for tutors stored in tutors_tag_relations and those for packs stored in learning_packs_tag_relations. All individual tags are stored in tags table.

The explain query output:- Please see this screenshot - http://www.test.examvillage.com/Explain_query.jpg

+2  A: 

You may see if it helps adding indexes on following fields:

lptagrels.id_tutor
wtagrels.id_tutor
od.id_author

and then try to get rid of the case-when structures from the where clause. You can add some table specific restrictions directly to join like: left join t2 on t1.id = t2.id AND ... to make code more readable.

EDIT: Seems you have a wrong approach here: you search for all tags and then count the tags that match search. Instead you should search for tags that match and then count results that have these tags.

Imre L
I can understand what you are saying by the wrong approach.If you could alter my query and at least schematically point out how I could do that...Thanks
sandeepan
I have now added " AND ( t.tag LIKE "%Dictatorship%" OR t.tag LIKE "%Democracy%" )" condition at the end of the where clause, but I think it still does not "search for tags first and then count results" like you have suggested. Can you give any idea about how to do a temporary table based query, like ceteras has suggested?
sandeepan
Like you said I am now finding the tags which match in one query and then running the main query on a temporary table. But still facing some issues. Please check if you can help me http://stackoverflow.com/questions/3030022/mysql-help-me-alter-this-search-query-to-get-desired-results
sandeepan
A: 

Take this code out the back and shoot it.

Then start again.

I'm not being flippant but this is horrific and you would do yourself and anyone else touching it in the future a big favour by getting rid of it right now.

Brian
+1  A: 

I've recently had a similar problem. I had to modify a query to implement a new feature, and that meant adding several joins and left joins. The logic was correctly implemented, but it took forever with some bigger tables.

The solution was a complete rewrite, as Brian suggests.

My new approach was something like this:

  • create a temporary table and insert here all relevant data that might end up in the final result set
  • run several updates on this table, joining the required tables one at a time instead of all of them at the same time
  • finally perform a query on this temporary table to extract the end result

All this was done in a stored procedure, the end result has passed unit tests, and is blazing fast.

UPDATE

Please test this query, to see if it returns the same results as the original. If it does, then I will further try to find a better implementation.

SELECT SUM(DISTINCT( t.tag LIKE "%Dictatorship%" )) AS key_1_total_matches,
       SUM(DISTINCT( t.tag LIKE "%democracy%" ))    AS key_2_total_matches,
       COUNT(DISTINCT( od.id_od ))                  AS tutor_popularity,
        (wc.id_wc > 0 
        AND wc.wc_api_status = 1
        AND wc.wc_type = 0
        AND wc.class_date > '2010-06-01 22:00:56'
        AND wccp.status = 1
        AND ( wccp.country_code = 'IE' OR wccp.country_code IN ( 'INT' )) 
        ) AS 'classes_published',
        (lp.id_lp > 0  
        AND lp.id_status = 1
        AND lp.published = 1
        AND lpcp.status = 1
        AND ( lpcp.country_code = 'IE' OR lpcp.country_code IN ( 'INT' ) ) 
        ) AS 'packs_published',
       td . *,
       u . *
FROM   tutor_details AS td JOIN users AS u ON u.id_user = td.id_user
            LEFT JOIN order_details AS od                           ON td.id_tutor = od.id_author
                LEFT JOIN orders AS o                               ON od.id_order = o.id_order
            LEFT JOIN learning_packs_tag_relations AS lptagrels     ON td.id_tutor = lptagrels.id_tutor -- 
               LEFT JOIN learning_packs AS lp                       ON lptagrels.id_lp = lp.id_lp
                   LEFT JOIN learning_packs_categories AS lpc       ON lpc.id_lp_cat = lp.id_lp_cat
                       LEFT JOIN learning_packs_categories AS lpcp  ON lpcp.id_lp_cat = lpc.id_parent
                   LEFT JOIN learning_pack_content AS lpct          ON ( lp.id_lp = lpct.id_lp )
            LEFT JOIN webclasses_tag_relations AS wtagrels          ON td.id_tutor = wtagrels.id_tutor  -- 
               LEFT JOIN webclasses AS wc                           ON wtagrels.id_wc = wc.id_wc
                   LEFT JOIN learning_packs_categories AS wcc       ON wcc.id_lp_cat = wc.id_wp_cat
                       LEFT JOIN learning_packs_categories AS wccp  ON wccp.id_lp_cat = wcc.id_parent
            LEFT JOIN tutors_tag_relations AS ttagrels              ON td.id_tutor = ttagrels.id_tutor -- 
        JOIN tags AS t                                              ON ( t.id_tag = ttagrels.id_tag ) 
                                                                        OR ( t.id_tag = lptagrels.id_tag ) 
                                                                        OR ( t.id_tag = wtagrels.id_tag )
WHERE  ( u.country = 'IE' OR u.country IN ( 'INT' ) )
       AND (NOT ( t.id_tag = lptagrels.id_tag AND lp.id_lp > 0) 
            or (lp.id_status = 1
                AND lp.published = 1
                AND lpcp.status = 1
                AND ( lpcp.country_code = 'IE' OR lpcp.country_code IN ('INT') )
                ) 
            )
       AND (not (t.id_tag = wtagrels.id_tag AND wc.id_wc > 0 )
            or (
                wc.wc_api_status = 1
                AND wc.wc_type = 0
                AND wc.class_date > '2010-06-01 22:00:56'
                AND wccp.status = 1
                AND ( wccp.country_code = 'IE' OR wccp.country_code IN ('INT' ) )
                )
            )
       AND (NOT (od.id_od > 0) 
            OR (
                od.id_author = td.id_tutor
                AND o.order_status = 'paid'
                AND (NOT (od.id_wc > 0) OR od.can_attend_class = 1)
                )
        )
       AND ( t.tag LIKE "%Dictatorship%" OR t.tag LIKE "%Democracy%" )
GROUP  BY td.id_tutor
HAVING key_1_total_matches = 1
       AND key_2_total_matches = 1
ORDER  BY tutor_popularity DESC,
          u.surname ASC,
          u.name ASC
LIMIT  0, 20  
ceteras
Do you mean views by saying temporary tables?
sandeepan
No, as views would not help at all in your case. I mean temporary table, literally.
ceteras
ok I studied temporary tables. If you could at least suggest a schematic approach of the sequence of queries taking my example - what and how much data to store in the temp table (I guess, in my case td.*, u.* etc.)- What updates I should to join the required tables. that could give me a good start!
sandeepan
Sorry for the late answering. Well, the idea is first to put in a temporary table all columns you will be using in the "select" clause. Then you need at least one more column to it, say "eligible tinyint not null default 1". You will first insert raw data in this table, with an "as simple as possible query", the idea is to make it fast, even if some rows in the table would not fit your initial "where" clause. After this, you can eliminate them with updates on the temporary table, setting "eligible=0" where appropriate.Your query is way too complex for me to grasp, but I'll give it a try later
ceteras
I've started studying your query more closer and it appears to me there might be possible to make some changes and retain the logic.It might help for a better understanding. For example, you have constructions like "case when a then b else 1 end" , this equals to (not a) or b. I'd like to rewrite first your query with some changes if you can validate the result on a small database. It will not be faster, but it will help me suggest a better approach.
ceteras
Also there's a "case when a then 1 else 0 end" which equals to "a". Look closer at `AS 'classes_published'` case.
ceteras
sorry I did not see your answers, I will check and reply tomorrow. Thanks for replying. I have meanwhile improved my search logic a little bit. To reduce the complexity and the number of joins, I am now creating a temporary table during search execution time called All_Tag_Relations which contains all the data from Tutors_Tag_Relations and Webclasses_Tag_Relations(ignoring Learning_Packs_Tag_Relations for now). There is still an issue, please check my new question, if you can help http://stackoverflow.com/questions/3030022/mysql-help-me-alter-this-search-query-to-get-desired-results
sandeepan
Hi ceteras, I tried your query and it gives same results as my query.Thanks for trying...
sandeepan