About the system:

- The system has a total of 8 tables:

  • Users
  • Tutor_Details (tutors are a type of user; the Tutor_Details table is linked to Users)
  • learning_packs (stores packs created by tutors)
  • learning_packs_tag_relations (holds tag relations meant for search)
  • tutors_tag_relations
  • tags
  • orders (containing purchase details of tutors' packs)
  • order_details (linked to orders and tutor_details)

For a clearer idea of the tables involved, please check the section "The tables" at the end.

- A tag-based search approach is being followed. Tag relations are created when new tutors register and when tutors create packs (this makes tutors and packs searchable). For details, please check the section "How tags work in this system?" below.

Following is a simpler representation (not the actual one) of the more complex query which I am trying to optimize. Parts written in backticks, like `some condition on Users table's fields`, are explanations standing in for the actual SQL:

select

SUM(DISTINCT( t.tag LIKE "%Dictatorship%" )) as key_1_total_matches,
SUM(DISTINCT( t.tag LIKE "%democracy%" )) as key_2_total_matches,
td.*, u.*, count(distinct(od.id_od)), `if (lp.id_lp > 0) then some conditional logic on lp fields else 0 as tutor_popularity`

from Tutor_Details AS td JOIN Users as u on u.id_user = td.id_user

LEFT JOIN Learning_Packs_Tag_Relations AS lptagrels ON td.id_tutor = lptagrels.id_tutor
LEFT JOIN Learning_Packs AS lp ON lptagrels.id_lp = lp.id_lp
LEFT JOIN `some other tables on lp.id_lp - let's call learning pack tables set (including Learning_Packs table)`

LEFT JOIN Order_Details as od on td.id_tutor = od.id_author
LEFT JOIN Orders as o on od.id_order = o.id_order

LEFT JOIN Tutors_Tag_Relations as ttagrels ON td.id_tutor = ttagrels.id_tutor

JOIN Tags as t on (t.id_tag = ttagrels.id_tag) OR (t.id_tag = lptagrels.id_tag)

where `some condition on Users table's fields`

AND CASE WHEN ((t.id_tag = lptagrels.id_tag) AND (lp.id_lp > 0)) THEN `some conditions on learning pack tables set` ELSE 1 END

AND CASE WHEN ((t.id_tag = wtagrels.id_tag) AND (wc.id_wc > 0)) THEN `some conditions on webclasses tables set` ELSE 1 END

AND CASE WHEN (od.id_od>0) THEN od.id_author = td.id_tutor and `some conditions on Orders table's fields` ELSE 1 END

AND ( t.tag LIKE "%Dictatorship%" OR t.tag LIKE "%democracy%")

group by td.id_tutor HAVING key_1_total_matches = 1 AND key_2_total_matches = 1
order by tutor_popularity desc, u.surname asc, u.name asc limit 0,20

=====================================================================

What does the above query do?

  • Does AND logic search on the search keywords (2 in this example - "Democracy" and "Dictatorship").
  • Returns only those tutors for which both the keywords are present in the union of the two sets - tutors details and details of all the packs created by a tutor.

To make things clear, suppose a tutor named "Sandeepan Nath" has created a pack "My first pack". Then:

  • Searching "Sandeepan Nath" returns Sandeepan Nath.
  • Searching "Sandeepan first" returns Sandeepan Nath.
  • Searching "Sandeepan second" does not return Sandeepan Nath.
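
The matching rule in these examples can be illustrated with a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL and a cut-down version of the tag tables; the function name search_tutors and the sample rows are hypothetical, and the publication, country, and order conditions of the real query are left out. It is one possible way to express the AND logic, not the query shown above.

```python
import sqlite3

# In-memory stand-in for the tag tables (only the columns needed here).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tags (id_tag INTEGER PRIMARY KEY, tag TEXT UNIQUE);
CREATE TABLE tutors_tag_relations (id_tag INTEGER, id_tutor INTEGER);
CREATE TABLE learning_packs_tag_relations (id_tag INTEGER, id_tutor INTEGER, id_lp INTEGER);

-- Tutor 1 is "Sandeepan Nath" and has created the pack "My first pack".
INSERT INTO tags VALUES (1,'Sandeepan'),(2,'Nath'),(3,'My'),(4,'first'),(5,'pack');
INSERT INTO tutors_tag_relations VALUES (1,1),(2,1);
INSERT INTO learning_packs_tag_relations VALUES (3,1,1),(4,1,1),(5,1,1);
""")


def search_tutors(conn, keywords):
    """Return ids of tutors whose tutor tags plus pack tags match every keyword."""
    if not keywords:
        return []
    # One condition per keyword: at least one of the tutor's tags must match it.
    conditions = " AND ".join("MAX(t.tag LIKE ?) = 1" for _ in keywords)
    sql = f"""
        SELECT rel.id_tutor
        FROM (
            -- union of the tutor's own tags and the tags of the tutor's packs
            SELECT id_tutor, id_tag FROM tutors_tag_relations
            UNION
            SELECT id_tutor, id_tag FROM learning_packs_tag_relations
        ) AS rel
        JOIN tags AS t ON t.id_tag = rel.id_tag
        GROUP BY rel.id_tutor
        HAVING {conditions}
        ORDER BY rel.id_tutor
    """
    params = [f"%{kw}%" for kw in keywords]
    return [row[0] for row in conn.execute(sql, params)]


print(search_tutors(conn, ["Sandeepan", "Nath"]))    # tutor and tutor tags
print(search_tutors(conn, ["Sandeepan", "first"]))   # tutor tag plus pack tag
print(search_tutors(conn, ["Sandeepan", "second"]))  # "second" matches nothing
```

The sketch collects both tag sources with a UNION before joining to tags, instead of joining tags with an OR condition; whether that form is faster on the real MySQL data would have to be checked with EXPLAIN.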

======================================================================================

The problem

The results returned by the above query are correct (the AND logic works as expected), but on heavily loaded databases the query takes about 25 seconds, against normal query timings of the order of 0.0002 to 0.005 seconds, which makes it totally unusable.

It is possible that some of the delay is caused by missing indexes, since not all the relevant fields have been indexed yet, but I would appreciate a better query as a solution, optimized as much as possible and returning the same results.

==========================================================================================

How tags work in this system?

  • When a tutor registers, tags are entered and tag relations are created with respect to the tutor's details like name, surname etc.
  • When a tutor creates packs, tags are again entered and tag relations are created with respect to the pack's details like pack name, description etc.
  • Tag relations for tutors are stored in tutors_tag_relations and those for packs are stored in learning_packs_tag_relations. All individual tags are stored in the tags table.
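
As a rough illustration of the flow described in this list, here is a minimal sketch, again assuming Python's sqlite3 as a stand-in for MySQL. The helper names get_or_create_tag, tag_tutor and tag_pack are hypothetical, and splitting the text on whitespace to obtain tags is an assumption; the actual system may derive tags differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tags (id_tag INTEGER PRIMARY KEY AUTOINCREMENT, tag TEXT UNIQUE);
CREATE TABLE tutors_tag_relations (id_tag INTEGER NOT NULL, id_tutor INTEGER);
CREATE TABLE learning_packs_tag_relations (id_tag INTEGER NOT NULL, id_tutor INTEGER, id_lp INTEGER);
""")


def get_or_create_tag(conn, word):
    """Return the id of the tag, inserting it into tags first if it is new."""
    # INSERT OR IGNORE is SQLite syntax; MySQL spells it INSERT IGNORE.
    conn.execute("INSERT OR IGNORE INTO tags (tag) VALUES (?)", (word,))
    return conn.execute("SELECT id_tag FROM tags WHERE tag = ?", (word,)).fetchone()[0]


def tag_tutor(conn, id_tutor, text):
    """Create tutor tag relations for each word (assumed: whitespace-split)."""
    for word in text.split():
        conn.execute(
            "INSERT INTO tutors_tag_relations (id_tag, id_tutor) VALUES (?, ?)",
            (get_or_create_tag(conn, word), id_tutor),
        )


def tag_pack(conn, id_tutor, id_lp, text):
    """Create pack tag relations for each word of the pack's details."""
    for word in text.split():
        conn.execute(
            "INSERT INTO learning_packs_tag_relations (id_tag, id_tutor, id_lp) VALUES (?, ?, ?)",
            (get_or_create_tag(conn, word), id_tutor, id_lp),
        )


tag_tutor(conn, 1, "Sandeepan Nath")     # tutor registers
tag_pack(conn, 1, 1, "My first pack")    # tutor creates a pack
```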

====================================================================

The tables

Most of the following tables contain many other fields which I have omitted here.

CREATE TABLE IF NOT EXISTS `users` (
  `id_user` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(100) NOT NULL DEFAULT '',
  `surname` varchar(155) NOT NULL DEFAULT '',
  PRIMARY KEY (`id_user`)
  ) ENGINE=InnoDB  DEFAULT CHARSET=utf8 AUTO_INCREMENT=636 ;

CREATE TABLE IF NOT EXISTS `tutor_details` (
  `id_tutor` int(10) NOT NULL AUTO_INCREMENT,
  `id_user` int(10) NOT NULL DEFAULT '0',
  PRIMARY KEY (`id_tutor`),
  KEY `Users_FKIndex1` (`id_user`)
) ENGINE=InnoDB  DEFAULT CHARSET=latin1 AUTO_INCREMENT=51 ;



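CREATE TABLE IF NOT EXISTS `orders` (
  `id_order` int(10) unsigned NOT NULL AUTO_INCREMENT,
  -- assumed definition: the column was omitted here but is required by the key and foreign key
  `id_user` int(10) unsigned NOT NULL DEFAULT '0',
  PRIMARY KEY (`id_order`),
  KEY `Orders_FKIndex1` (`id_user`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 AUTO_INCREMENT=275 ;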

ALTER TABLE `orders`
  ADD CONSTRAINT `Orders_ibfk_1` FOREIGN KEY (`id_user`) REFERENCES `users` (`id_user`) ON DELETE NO ACTION ON UPDATE NO ACTION;



CREATE TABLE IF NOT EXISTS `order_details` (
  `id_od` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `id_order` int(10) unsigned NOT NULL DEFAULT '0',
  `id_author` int(10) NOT NULL DEFAULT '0',
  PRIMARY KEY (`id_od`),
  KEY `Order_Details_FKIndex1` (`id_order`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 AUTO_INCREMENT=284 ;

ALTER TABLE `order_details`
  ADD CONSTRAINT `Order_Details_ibfk_1` FOREIGN KEY (`id_order`) REFERENCES `orders` (`id_order`) ON DELETE NO ACTION ON UPDATE NO ACTION;



CREATE TABLE IF NOT EXISTS `learning_packs` (
  `id_lp` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `id_author` int(10) unsigned NOT NULL DEFAULT '0',
  PRIMARY KEY (`id_lp`),
  KEY `Learning_Packs_FKIndex2` (`id_author`),
  KEY `id_lp` (`id_lp`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 AUTO_INCREMENT=23 ;


CREATE TABLE IF NOT EXISTS `tags` (
  `id_tag` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `tag` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id_tag`),
  UNIQUE KEY `tag` (`tag`),
  KEY `id_tag` (`id_tag`),
  KEY `tag_2` (`tag`),
  KEY `tag_3` (`tag`)
) ENGINE=InnoDB  DEFAULT CHARSET=latin1 AUTO_INCREMENT=3419 ;



CREATE TABLE IF NOT EXISTS `tutors_tag_relations` (
  `id_tag` int(10) unsigned NOT NULL DEFAULT '0',
  `id_tutor` int(10) DEFAULT NULL,
  KEY `Tutors_Tag_Relations` (`id_tag`),
  KEY `id_tutor` (`id_tutor`),
  KEY `id_tag` (`id_tag`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

ALTER TABLE `tutors_tag_relations`
  ADD CONSTRAINT `Tutors_Tag_Relations_ibfk_1` FOREIGN KEY (`id_tag`) REFERENCES `tags` (`id_tag`) ON DELETE NO ACTION ON UPDATE NO ACTION;


CREATE TABLE IF NOT EXISTS `learning_packs_tag_relations` (
  `id_tag` int(10) unsigned NOT NULL DEFAULT '0',
  `id_tutor` int(10) DEFAULT NULL,
  `id_lp` int(10) unsigned DEFAULT NULL,
  KEY `Learning_Packs_Tag_Relations_FKIndex1` (`id_tag`),
  KEY `id_lp` (`id_lp`),
  KEY `id_tag` (`id_tag`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

ALTER TABLE `learning_packs_tag_relations`
  ADD CONSTRAINT `Learning_Packs_Tag_Relations_ibfk_1` FOREIGN KEY (`id_tag`) REFERENCES `tags` (`id_tag`) ON DELETE NO ACTION ON UPDATE NO ACTION;

===================================================================================

Following is the exact query (this includes classes also - tutors can create classes and search terms are matched with classes created by tutors):-

SELECT SUM(DISTINCT( t.tag LIKE "%Dictatorship%" )) AS key_1_total_matches,
       SUM(DISTINCT( t.tag LIKE "%democracy%" ))    AS key_2_total_matches,
       COUNT(DISTINCT( od.id_od ))                  AS tutor_popularity,
       CASE
         WHEN ( IF(( wc.id_wc > 0 ), ( wc.wc_api_status = 1
                                       AND wc.wc_type = 0
                                       AND wc.class_date > '2010-06-01 22:00:56'
                                       AND wccp.status = 1
                                       AND ( wccp.country_code = 'IE'
                                              OR wccp.country_code IN ( 'INT' )
                                           ) ), 0)
              ) THEN 1
         ELSE 0
       END                                          AS 'classes_published',
       CASE
         WHEN ( IF(( lp.id_lp > 0 ), ( lp.id_status = 1
                                       AND lp.published = 1
                                       AND lpcp.status = 1
                                       AND ( lpcp.country_code = 'IE'
                                              OR lpcp.country_code IN ( 'INT' )
                                           ) ), 0)
              ) THEN 1
         ELSE 0
       END                                          AS 'packs_published',
       td.*,
       u.*
FROM   tutor_details AS td
       JOIN users AS u
         ON u.id_user = td.id_user
       LEFT JOIN learning_packs_tag_relations AS lptagrels
         ON td.id_tutor = lptagrels.id_tutor
       LEFT JOIN learning_packs AS lp
         ON lptagrels.id_lp = lp.id_lp
       LEFT JOIN learning_packs_categories AS lpc
         ON lpc.id_lp_cat = lp.id_lp_cat
       LEFT JOIN learning_packs_categories AS lpcp
         ON lpcp.id_lp_cat = lpc.id_parent
       LEFT JOIN learning_pack_content AS lpct
         ON ( lp.id_lp = lpct.id_lp )
       LEFT JOIN webclasses_tag_relations AS wtagrels
         ON td.id_tutor = wtagrels.id_tutor
       LEFT JOIN webclasses AS wc
         ON wtagrels.id_wc = wc.id_wc
       LEFT JOIN learning_packs_categories AS wcc
         ON wcc.id_lp_cat = wc.id_wp_cat
       LEFT JOIN learning_packs_categories AS wccp
         ON wccp.id_lp_cat = wcc.id_parent
       LEFT JOIN order_details AS od
         ON td.id_tutor = od.id_author
       LEFT JOIN orders AS o
         ON od.id_order = o.id_order
       LEFT JOIN tutors_tag_relations AS ttagrels
         ON td.id_tutor = ttagrels.id_tutor
       JOIN tags AS t
         ON ( t.id_tag = ttagrels.id_tag )
             OR ( t.id_tag = lptagrels.id_tag )
             OR ( t.id_tag = wtagrels.id_tag )
WHERE  ( u.country = 'IE'
          OR u.country IN ( 'INT' ) )
       AND CASE
             WHEN ( ( t.id_tag = lptagrels.id_tag )
                    AND ( lp.id_lp > 0 ) ) THEN lp.id_status = 1
                                                AND lp.published = 1
                                                AND lpcp.status = 1
                                                AND ( lpcp.country_code = 'IE'
                                                       OR lpcp.country_code IN (
                                                          'INT'
                                                          ) )
             ELSE 1
           END
       AND CASE
             WHEN ( ( t.id_tag = wtagrels.id_tag )
                    AND ( wc.id_wc > 0 ) ) THEN wc.wc_api_status = 1
                                                AND wc.wc_type = 0
                                                AND wc.class_date > '2010-06-01 22:00:56'
                                                AND wccp.status = 1
                                                AND ( wccp.country_code = 'IE'
                                                       OR wccp.country_code IN (
                                                          'INT'
                                                          ) )
             ELSE 1
           END
       AND CASE
             WHEN ( od.id_od > 0 ) THEN od.id_author = td.id_tutor
                                        AND o.order_status = 'paid'
                                        AND CASE
                                              WHEN ( od.id_wc > 0 ) THEN od.can_attend_class = 1
                                              ELSE 1
                                            END
             ELSE 1
           END
GROUP  BY td.id_tutor
HAVING key_1_total_matches = 1
       AND key_2_total_matches = 1
ORDER  BY tutor_popularity DESC,
          u.surname ASC,
          u.name ASC
LIMIT  0, 20  

Please note: the provided database structure does not show all the fields and tables used in this query.

The explain query output:- Please see this screenshot http://www.test.examvillage.com/Explain_query.jpg

A: 

To help, we would need information on row counts, value distributions, indexes, the size of the database, the size of memory, and the disk layout (RAID 0, 5, etc.), as well as how many users are hitting your database when queries are slow and what other queries are running. All of these factor into performance.

Also, a printout of the EXPLAIN plan output may shed some light on the cause if it is simply a query or index issue. The exact query would be needed as well.

Khorkrak
Hello Khorkrak, thanks for answering my long question. Please check the complete query at the end of my question.
sandeepan
A: 
  1. You really should use some better formatting for the query. Just add at least 4 spaces to the beginning of each row to get this nice code formatting.

    SELECT * FROM sometable
        INNER JOIN anothertable ON sometable.id = anothertable.sometable_id
    

    Or have a look here: http://stackoverflow.com/editing-help

  2. Could you provide the execution plan from mysql? You need to add "EXPLAIN" to the query and copy the result.

    EXPLAIN SELECT * FROM ...complexquery...
    

    will give you some useful hints (execution order, returned rows, available/used indexes)

SchlaWiener
Hello SchlaWiener, I could not indent the queries but have put them in clear separate blocks now. Also, I have copied and pasted the explain query output at the end of my answer. Sorry, I do not even have MS Paint on my system to take a snapshot. I don't know why no file attachment is allowed on Stack Overflow. Please see if you can give some idea. Thanks, Sandeepan
sandeepan
Please see if you can answer my new question: http://stackoverflow.com/questions/3030022/mysql-help-me-alter-this-search-query-to-get-desired-results
sandeepan
A: 

Your question is, "how can I find tutors that match certain tags?" That's not a hard question, so the query to answer it shouldn't be hard either.

Something like:

SELECT *
FROM tutors
WHERE tags LIKE '%Dictator%' AND tags LIKE '%Democracy%'

That will work, if you modify your design to have a "tags" field in your "tutors" table, in which you put all the tags that apply to that tutor. It will eliminate layers of joins and tables.
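
A minimal sketch of that idea, assuming Python's built-in sqlite3 as a stand-in for MySQL; the tutors table layout, the space-separated tags column, and the sample rows are all hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical denormalized table: every tag that applies to a tutor
-- (own details plus packs) is kept in a single text column.
CREATE TABLE tutors (id_tutor INTEGER PRIMARY KEY, name TEXT, tags TEXT);
INSERT INTO tutors VALUES
    (1, 'Tutor A', 'history Dictatorship Democracy'),
    (2, 'Tutor B', 'history Democracy');
""")

# AND logic: the tags column must contain both keywords.
rows = conn.execute(
    "SELECT id_tutor FROM tutors WHERE tags LIKE ? AND tags LIKE ?",
    ("%Dictator%", "%Democracy%"),
).fetchall()
matching_ids = [row[0] for row in rows]
print(matching_ids)
```

One trade-off to keep in mind: a LIKE pattern with a leading % generally cannot use an ordinary index, so this still scans the tutors table, and the tags column has to be kept up to date whenever a tutor or pack changes.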

Are all those layers of joins and tables providing real functionality, or just more programming headaches? Think about the functionality that your app REALLY needs, and then simplify your database design!!

Summer