tags:

views:

57

answers:

2

This is kinda weird. I have the next query:

SELECT * , GROUP_CONCAT( x.tag
SEPARATOR ',' ) AS tags
FROM tag AS t, tag AS x, tag_message_rel AS r, message m
INNER JOIN `user` AS u ON m.user_id = u.id
WHERE t.tag
IN (
'kikikiki', 'dsa'
)
AND m.id = r.message_id
AND t.id = r.tag_id
AND x.id = r.tag_id
GROUP BY m.id
HAVING COUNT( * ) >=2
ORDER BY m.created_at DESC
LIMIT 0 , 20

As you can see i use t to join find the messages that i want, on the other side i use x to print the tags of a message. I i erase the line:

AND x.id = r.tag_id

I will get the messages that i want, but tags will have ALL the tags in the tags table separated by coma. If i leave the line there, i will only get those 2 tags. If i use explain i get:

id  select_type  table  type  possible_keys  key  key_len  ref  rows  Extra
1   SIMPLE  u  system  PRIMARY  NULL  NULL  NULL  1  Using temporary; Using filesort
1   SIMPLE  t  range  PRIMARY,tag  tag  252  NULL  2  Using where
1   SIMPLE  x  eq_ref  PRIMARY  PRIMARY  4  verse.t.id  1   
1   SIMPLE  r  ALL  NULL  NULL  NULL  NULL  180  Using where; Using join buffer
1   SIMPLE  m  eq_ref  PRIMARY  PRIMARY  4  verse.r.message_id  1  Using where

Now im no expert in this, but i think the problem is that it is refusing to re-join a table in the process of optimizing the query.

What do you think? Any quick fix?

A: 

You only get those two tags, because:

  1. t.tag IN ('kikikiki', 'dsa'). 2 t.id IN (id-for-kikikiki, id-for-dsa)
  2. t.id = r.tag_id
  3. r.tag_id = x.id
  4. Combining 3 & 4, t.id = x.id
  5. Combining everything, x.tag IN ('kikikiki', 'dsa')

I think what you want is to join a second tag_message_rel. Join t to the first one; join x to the second one; join them both to message. Not completely sure this is what you want, as you didn't actually say...

derobert
+1  A: 

The problem is that you are trying to join to the tag table twice, but you really need to join to the tag_message_rel table twice, and from each of these to the respective row in the tag table.

Think of "table aliases" as referring to a row in a table, not the table itself. That idea helped me to understand complex joins a lot better.

Here's how I'd write that query:

SELECT m.*, u.*, GROUP_CONCAT(DISTINCT x.tag) AS tags
FROM message m
 JOIN `user` u ON (u.id = m.user_id)
 JOIN tag_message_rel r1 ON (m.id = r1.message_id)
 JOIN tag t ON (t.id = r1.tag_id)
 JOIN tag_message_rel r2 ON (m.id = r2.message_id)
 JOIN tag x ON (x.id = r2.tag_id)
WHERE t.tag IN ('kikikiki', 'dsa')
GROUP BY m.id
HAVING COUNT(DISTINCT t.tag) = 2
ORDER BY m.created_at DESC
LIMIT 0 , 20;

You should develop the habit of using JOIN syntax consistently. Mixing JOIN and comma-style joins can cause some subtle problems.

Here's an alternative query, that pulls some of the joins into a non-correlated subquery, so you avoid a Cartesian product between t and x, and eliminate the DISTINCT modifiers in the group functions.

SELECT m.*, u.*, GROUP_CONCAT(x.tag) AS tags
FROM message m
 JOIN `user` u ON (u.id = m.user_id)
 JOIN tag_message_rel r ON (m.id = r.message_id)
 JOIN tag x ON (x.id = r.tag_id)
WHERE m.id = ANY (
    SELECT m2.id 
    FROM message m2 
     JOIN tag_message_rel r2 ON (m2.id = r2.message_id)
     JOIN tag t ON (t.id = r2.tag_id)
    WHERE t.tag IN ('kikikiki', 'dsa') 
    GROUP BY m2.id 
    HAVING COUNT(t.tag) = 2)
GROUP BY m.id
ORDER BY m.created_at DESC
LIMIT 0 , 20;
Bill Karwin