views:

86

answers:

2
+2  Q: 

HABTM Query help

I have a HABTM relationship between 'articles' and 'tags'

Problem: I'm only looking for articles with BOTH the tag 'sports' and 'outdoors' but not articles with only one of these tags.

I tried this:

SELECT DISTINCT article.id, article.name FROM articles
inner JOIN tags ON (tags.name IN ('outdoors', 'sports')
inner JOIN articles_tags ON articles_tags.article_id = article.id AND articles_tags.tag_id = tags.id

...but it gets me articles that are in only sports, only outdoors AND both sports + outdoors

Question what is the right query to use? (I'm using MySQL)

+1  A: 

Try this:

SELECT a1.id, a1.name FROM articles a1
    JOIN tags t1 ON t1.name ='outdoors'
    JOIN articles_tags at1 ON at1.article_id = a1.id AND at1.tag_id = t1.id
    JOIN tags t2 ON t2.name = 'sports'
    JOIN articles_tags at2 ON at2.article_id = a1.id AND at2.tag_id = t2.id
jamesaharvey
A: 

There are two common solutions.

  • The first solution uses GROUP BY to count the tags per article that match 'outdoors' or 'sports' and then returns only the groups that have both tags.

    SELECT a.id, a.name
    FROM articles AS a
    INNER JOIN articles_tags AS at ON (a.id = at.article_id)
    INNER JOIN tags AS t ON (t.id = at.tag_id)
    WHERE t.name IN ('outdoors', 'sports')
    GROUP BY a.id
    HAVING COUNT(DISTINCT t.name) = 2;
    

    This solution appears more readable to some people, and adding values is more straightforward. But GROUP BY queries in MySQL tend to incur a temporary table which harms performance.

  • The other solution uses a JOIN per distinct tag. By using inner joins, the query naturally restricts to articles that match all the tags you specify.

    SELECT a.id, a.name
    FROM articles AS a
    INNER JOIN articles_tags AS at1 ON (a.id = at1.article_id)
    INNER JOIN tags AS t1 ON (t1.id = at1.tag_id AND t1.name = 'outdoors')
    INNER JOIN articles_tags AS at2 ON (a.id = at2.article_id)
    INNER JOIN tags AS t2 ON (t2.id = at2.article_id AND t2.name = 'sports');
    

    Assuming tags.name and articles_tags.(article_id,tag_id) both have UNIQUE constraints, you shouldn't need a DISTINCT query modifier.

    This type of query tends to optimize better on MySQL than the GROUP BY solution, assuming you have defined appropriate indexes.


Re your followup question in the comment, I would do something like this:

SELECT a.id, a.name, GROUP_CONCAT(t3.tag) AS all_tags
FROM articles AS a
INNER JOIN articles_tags AS at1 ON (a.id = at1.article_id)
INNER JOIN tags AS t1 ON (t1.id = at1.tag_id AND t1.name = 'outdoors')
INNER JOIN articles_tags AS at2 ON (a.id = at2.article_id)
INNER JOIN tags AS t2 ON (t2.id = at2.article_id AND t2.name = 'sports');
INNER JOIN articles_tags AS at3 ON (a.id = at3.article_id)
INNER JOIN tags AS t3 ON (t3.id = at3.article_id);
GROUP BY a.id;

This still only finds articles that have both tags 'outdoors' and 'sports', but then it further joins these articles to all its tags.

This will return multiple rows per article (one for each tag) so we then use GROUP BY to reduce down to a single row per article again. GROUP_CONCAT() returns a comma-separated list of the values in the respective group.

Bill Karwin
thanks for the explanation... Also, is it possible within this query to return all the tags associated with the matching articles?
smchacko