+1  Q: 

MINUS in MySQL?

I have topics(id*) and tags(id*,name) and a linking table topic_tags(topicFk*,tagFk*).

Now I want to select every single topic, that has all of the good tags (a,b,c) but none of the bad tags (d,e,f).

How do I do that?

+2  A: 

Here's a solution that would work, but requires a join for every tag you require.

SELECT *
FROM topics
WHERE topic_id IN
    (SELECT a.topic_id
    FROM topic_tags a
    INNER JOIN topic_tags b
      on a.topic_id=b.topic_id
      and b.tag = 'b'
    INNER JOIN topic_tags c
      on b.topic_id=c.topic_id
      and c.tag = 'c'
    WHERE a.tag = 'a')
AND topic_id NOT IN
    (SELECT topic_id
    FROM topic_tags
    WHERE tag = 'd' or tag = 'e' or tag = 'f')
Tom Ritter
A: 

Not completely sure I understand, and I hope there's a better way to do the good tags part, but:

select id from topic
    inner join topic_tags tta on topic.id=tta.topicFk and tta.tagFk=a
    inner join topic_tags ttb on topic.id=ttb.topicFk and ttb.tagFk=b
    inner join topic_tags ttc on topic.id=ttc.topicFk and ttc.tagFk=c
    left join topic_tags tt on topic.id=tt.topicFk and tt.tagFk in (d,e,f)
    where tt.topicFk is null;

Update: something like this:

select id from topic
    left join topic_tags tt on topic.id=tt.topicFk and tt.tagFk in (d,e,f)
    where tt.topicFk is null and
        3=(select count(*) from topic_tags where topicFk=topic.id and tagFk in (a,b,c));

I see one answer assuming a,b,c,d,e,f are names, not ids. If so, then this:

select topic.id from topic
    left join (topic_tags tt
        inner join tags on tt.tagFk=tags.id and tags.name in ('d','e','f'))
      on topic.id=tt.topicFk
    where tt.topicFk is null and
       3=(select count(*) from tags inner join topic_tags on tags.id=topic_tags.tagFk and topic_tags.topicFk=topic.id where tags.name in ('a','b','c'));
ysth
+1  A: 

As I wrote this, 3 other answers came in, but this is different so I'll post it anyway.

The idea is to select all topics which have a,b,c tags, then identify those topics that also have d,e,f with a left join, and then filter those out with a where clause looking for nulls on that join...

select distinct topics.id from topics 
inner join topic_tags as t1 
    on (t1.topicFK=topics.id)
inner join tags as goodtags 
    on(goodtags.id=t1.tagFK and goodtags.name in ('a', 'b', 'c'))
-- the bad-tag join is nested, so a topic is dropped if any of its tags is bad
left join (topic_tags as t2 
    inner join tags as badtags 
        on (badtags.id=t2.tagFK and badtags.name in ('d', 'e', 'f')))
    on (t2.topicFK=topics.id)
where t2.topicFK is null;

Totally untested, but hopefully you see where the logic is coming from.

Paul Dixon
Coming back to this, I should add that this answers the question "which topics have at least one of the a, b or c tags, but none of the d, e or f tags" - but the question was phrased as needing *all* the a,b,c tags. I see Marc has posted a solution which neatly corrects for this by cunningly counting the good tags.
Paul Dixon
A: 

You can use the MINUS keyword to filter out topics with undesired tags.

-- All topics with desired tags.
select distinct T.*
from Topics T inner join Topics_Tags R on T.id = R.topicFK
              inner join Tags U on U.id = R.tagFK
where U.name in ('a', 'b', 'c')

minus

-- All topics with undesired tags. These are filtered out.
select distinct T.*
from Topics T inner join Topics_Tags R on T.id = R.topicFK
              inner join Tags U on U.id = R.tagFK
where U.name in ('d', 'e', 'f')
Pablo
MySQL 5.1 does not have a 'minus' keyword
Marcel
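Since MINUS is not available there, the same set difference can be expressed with NOT EXISTS. This is only a minimal sketch, assuming the schema from the question (topics(id), tags(id, name), topic_tags(topicFk, tagFk)). Like the MINUS query above, it keeps topics that have at least one of the desired tags, not necessarily all of them.

```sql
-- Topics with at least one desired tag, minus topics with any undesired tag.
-- Assumes the question's schema: topics(id), tags(id, name), topic_tags(topicFk, tagFk).
SELECT DISTINCT T.*
FROM topics T
  INNER JOIN topic_tags R ON T.id = R.topicFk
  INNER JOIN tags U ON U.id = R.tagFk
WHERE U.name IN ('a', 'b', 'c')
  AND NOT EXISTS (
    -- Any row found here means the topic carries an undesired tag.
    SELECT 1
    FROM topic_tags R2
      INNER JOIN tags U2 ON U2.id = R2.tagFk
    WHERE R2.topicFk = T.id
      AND U2.name IN ('d', 'e', 'f')
  );
```

The correlated subquery plays the role of the second SELECT in the MINUS version: a topic survives only if no undesired tag row exists for it.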
+4  A: 

Assuming each (topic, tag) pair in your Topic_Tags table is unique, this answers your exact question, but it may not generalize to your actual problem:

SELECT
  TopicId
FROM Topic_Tags
JOIN Tags ON
  Topic_Tags.TagId = Tags.TagId
WHERE
  Tags.Name IN ('A', 'B', 'C', 'D', 'E', 'F')
GROUP BY
  TopicId
HAVING
  COUNT(*) = 3 
  AND MAX(Tags.Name) = 'C'

A more general solution would be:

SELECT 
    * 
FROM (
    SELECT
        TopicId
    FROM Topic_Tags
    JOIN Tags ON
        Topic_Tags.TagId = Tags.TagId
    WHERE
        Tags.Name IN ('A', 'B', 'C')
    GROUP BY
        TopicId
    HAVING
        COUNT(*) = 3 
) as GoodTags
LEFT JOIN (
    SELECT
        TopicId
    FROM Topic_Tags
    JOIN Tags ON
        Topic_Tags.TagId = Tags.TagId
    WHERE
        Tags.Name = 'D'
        OR Tags.Name = 'E'
        OR Tags.Name = 'F'
) as BadTags ON
    GoodTags.TopicId = BadTags.TopicId
WHERE
    BadTags.TopicId IS NULL
Mark Brackett
+1  A: 

Here's another alternative query. Maybe it's more clear and convenient to have the list of good and bad tags up at the top. I tested this on MySQL 5.0.

SELECT t.*, 
  SUM(CASE WHEN g.name IN ('a', 'b', 'c') THEN 1 ELSE 0 END) AS num_good_tags,
  SUM(CASE WHEN g.name IN ('d', 'e', 'f') THEN 1 ELSE 0 END) AS num_bad_tags
FROM topics AS t
 JOIN topic_tags AS tg ON (t.id = tg.topicFk)
 JOIN tags AS g ON (g.id = tg.tagFk)
GROUP BY t.id
HAVING num_good_tags = 3 AND num_bad_tags = 0;
Bill Karwin
Easier: sum(if(g.name in ('a','b','c'),1,0)). Even easier: sum(g.name in ('a','b','c')).
ysth
@ysth: Yes, you're right. I like to use standard SQL predicates where possible (IF is not standard SQL), and I like to be explicit about 1 versus 0 instead of relying on boolean expressions being equal to integer values.
Bill Karwin
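For comparison, the shorter form suggested in the comments might look like the sketch below. It relies on MySQL evaluating a comparison to 1 or 0, so it is less portable than the CASE version. It assumes the schema from the question, and that each (topicFk, tagFk) pair and each tag name appears only once.

```sql
-- MySQL-specific: g.name IN (...) evaluates to 1 or 0, so SUM() counts the matches.
SELECT t.id
FROM topics AS t
  JOIN topic_tags AS tg ON t.id = tg.topicFk
  JOIN tags AS g ON g.id = tg.tagFk
GROUP BY t.id
HAVING SUM(g.name IN ('a', 'b', 'c')) = 3
   AND SUM(g.name IN ('d', 'e', 'f')) = 0;
```

Selecting only t.id also avoids depending on how the server treats non-aggregated columns under GROUP BY.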
A: 

My own solution using Paul's and Bill's ideas.

The idea is to inner join topics with good tags (to throw out topics with no good tags) and then count the unique tags for each topic (to verify that all the good tags are present).

At the same time, an outer join with the bad tags should not produce a single match (all of its fields are NULL).

SELECT topics.id
FROM topics
  INNER JOIN topic_tags topic_ptags
    ON topics.id = topic_ptags.topicFk
  INNER JOIN tags ptags
    ON topic_ptags.tagFk = ptags.id
      AND ptags.name IN ('a','b','c')
  LEFT JOIN topic_tags topic_ntags
    ON topics.id = topic_ntags.topicFk
  LEFT JOIN tags ntags
    ON topic_ntags.tagFk = ntags.id
      AND ntags.name IN ('d','e','f')
GROUP BY topics.id
HAVING count(DISTINCT ptags.id) = 3
  AND count(ntags.id) = 0
Marcel
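To try any of the queries in this thread, a small data set along these lines may help. It is a hypothetical sketch: the table layout follows the question, but the ids, the column types and the extra tag 'g' are made up for illustration.

```sql
-- Hypothetical sample data; ids, types and the tag 'g' are assumptions.
CREATE TABLE topics (id INT PRIMARY KEY);
CREATE TABLE tags (id INT PRIMARY KEY, name VARCHAR(20) NOT NULL UNIQUE);
CREATE TABLE topic_tags (
  topicFk INT NOT NULL,
  tagFk   INT NOT NULL,
  PRIMARY KEY (topicFk, tagFk)
);

INSERT INTO tags (id, name) VALUES
  (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e'), (6, 'f'), (7, 'g');
INSERT INTO topics (id) VALUES (1), (2), (3), (4);

-- Topic 1: a, b, c     (all good tags, no bad tag)
-- Topic 2: a, b, c, d  (all good tags, but one bad tag)
-- Topic 3: a, b        (missing the good tag c)
-- Topic 4: a, b, c, g  (all good tags; g is neither good nor bad)
INSERT INTO topic_tags (topicFk, tagFk) VALUES
  (1, 1), (1, 2), (1, 3),
  (2, 1), (2, 2), (2, 3), (2, 4),
  (3, 1), (3, 2),
  (4, 1), (4, 2), (4, 3), (4, 7);
```

A query that meets the requirement in the question (all of a, b, c and none of d, e, f) should return topics 1 and 4 for this data.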