views:

136

answers:

5
+3  Q: 

many-to-many query

Hello, guys! I have a problem and I dont know what is better solution. Okay, I have 2 tables: posts(id, title), posts_tags(post_id, tag_id). I have next task: must select posts with tags ids for example 4, 10 and 11. Not exactly, post could have any other tags at the same time. So, how I could do it more optimized? Creating temporary table in each query? Or may be some kind of stored procedure? In the future, user could ask script to select posts with any count of tags (it could be 1 tag only or 10 at the same time) and I must be sure that method that I will choose would be the best method for my problem. Sorry for my english, thx for attention.

+1  A: 
select id, title
from posts p, tags t
where p.id = t.post_id
and tag_id in ( 4,10,11 ) ;

?

Randy
It could return posts with tag 4 OR tag 10 OR tag 11. But I need exactly all this three tags in one post.The problem is here :)
A: 

Does this work?

select *
from posts
where post.post_id in
    (select post_id
    from post_tags
    where tag_id = 4
    and post_id in (select post_id
                    from post_tags
                    where tag_id = 10
                    and post_id in (select post_id
                                    from post_tags
                                    where tag_id = 11)))
FrustratedWithFormsDesigner
+3  A: 

This solution assumes that (post_id, tag_id) in post_tags is enforced to be UNIQUE:

 SELECT id, title FROM posts
    INNER JOIN post_tag ON post_tag.post_id = posts.id
    WHERE tag_id IN (4, 6, 10)
    GROUP BY id, title
    HAVING COUNT(*) = 3

Although it's not a solution for all possible tag combinations, it's easy to create as dynamic SQL. To change for other sets of tags, change the IN () list to have all the tags, and the COUNT(*) = to check for the number of tags specified. The advantage of this solution over cascading a bunch of JOINs together is that you don't have to add JOINs, or even extra WHERE terms, when you change the request.

Larry Lustig
+1 For using GROUP BY with HAVING.
Joop
A: 

You can do a time-storage trade-off by storing a one-way hash of the post's tag names sorted alphabetically.

When a post is tagged, execute select t.name from tags t inner join post_tags pt where pt.post_id = [ID_of_tagged_post] order by t.name. Concatenate all of the tag names, create a hash using the MD5 algorithm and insert the value into a column alongside your post (or into another table joined by a foreign key, if you prefer).

When you want to search for a specific combination of tags, simply execute (remembering to sort the tag names) select from posts p where p.taghash = MD5([concatenated_tag_string]).

BigZig
A: 

This selects all posts that have any of the tags (4, 10, 11):

select distinct id, title from posts  
where exists ( 
  select * from posts_tags  
  where  
    post_id = id and 
    tag_id in (4, 10, 11)) 

Or you can use this:

select distinct id, title from posts   
join posts_tags on post_id = id 
where tag_id in (4, 10, 11) 

(Both will be optimized the same way).

This selects all posts that have all of the tags (4, 10, 11):

select distinct id, title from posts
where not exists ( 
  select * from posts_tags t1 
  where 
    t1.tag_id in (4, 10, 11) and
    not exists (
      select * from posts_tags as t2
      where 
        t1.tag_id = t2.tag_id and
        id = t2.post_id))

The list of tags in the in clause is what dynamically changes (in all cases).

But, this last query is not really fast, so you could use something like this instead:

 create temporary table target_tags (tag_id int);
 insert into target_tags values(4),(10),(11);
 select id, title from posts 
   join posts_tags on post_id = id 
   join target_tags on target_tags.tag_id = posts_tags.tag_id
   group by id, title 
   having count(*) = (select count(*) from target_tags);
 drop table target_tags;

The part that changes dynamically is now in the second statement (the insert).

Jordão
This will select posts with 1, 2, or 3 of the desired tags, not all three. And it would be more clearly written (and execute faster) if expressed as a JOIN.
Larry Lustig
I've added code to select posts that have all the tags.
Jordão
I've also added the join code for the first case. Although, a decent query optimizer will treat it the same as the query with the exists clause.
Jordão