Hello, guys! I have a problem and I dont know what is better solution. Okay, I have 2 tables: posts(id, title), posts_tags(post_id, tag_id). I have next task: must select posts with tags ids for example 4, 10 and 11. Not exactly, post could have any other tags at the same time. So, how I could do it more optimized? Creating temporary table in each query? Or may be some kind of stored procedure? In the future, user could ask script to select posts with any count of tags (it could be 1 tag only or 10 at the same time) and I must be sure that method that I will choose would be the best method for my problem. Sorry for my english, thx for attention.
select id, title
from posts p, tags t
where p.id = t.post_id
and tag_id in ( 4,10,11 ) ;
?
Does this work?
select *
from posts
where post.post_id in
(select post_id
from post_tags
where tag_id = 4
and post_id in (select post_id
from post_tags
where tag_id = 10
and post_id in (select post_id
from post_tags
where tag_id = 11)))
This solution assumes that (post_id, tag_id) in post_tags is enforced to be UNIQUE:
SELECT id, title FROM posts
INNER JOIN post_tag ON post_tag.post_id = posts.id
WHERE tag_id IN (4, 6, 10)
GROUP BY id, title
HAVING COUNT(*) = 3
Although it's not a solution for all possible tag combinations, it's easy to create as dynamic SQL. To change for other sets of tags, change the IN () list to have all the tags, and the COUNT(*) = to check for the number of tags specified. The advantage of this solution over cascading a bunch of JOINs together is that you don't have to add JOINs, or even extra WHERE terms, when you change the request.
You can do a time-storage trade-off by storing a one-way hash of the post's tag names sorted alphabetically.
When a post is tagged, execute select t.name from tags t inner join post_tags pt where pt.post_id = [ID_of_tagged_post] order by t.name
. Concatenate all of the tag names, create a hash using the MD5 algorithm and insert the value into a column alongside your post (or into another table joined by a foreign key, if you prefer).
When you want to search for a specific combination of tags, simply execute (remembering to sort the tag names) select from posts p where p.taghash = MD5([concatenated_tag_string])
.
This selects all posts that have any of the tags (4, 10, 11):
select distinct id, title from posts
where exists (
select * from posts_tags
where
post_id = id and
tag_id in (4, 10, 11))
Or you can use this:
select distinct id, title from posts
join posts_tags on post_id = id
where tag_id in (4, 10, 11)
(Both will be optimized the same way).
This selects all posts that have all of the tags (4, 10, 11):
select distinct id, title from posts
where not exists (
select * from posts_tags t1
where
t1.tag_id in (4, 10, 11) and
not exists (
select * from posts_tags as t2
where
t1.tag_id = t2.tag_id and
id = t2.post_id))
The list of tags in the in
clause is what dynamically changes (in all cases).
But, this last query is not really fast, so you could use something like this instead:
create temporary table target_tags (tag_id int);
insert into target_tags values(4),(10),(11);
select id, title from posts
join posts_tags on post_id = id
join target_tags on target_tags.tag_id = posts_tags.tag_id
group by id, title
having count(*) = (select count(*) from target_tags);
drop table target_tags;
The part that changes dynamically is now in the second statement (the insert).