views:

2452

answers:

1

I have 2 tables:

  • posts
  • tags

Tags table is structured like this:

  • post_id
  • tag

So for every tag that's given for a post, I create a record in the tags table. If a post has 10 tags, there will be 10 records in tags table with that post_id.

I'm now trying to build a search page where users can do a search for posts where tags do not contain the given keywords. This creates a problem though. A query like:

SELECT DISTINCT posts.id, posts.title, posts.content
   FROM jobs, tags
   WHERE tags.tag NOT LIKE '%$keywords%' AND posts.id=tags.post_id

doesn't work because if a post has got 6 tags and one of them has got the keyword, it will still be returned because the other 5 records in the tags table don't have that keyword.

What's the best way to solve this? Any way other than creating a new column in the posts table which stores all the comma-separated tags used only for search??

+4  A: 

Make sure you have indexes or this will perform very badly:

SELECT posts.id, posts.title, posts.content
FROM posts 
WHERE NOT EXISTS (
  SELECT post_id from tags
  WHERE tags.tag LIKE '%$keywords%' 
    AND posts.id=tags.post_id
)

This gets a list of all posts, excluding those that have a tag matching the tag you specified. (Your orginal query referenced a 'jobs' table. I assumed that was a typo for 'posts'.)

Table aliases make this a little cleaner:

SELECT p.id, p.title, p.content
FROM posts p
WHERE NOT EXISTS (
  SELECT t.post_id from tags t
  WHERE t.tag LIKE '%$keywords%' 
    AND p.id=t.post_id
)

Then, I'd create these indexes:

Posts: id, tag_id
Tags: post_id, tag

Then, run your query with 'explain' to see if it's performing well. Update your question with the results and someone will offer further advice. Index tuning is more trial and error than anything else so testing really is necessary.

Michael Haren
I was just in the process of posting that... :(
Jon
Should I create index on the tag column?
Click Upvote
An index on the tag column will not help as you are text-searching all records. You need indexes on your post_id columns.
ck
I'd create these indexes: Posts{id, tag_id}; Tags{post_id, tag}. Then, run your query with 'explain plan' to see if it's performing well.
Michael Haren