Hello,

I have the following tables:

CREATE TABLE IF NOT EXISTS `tags` (
  `tag_id` int(11) NOT NULL auto_increment,
  `tag_text` varchar(255) NOT NULL,
  PRIMARY KEY  (`tag_id`)
) ENGINE=InnoDB  DEFAULT CHARSET=latin1 AUTO_INCREMENT=9 ;


CREATE TABLE IF NOT EXISTS `users` (
  `user_id` int(11) NOT NULL auto_increment,
  `user_display_name` varchar(128) default NULL,
  PRIMARY KEY  (`user_id`)
) ENGINE=InnoDB  DEFAULT CHARSET=latin1 AUTO_INCREMENT=10 ;

CREATE TABLE IF NOT EXISTS `user_post_tag` (
  `upt_id` int(11) NOT NULL auto_increment,
  `upt_user_id` int(11) NOT NULL,
  `upt_post_id` int(11) NOT NULL,
  `upt_tag_id` int(11) NOT NULL,
  PRIMARY KEY  (`upt_id`),
  KEY `upt_user_id` (`upt_user_id`),
  KEY `upt_post_id` (`upt_post_id`),
  KEY `upt_tag_id` (`upt_tag_id`)
) ENGINE=InnoDB  DEFAULT CHARSET=latin1 AUTO_INCREMENT=9 ;

CREATE TABLE IF NOT EXISTS `view_post` (
`post_id` int(11)
,`post_url` varchar(255)
,`post_text` text
,`post_title` varchar(255)
,`post_date` datetime
,`user_id` int(11)
,`user_display_name` varchar(128)
);

The idea is that I would like to use the most effective way to save tags for posts and users. When I add a post, I pass a few tags along with that post and user. Later I would like to be able to count tags for each user and post. Something very similar to Stack Overflow.

I suppose that the 'tag_text' should be unique? Is it effective to run a function each time I submit a new post that goes through the 'tags' table to check if a tag already exists and, if so, returns its 'tag_id' so I can insert it into the 'user_post_tag' table?

Or is this maybe a bad approach to this kind of problem?

All suggestions are welcome.

A: 

Hmmm, if your tags are all unique, then you don't need both tag_id and tag_text in the tags table. Just use tag_text and make it the primary key. Then look at REPLACE INTO (http://dev.mysql.com/doc/refman/5.0/en/replace.html) to handle new tags.

Associating tags with users or posts? Use a user_tags table and a post_tags table, with no auto-increment values, just a compound key on user_id and tag_text, or on post_id and tag_text. I don't know whether you're using the user_post_tag table to get a performance increase over joining a post_tags table with posts and users. Still, REPLACE INTO should be your friend here too.
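
A rough sketch of what that layout might look like. This is an alternative to the question's `tags` table rather than an addition to it, and the column sizes, engine settings and sample values (post 42, user 1, tag 'mysql') are assumptions for illustration only:

```sql
-- Alternative schema: the tag text itself is the key, no numeric tag_id.
CREATE TABLE IF NOT EXISTS `tags` (
  `tag_text` varchar(255) NOT NULL,
  PRIMARY KEY (`tag_text`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

-- One row per (post, tag); the compound key prevents duplicates.
CREATE TABLE IF NOT EXISTS `post_tags` (
  `post_id` int(11) NOT NULL,
  `tag_text` varchar(255) NOT NULL,
  PRIMARY KEY (`post_id`, `tag_text`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

-- One row per (user, tag).
CREATE TABLE IF NOT EXISTS `user_tags` (
  `user_id` int(11) NOT NULL,
  `tag_text` varchar(255) NOT NULL,
  PRIMARY KEY (`user_id`, `tag_text`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

-- Saving a tag for a post and its author; re-running these is harmless
-- because REPLACE overwrites the row with the same key.
REPLACE INTO `tags` (`tag_text`) VALUES ('mysql');
REPLACE INTO `post_tags` (`post_id`, `tag_text`) VALUES (42, 'mysql');
REPLACE INTO `user_tags` (`user_id`, `tag_text`) VALUES (1, 'mysql');
```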

Mark Moline
I'd recommend sticking to numeric keys. If you make the text field into a unique key, the `REPLACE INTO` trick will still work, and it will make renaming an entire tag much much easier.
nickf
+2  A: 

Yes, what you are doing is the best way to do it. You created an n to m relationship, as a post can have multiple tags and the same tag can be on multiple posts. You do not want to store the tag name for each of the posts, so you store the id.

But you should NOT have the redundancy of storing the same tag_id many times for the same user. It will hit your server hard if users have many tags and you have to execute SELECT count(...) for each of those tags. To see what I mean: right now, how would you find out how many times user A has used tag B? You'd have to do SELECT count(*) FROM user_post_tag INNER JOIN tags ON (...) WHERE user_id=A and tag_id=B.
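
Spelled out against the question's actual column names, the counting queries might look roughly like this. It is only a sketch: the ids 1 and 2 are placeholders standing in for user A and tag B.

```sql
-- How many times has user 1 used tag 2?
-- Every matching row in user_post_tag has to be found and counted.
SELECT COUNT(*) AS times_used
FROM `user_post_tag`
WHERE `upt_user_id` = 1
  AND `upt_tag_id` = 2;

-- All tags of user 1 with their usage counts, again counted on every read.
SELECT t.`tag_text`, COUNT(*) AS times_used
FROM `user_post_tag` AS upt
INNER JOIN `tags` AS t ON t.`tag_id` = upt.`upt_tag_id`
WHERE upt.`upt_user_id` = 1
GROUP BY t.`tag_id`, t.`tag_text`
ORDER BY times_used DESC;
```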

My suggestion is to split user_post_tag into two tables:

  1. user_tags, to count how many times the user has used each tag. The primary key would be (user_id, tag_id), plus a count field that you just update with count=count+1 every time this user makes a new post with the tag. This way, you can simply do SELECT tag_text, count FROM user_tags INNER JOIN tags ON (...) WHERE user_id=A to select all tags (with the number of times used) of a given user. That is a fully indexed query. You're not asking MySQL to go over the table, look for a bunch of rows and count them; you're telling MySQL to go to this row in this table and that row in the other table, join them and hand back the result, fast!
  2. post_tags, to store the tags a certain post has. The primary key would be (post_id, tag_id), with no additional fields needed.
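
The two tables above could be sketched like this. The column types and engine settings simply mirror the question's tables, and the sample ids (user 1, post 42, tag 2) are made up, so treat it as an illustration rather than a finished schema:

```sql
-- Running counter per (user, tag).
CREATE TABLE IF NOT EXISTS `user_tags` (
  `user_id` int(11) NOT NULL,
  `tag_id` int(11) NOT NULL,
  `count` int(11) NOT NULL default 1,
  PRIMARY KEY (`user_id`, `tag_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

-- Plain link table: which tags a post has.
CREATE TABLE IF NOT EXISTS `post_tags` (
  `post_id` int(11) NOT NULL,
  `tag_id` int(11) NOT NULL,
  PRIMARY KEY (`post_id`, `tag_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

-- User 1 creates post 42 with tag 2.
INSERT INTO `post_tags` (`post_id`, `tag_id`) VALUES (42, 2);

-- First use inserts count = 1; later uses hit the primary key and increment.
INSERT INTO `user_tags` (`user_id`, `tag_id`, `count`) VALUES (1, 2, 1)
  ON DUPLICATE KEY UPDATE `count` = `count` + 1;

-- All tags of user 1 with their counts, with no COUNT(*) at read time.
SELECT t.`tag_text`, ut.`count`
FROM `user_tags` AS ut
INNER JOIN `tags` AS t ON t.`tag_id` = ut.`tag_id`
WHERE ut.`user_id` = 1;
```

The trade-off is that the counter has to be kept in sync by hand: if posts or tags can be removed, the matching `count` needs a decrement as well.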

I suppose that the 'tag_text' should be unique? Is it effective that I run a function each time I submit a new post to go through the 'tags' table to check if a tag already exists, and if yes, return its 'tag_id' so I can insert it into 'user_post_tag' table.

Yes, it should be unique. It's much better to check whether a tag exists before inserting, and insert it only if it doesn't, than to have redundancy and need SELECT ... count(*) to find out how many times the tag has been used. Post creation will be much less frequent than post selection, so if you have to choose between being query-intensive on insertion or on selection, certainly pick insertion.
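
The check-then-insert step can probably be folded into a single statement. A minimal sketch, assuming a UNIQUE key is added on tag_text (the key name `uq_tag_text` and the sample tag 'mysql' are invented for illustration):

```sql
-- Enforce uniqueness of the tag text while keeping the numeric tag_id.
ALTER TABLE `tags` ADD UNIQUE KEY `uq_tag_text` (`tag_text`);

-- Insert the tag if it is new. If it already exists, the no-op update
-- passes the existing tag_id through LAST_INSERT_ID(expr), so the id
-- can be read back the same way in both cases.
INSERT INTO `tags` (`tag_text`) VALUES ('mysql')
  ON DUPLICATE KEY UPDATE `tag_id` = LAST_INSERT_ID(`tag_id`);

-- The tag_id to store in the linking table.
SELECT LAST_INSERT_ID() AS tag_id;
```

One caveat about the REPLACE INTO suggested in the other answer: REPLACE deletes the conflicting row and inserts a new one, so with an auto_increment tag_id an existing tag would come back with a new id and rows still pointing at the old id would no longer match. INSERT ... ON DUPLICATE KEY UPDATE leaves the existing row in place.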

By the way, if you'd like to have a count of how many posts have the same tag, like on Stack Overflow, you'd need another table with primary key tag_id, and then, as with user_tags, you increment the count field every time a post gets a certain tag.
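
A sketch of that extra table. The name `tag_counts` and the sample tag id 2 are hypothetical; nothing in the question defines them:

```sql
-- One counter row per tag.
CREATE TABLE IF NOT EXISTS `tag_counts` (
  `tag_id` int(11) NOT NULL,
  `count` int(11) NOT NULL default 0,
  PRIMARY KEY (`tag_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

-- Run once for each tag attached to a new post.
INSERT INTO `tag_counts` (`tag_id`, `count`) VALUES (2, 1)
  ON DUPLICATE KEY UPDATE `count` = `count` + 1;

-- The ten most used tags, Stack Overflow style.
SELECT t.`tag_text`, tc.`count`
FROM `tag_counts` AS tc
INNER JOIN `tags` AS t ON t.`tag_id` = tc.`tag_id`
ORDER BY tc.`count` DESC
LIMIT 10;
```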

Clash