I developed a site that uses tags (keywords) to categorize photographs. Right now, my MySQL database has a table with the following structure:

image_id (int)
tag      (varchar(32))

Every time someone tags an image, the tag is added to this table (provided the tag is valid and has enough votes). I don't think this is the optimal way of doing things: with 5000+ tagged images, the tags table already has over 40,000 rows. I fear that this will begin to affect performance (if it isn't already).

I considered the following alternative structure, thinking it would be faster to fetch the tags associated with a particular image. But it looks horrible when I want to get all the tags, or the most popular one, for instance:

image_id (int)
tags     (text) //comma delimited list of tags for the image
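
To make the trade-off concrete, here is a rough sketch of the "most popular tag" lookup against both layouts. It is only an illustration: it uses Python's built-in sqlite3 module with an in-memory database instead of MySQL so it runs on its own, and the table names `image_tags` and `image_tags_csv` plus the sample rows are made up for the example.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")

# Layout 1: one row per (image, tag) pair, like the current table.
conn.execute("CREATE TABLE image_tags (image_id INTEGER, tag VARCHAR(32))")
# Layout 2: one row per image holding a comma-delimited list of tags.
conn.execute("CREATE TABLE image_tags_csv (image_id INTEGER, tags TEXT)")

# Made-up sample data, identical information in both layouts.
conn.executemany(
    "INSERT INTO image_tags VALUES (?, ?)",
    [(1, "sunset"), (1, "beach"), (2, "sunset"), (3, "city")],
)
conn.executemany(
    "INSERT INTO image_tags_csv VALUES (?, ?)",
    [(1, "sunset,beach"), (2, "sunset"), (3, "city")],
)

# Layout 1: the database does the counting in a single query.
most_popular_rows = conn.execute(
    "SELECT tag, COUNT(*) AS uses FROM image_tags "
    "GROUP BY tag ORDER BY uses DESC, tag LIMIT 1"
).fetchone()

# Layout 2: every row has to be fetched and split in application code.
counts = Counter()
for (tags,) in conn.execute("SELECT tags FROM image_tags_csv"):
    counts.update(t for t in tags.split(",") if t)
most_popular_csv = counts.most_common(1)[0]
```

Both variants arrive at the same tag, but the comma-delimited one has to read the whole table and parse strings outside the database to get there.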

Is there a correct way of doing this or are both ways more or less the same? Thoughts?

Thanks in advance for all your support

+1  A: 

Look over in these questions/posts

ShiVik
+1: Figured there had to be a duplicate, thx
OMG Ponies
A: 

You can make a tags table that holds just an id and a tag, with a unique constraint on tag, and then a photo_tags table that holds tag_id and photo_id. Insert a tag into the tags table only if it doesn't already exist.

Queries such as "how many photos are tagged with a certain tag" will then look rows up by the tag's primary key instead of doing a varchar text comparison.
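
A minimal sketch of that idea, assuming SQLite through Python's built-in sqlite3 module rather than MySQL so that it runs standalone. The helper names `tag_photo` and `count_photos` are hypothetical, and `INSERT OR IGNORE` is the SQLite spelling (MySQL's equivalent is `INSERT IGNORE`).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tags (
    id  INTEGER PRIMARY KEY,
    tag VARCHAR(32) NOT NULL UNIQUE      -- each tag text is stored once
);
CREATE TABLE photo_tags (
    tag_id   INTEGER NOT NULL REFERENCES tags(id),
    photo_id INTEGER NOT NULL,
    PRIMARY KEY (tag_id, photo_id)       -- a photo gets a given tag at most once
);
""")

def tag_photo(conn, photo_id, tag):
    """Attach `tag` to a photo, creating the tag row only if it is missing."""
    # The unique constraint turns a duplicate insert into a no-op.
    conn.execute("INSERT OR IGNORE INTO tags (tag) VALUES (?)", (tag,))
    (tag_id,) = conn.execute(
        "SELECT id FROM tags WHERE tag = ?", (tag,)
    ).fetchone()
    conn.execute(
        "INSERT OR IGNORE INTO photo_tags (tag_id, photo_id) VALUES (?, ?)",
        (tag_id, photo_id),
    )
    return tag_id

def count_photos(conn, tag_id):
    """Count photos carrying a tag, filtering on the integer key only."""
    (n,) = conn.execute(
        "SELECT COUNT(*) FROM photo_tags WHERE tag_id = ?", (tag_id,)
    ).fetchone()
    return n

sunset_id = tag_photo(conn, 1, "sunset")
tag_photo(conn, 2, "sunset")
tag_photo(conn, 2, "beach")
```

After these calls, `count_photos(conn, sunset_id)` counts the photos tagged "sunset" without touching the tag text at all.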

Matt Williamson
+3  A: 

Use a many-to-many table to link a TAG record to an IMAGE record:

IMAGE

DROP TABLE IF EXISTS `example`.`image`;
CREATE TABLE  `example`.`image` (
  `image_id` int(10) unsigned NOT NULL auto_increment,
  PRIMARY KEY  (`image_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

TAG

DROP TABLE IF EXISTS `example`.`tag`;
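CREATE TABLE  `example`.`tag` (
 `tag_id` int(10) unsigned NOT NULL auto_increment,
 `description` varchar(45) NOT NULL default '',
 PRIMARY KEY  (`tag_id`),
 UNIQUE KEY `tag_description_uq` (`description`) -- store each tag text only once (index name is illustrative)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;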

IMAGE_TAG_MAP

DROP TABLE IF EXISTS `example`.`image_tag_map`;
CREATE TABLE  `example`.`image_tag_map` (
 `image_id` int(10) unsigned NOT NULL default '0',
 `tag_id` int(10) unsigned NOT NULL default '0',
 PRIMARY KEY  (`image_id`,`tag_id`),
 KEY `tag_fk` (`tag_id`),
 CONSTRAINT `image_fk` FOREIGN KEY (`image_id`) REFERENCES `image` (`image_id`),
 CONSTRAINT `tag_fk` FOREIGN KEY (`tag_id`) REFERENCES `tag` (`tag_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
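
For the two lookups mentioned in the question (the tags of one image, and the most popular tag), queries against these three tables could look roughly like the sketch below. It is not MySQL: it recreates a simplified version of the schema in SQLite through Python's built-in sqlite3 module so it can run standalone, and the sample rows are made up. The SELECT statements themselves should carry over to MySQL largely unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE image (image_id INTEGER PRIMARY KEY);
CREATE TABLE tag (
    tag_id      INTEGER PRIMARY KEY,
    description VARCHAR(45) NOT NULL
);
CREATE TABLE image_tag_map (
    image_id INTEGER NOT NULL REFERENCES image(image_id),
    tag_id   INTEGER NOT NULL REFERENCES tag(tag_id),
    PRIMARY KEY (image_id, tag_id)
);
CREATE INDEX tag_fk ON image_tag_map (tag_id);

-- Made-up sample data.
INSERT INTO image (image_id) VALUES (1), (2), (3);
INSERT INTO tag (tag_id, description)
    VALUES (1, 'sunset'), (2, 'beach'), (3, 'city');
INSERT INTO image_tag_map (image_id, tag_id)
    VALUES (1, 1), (1, 2), (2, 1), (3, 3);
""")

# All tags of one image: join the map table to the tag table.
tags_for_image_1 = [
    row[0]
    for row in conn.execute(
        "SELECT t.description FROM tag t "
        "JOIN image_tag_map m ON m.tag_id = t.tag_id "
        "WHERE m.image_id = ? ORDER BY t.description",
        (1,),
    )
]

# Most popular tag: count map rows per tag and keep the largest group.
most_popular = conn.execute(
    "SELECT t.description, COUNT(*) AS uses FROM image_tag_map m "
    "JOIN tag t ON t.tag_id = m.tag_id "
    "GROUP BY t.tag_id, t.description "
    "ORDER BY uses DESC, t.description LIMIT 1"
).fetchone()
```

Both queries filter and group on integer keys that are covered by the primary key or the `tag_fk` index; the tag text is only read for the rows that end up in the result.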
OMG Ponies
A comma-separated list is denormalized data; the `IMAGE_TAG_MAP` table holds the same information in normalized form.
OMG Ponies