I'm implementing a tag system similar to StackOverflow's. When storing tags and relating them to a question, should the relationship reference the tag name directly, or is it better to create a tagID field to "link" the question to the tag? Linking directly to the tag name looks easier, but it doesn't feel right, mainly because working with statistics and/or tag categorization could (IMHO) become hard to manage. Another problem arises when an admin decides to "fix" a tag name. If there is no tagID separate from the tag name, I would be changing the key of the table...

What are your thoughts?

Thanks for all the replies. I will delete this post since there are other posts on the same subject. I wonder why the search and the suggestions didn't show those results for me...

A: 

If you foresee many tags and are using a relational database, an ID that the database supports natively (e.g. RID) may give you better performance internally.

If that's not a concern, go with simple short tag names. You can also give tags long names, displayed in the user interface where it makes sense (e.g. ask the user for one when creating a new tag). You are more likely to have to edit the long names, which nothing refers to directly, so editing them is not a problem.

As an aside, if you are using a relational database, it is probably not very difficult to change a tag name together with all its references using a simple query. It may be a slightly more expensive operation, but it is probably not going to be done frequently enough that you need to optimize for it. Also consider that you may have duplicate tags that you will want to merge, so you might want to be able to do that anyway.
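
To make the rename/merge point concrete, here is a rough sketch using Python's built-in sqlite3 module. It assumes a schema where the junction table stores the tag name itself; the table name `question_tag`, its columns, and the helper `merge_tag` are made up for illustration and are not from the question. It relies on SQLite's `UPDATE OR IGNORE`, so other databases would need a different way to skip rows that would become duplicates.

```python
import sqlite3

# Hypothetical schema: the tag name itself is part of the key.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE question_tag (
        question_id INTEGER NOT NULL,
        tag_name    TEXT    NOT NULL,
        PRIMARY KEY (question_id, tag_name)
    )
""")
conn.executemany(
    "INSERT INTO question_tag VALUES (?, ?)",
    [(1, "js"), (2, "js"), (2, "javascript"), (3, "javascript")],
)

def merge_tag(conn, old, new):
    """Rename `old` to `new` everywhere; also works as a merge when `new` already exists."""
    with conn:  # run both statements in one transaction
        # Re-point references; rows that would duplicate an existing
        # (question_id, new) pair are skipped instead of raising an error.
        conn.execute(
            "UPDATE OR IGNORE question_tag SET tag_name = ? WHERE tag_name = ?",
            (new, old),
        )
        # Drop the skipped leftovers that still carry the old name.
        conn.execute("DELETE FROM question_tag WHERE tag_name = ?", (old,))

merge_tag(conn, "js", "javascript")
rows = conn.execute(
    "SELECT question_id, tag_name FROM question_tag ORDER BY question_id"
).fetchall()
```

The update touches every referencing row, which is the "slightly more expensive operation" mentioned above, but it stays a two-statement job.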

Tom Alsberg
Asking users to provide both a short and a long version when creating a tag will just lead to confusion. Best to stick with either just the tag name or an artificial key, as you suggested.
Macka
I agree that it may lead to confusion; this depends on the site and the users it is geared towards. It can be optional and only needs to be done at tag creation time, so for the benefit of user-friendly names it may be worthwhile.
Tom Alsberg
+3  A: 

The last sentence of your question seems to answer it. Assuming the tags are stored in a tag table, I would always have an ID column (int or GUID) and a varchar/string column for the tag name. The many-to-many junction table that relates some other entity to one or more tags would have two columns: the ID of the "other entity" and the tag's ID. It's then easy to edit a tag (to correct a misspelling, for example) without touching the key. You should generally get better performance in queries that join through the junction table, and it also means your data is better normalized.

Remember, "the key, the whole key and nothing but the key, so help me Codd"! :)
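
A minimal sketch of that layout, using Python's built-in sqlite3 module so it can be run as-is. The table and column names (`question`, `tag`, `question_tag`), the sample rows, and the helper `tags_for` are assumptions for illustration, not something from the question.

```python
import sqlite3

def build_schema(conn):
    # Tag table with a surrogate ID, plus a junction table that stores only IDs.
    conn.executescript("""
        CREATE TABLE question (
            id    INTEGER PRIMARY KEY,
            title TEXT NOT NULL
        );
        CREATE TABLE tag (
            id   INTEGER PRIMARY KEY,       -- surrogate key, never edited
            name TEXT NOT NULL UNIQUE       -- display name, free to change
        );
        CREATE TABLE question_tag (
            question_id INTEGER NOT NULL REFERENCES question(id),
            tag_id      INTEGER NOT NULL REFERENCES tag(id),
            PRIMARY KEY (question_id, tag_id)
        );
    """)

def tags_for(conn, question_id):
    """Return the tag names attached to one question, sorted by name."""
    rows = conn.execute(
        """SELECT t.name
           FROM tag t
           JOIN question_tag qt ON qt.tag_id = t.id
           WHERE qt.question_id = ?
           ORDER BY t.name""",
        (question_id,),
    )
    return [r[0] for r in rows]

conn = sqlite3.connect(":memory:")
build_schema(conn)
conn.execute("INSERT INTO question (id, title) VALUES (1, 'How do I model tags?')")
conn.executemany("INSERT INTO tag (id, name) VALUES (?, ?)",
                 [(10, "databse"), (11, "tagging")])   # note the misspelling
conn.executemany("INSERT INTO question_tag VALUES (?, ?)", [(1, 10), (1, 11)])

# Fixing the misspelling is a single-row update; the junction table is untouched.
conn.execute("UPDATE tag SET name = 'database' WHERE id = 10")
```

After the update, `tags_for(conn, 1)` picks up the corrected name through the join, while the rows in `question_tag` still hold the same IDs.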

CraigTP
+6  A: 

Have a look at these related earlier SO questions:

Ash
Thanks, I'm thinking of deleting this post.
Click Ok
No need to delete it. The links in the answer are useful.
Atømix