Hello,
me and my colleagues are developing a website with similar idea as Stackoverflow, but for submitting tasks (and for internal use). This morning we talked about tagging tasks and couldn't really figure which option would be the fastest one, or if we aren't missing something.
Let's imagine table with tags, which would dynamically update, depending on users. Users can create any tags and they'd be added to this table. Structure is following:
- ID
- name
- count
I'll get to actual point now. If you click on, for example, tag "PHP", it'd show you another page with all tasks tagged with "PHP". Something similar to this page. Important thing is this list of related tags. How to represent it in database?
Two options came to our minds, but I don't think any of them is actually the most effective one.
Make select of all tasks with "PHP" tag and check what other tags they contain. In few years we might get an answer from server.
Make a table with cols tag, related tag, count where would be all possible tag relationships. Only problem we see is duplicity. We could have tag PHP and related tag DB2, but we could also have tag DB2 with related tag PHP, which is of course the very same relationship, with the very same count.
I actually quite like option #2, but without duplicity. Perhaps option where there wouldn't be so close relationship between tags (as if there weren't any "primary" and "secondary" tags) could work the best. I'm not really very sure at this point and I wouldn't like to model something that wouldn't work in the future or would be way too slow if there were, for example, one million tags.
We will use PHP and mySQL or DB2, but I guess that doesn't matter.
So, the actual questions is: Are there any other, possibly better options? In case of any questions, just ask me.
Thanks in advance.