I have to store a set of related keywords inside a database. As of now, I am thinking of using the following:

To store the keywords themselves:

CREATE TABLE keywords(
   id int(11) AUTO_INCREMENT PRIMARY KEY,
   word VARCHAR(255)
);

To store the relations (stores the ids of the related keywords):

CREATE TABLE relatedkeywords(
   id int(11) AUTO_INCREMENT PRIMARY KEY,
   keyword1 int(11),
   keyword2 int(11),
   FOREIGN KEY (keyword1) REFERENCES keywords(id),
   FOREIGN KEY (keyword2) REFERENCES keywords(id)
);

Is this the convention, or is there a better way of doing this? The only problem I see is that I sometimes need to check both columns to get the related keywords. I might be missing something here.

+3  A: 

If "relatedness" is a property of a pair of keywords, this schema is OK (don't forget to add UNIQUE(keyword1, keyword2))
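A minimal sketch of what the UNIQUE constraint buys you, using Python's built-in `sqlite3` as a stand-in for MySQL (an assumption; the column types are adapted slightly). Note that UNIQUE alone would still accept the reversed pair (2, 1) after (1, 2), so the hypothetical helper `add_relation` stores the smaller id first. That ordering rule is an assumption of this sketch, not something the schema enforces by itself.

```python
import sqlite3

# SQLite stands in for MySQL here (assumption); the constraint reads the same way.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE keywords(
   id INTEGER PRIMARY KEY,
   word VARCHAR(255)
);
CREATE TABLE relatedkeywords(
   id INTEGER PRIMARY KEY,
   keyword1 INT,
   keyword2 INT,
   FOREIGN KEY (keyword1) REFERENCES keywords(id),
   FOREIGN KEY (keyword2) REFERENCES keywords(id),
   UNIQUE (keyword1, keyword2)
);
INSERT INTO keywords(id, word) VALUES (1, 'sql'), (2, 'database');
""")

def add_relation(a, b):
    """Hypothetical helper: store the pair with the smaller id first,
    so (2, 1) and (1, 2) end up as the same row."""
    lo, hi = sorted((a, b))
    try:
        conn.execute(
            "INSERT INTO relatedkeywords(keyword1, keyword2) VALUES (?, ?)",
            (lo, hi),
        )
        return True
    except sqlite3.IntegrityError:
        # The UNIQUE constraint rejected a pair that is already stored.
        return False

first = add_relation(1, 2)          # new pair, inserted
duplicate = add_relation(1, 2)      # same pair again, rejected
reversed_pair = add_relation(2, 1)  # normalized to (1, 2), rejected
row_count = conn.execute("SELECT COUNT(*) FROM relatedkeywords").fetchone()[0]
```

Without the ordering step, the reversed pair would be stored as a second row, since (2, 1) and (1, 2) are different values as far as the constraint is concerned.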

If "relatedness" can span a set of keywords, and a set of related keywords may have additional properties, you may want to add a new table "Related_Set" and an M:N relationship "Keyword_Set" between keywords and sets.

If a set doesn't have any additional properties, you can get by with just the "Keyword_Set" table.
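A rough sketch of the set-based layout, again using Python's `sqlite3` as a stand-in for MySQL (an assumption). The lower-cased table names, the column names, and the `label` property are all hypothetical; the answer only names the tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE keywords(
   id INTEGER PRIMARY KEY,
   word VARCHAR(255)
);
-- One row per set; 'label' is a hypothetical additional property.
CREATE TABLE related_set(
   id INTEGER PRIMARY KEY,
   label VARCHAR(255)
);
-- M:N link: a keyword may belong to many sets, a set holds many keywords.
CREATE TABLE keyword_set(
   keyword_id INT,
   set_id INT,
   PRIMARY KEY (keyword_id, set_id),
   FOREIGN KEY (keyword_id) REFERENCES keywords(id),
   FOREIGN KEY (set_id) REFERENCES related_set(id)
);
INSERT INTO keywords(id, word)
   VALUES (1, 'sql'), (2, 'database'), (3, 'index'), (4, 'python');
INSERT INTO related_set(id, label) VALUES (1, 'storage');
INSERT INTO keyword_set(keyword_id, set_id) VALUES (1, 1), (2, 1), (3, 1);
""")

def related_words(word):
    """Return the other keywords that share at least one set with `word`."""
    rows = conn.execute("""
        SELECT DISTINCT other.word
        FROM keywords AS k
        JOIN keyword_set AS mine   ON mine.keyword_id = k.id
        JOIN keyword_set AS theirs ON theirs.set_id = mine.set_id
        JOIN keywords AS other     ON other.id = theirs.keyword_id
        WHERE k.word = ? AND other.id <> k.id
        ORDER BY other.word
    """, (word,)).fetchall()
    return [r[0] for r in rows]

in_set = related_words("sql")        # shares the 'storage' set with two others
not_in_set = related_words("python") # belongs to no set
```

With this layout, adding a keyword to a set is one row in `keyword_set`, rather than one row per pair as in the two-column table.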

Dmitry
Yea, the UNIQUE is important, good point! +1
o.k.w
I would define the primary key as being *both* the `keyword1` and `keyword2` columns (composite). You're unlikely to search on or use the `id` column; it's info you'd never display to the user or let them reference.
OMG Ponies
Thanks. Most probably I won't be using related sets so the simple one should suffice for now. Thank You.
Legend
+2  A: 

Simplify the second table to:

CREATE TABLE relatedkeywords(
   keyword1 int(11),
   keyword2 int(11),
   FOREIGN KEY (keyword1) REFERENCES keywords(id),
   FOREIGN KEY (keyword2) REFERENCES keywords(id),
   PRIMARY KEY (keyword1, keyword2)
);

as this is one of the cases where an "artificial primary key" just makes little sense and offers no practical usefulness.
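As for having to check both columns: a sketch of one way to do that lookup against the composite-key table, using Python's `sqlite3` as a stand-in for MySQL (an assumption). The helper name `related_ids` and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE keywords(
   id INTEGER PRIMARY KEY,
   word VARCHAR(255)
);
CREATE TABLE relatedkeywords(
   keyword1 INT,
   keyword2 INT,
   FOREIGN KEY (keyword1) REFERENCES keywords(id),
   FOREIGN KEY (keyword2) REFERENCES keywords(id),
   PRIMARY KEY (keyword1, keyword2)
);
INSERT INTO keywords(id, word) VALUES (1, 'sql'), (2, 'database'), (3, 'index');
-- Keyword 1 appears once in each column.
INSERT INTO relatedkeywords(keyword1, keyword2) VALUES (1, 2), (3, 1);
""")

def related_ids(keyword_id):
    """Collect partners of `keyword_id` regardless of which column it sits in."""
    rows = conn.execute("""
        SELECT keyword2 AS related FROM relatedkeywords WHERE keyword1 = ?
        UNION
        SELECT keyword1 FROM relatedkeywords WHERE keyword2 = ?
        ORDER BY related
    """, (keyword_id, keyword_id)).fetchall()
    return [r[0] for r in rows]

partners_of_sql = related_ids(1)
```

The UNION also removes duplicates, so a pair stored in both directions would still yield each partner once.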

Alex Martelli
Thanks. I'll modify that one.
Legend
A: 

Is there a solution with just one table? For example:

create table keywords (
   keywrd varchar (40) not null primary key,
   related_keys_csv varchar(400)  
)
blispr
I think the only problem with this is the case where the related keys exceed varchar(400). I could be mistaken though. One more thing: in the two-table approach, I would search two columns to get the related keywords of any keyword. But here, I would have to search a large chunk of text to get the other related keywords. I guess it just boils down to efficiency in the end :)
Legend
If each keyword has (let's say, on average) 6 related keywords, we can hit all 6 keywords in one select statement, which avoids the need to perform 6 selects. It's more of a dictionary data structure (Python-speak), where each name (your original keyword) has an associated collection of related keywords. Hence it's a collection concept, not a set that has just one element.
blispr
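For comparison, a sketch suggesting the two-table layout can also return all related keywords in a single SELECT by joining back to `keywords`. It uses Python's `sqlite3` as a stand-in for MySQL (an assumption), and the helper name `related_words` and the sample data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE keywords(
   id INTEGER PRIMARY KEY,
   word VARCHAR(255)
);
CREATE TABLE relatedkeywords(
   keyword1 INT,
   keyword2 INT,
   FOREIGN KEY (keyword1) REFERENCES keywords(id),
   FOREIGN KEY (keyword2) REFERENCES keywords(id),
   PRIMARY KEY (keyword1, keyword2)
);
INSERT INTO keywords(id, word) VALUES (1, 'sql'), (2, 'database'), (3, 'index');
INSERT INTO relatedkeywords(keyword1, keyword2) VALUES (1, 2), (3, 1);
""")

def related_words(word):
    """One statement: find the keyword, its pairs, and the partner words."""
    rows = conn.execute("""
        SELECT r.word
        FROM keywords AS k
        JOIN relatedkeywords AS rk
          ON k.id IN (rk.keyword1, rk.keyword2)
        JOIN keywords AS r
          ON r.id = CASE WHEN rk.keyword1 = k.id
                         THEN rk.keyword2 ELSE rk.keyword1 END
        WHERE k.word = ?
        ORDER BY r.word
    """, (word,)).fetchall()
    return [r[0] for r in rows]

words_for_sql = related_words("sql")
```

The result comes back as one row per related keyword, so there is no comma-separated string to split and no length limit on the list.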