I've got a list of synonyms and need to create a database in SQL for it.

I was thinking about using a relational database design, but I don't know if it would be the best choice. There will be a decent amount of traffic hitting this database.

I was thinking the tables would look like this:

Table1
Id

Table2
Id
InterlinkID (Table1 Id)
Word

Would this be the best way? There could be anywhere from 1 to 20+ linked words. One other problem I see with this setup is what happens if I have one word that works as a synonym for more than one word.

Not a great example of how it will be used, but you get the idea:


    Table 1
    Id
    --
    1
    2

    Table 2
    Id | InterlinkID | Word
    -------------------------
    1  | 1           | One
    2  | 1           | 1
    3  | 1           | First
    4  | 2           | Two
    5  | 2           | 2
    6  | 2           | Second

+6  A: 

The most minimal way of modeling the relationship would be as a single table with three columns:

  • id - primary key, integer
  • word - unique word, should have a unique constraint to stop duplicates
  • parent_id - nullable

Use the parent_id to store the id number of the word you want to relate the current word to. For example:

    id  |  word  |  parent_id
    ---------------------------
    1   | abc    |  NULL
    2   | def    |  1

...shows that abc was added first, and def is a synonym for it.
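As a rough sketch of the single-table layout, here it is using Python's built-in `sqlite3` module. The table name `words`, the self-referencing foreign key, and the helper `synonyms_of` are assumptions made for illustration; they are not part of the description above, and the helper only handles one level of hierarchy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default

# Assumed table name; columns follow the three-column layout described above.
conn.execute("""
    CREATE TABLE words (
        id        INTEGER PRIMARY KEY,
        word      TEXT NOT NULL UNIQUE,          -- unique constraint stops duplicate words
        parent_id INTEGER REFERENCES words(id)   -- nullable: NULL marks a root word
    )
""")
conn.executemany(
    "INSERT INTO words (id, word, parent_id) VALUES (?, ?, ?)",
    [(1, "abc", None), (2, "def", 1)],
)


def synonyms_of(conn, word):
    """Hypothetical helper: other words sharing the same root (one level deep)."""
    row = conn.execute(
        "SELECT id, parent_id FROM words WHERE word = ?", (word,)
    ).fetchone()
    if row is None:
        return []
    # A word with no parent is its own root.
    root_id = row[1] if row[1] is not None else row[0]
    rows = conn.execute(
        "SELECT word FROM words "
        "WHERE (id = ? OR parent_id = ?) AND word <> ? ORDER BY id",
        (root_id, root_id, word),
    ).fetchall()
    return [r[0] for r in rows]


print(synonyms_of(conn, "abc"))
print(synonyms_of(conn, "def"))
```

Because each row has only one parent_id, a word can belong to just one group here, which is the limitation raised in the question.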

A more obvious and flexible means of modelling the relationship would be with two tables:

  1. WORDS
    • id, primary key
    • wordvalue
  2. SYNONYMS
    • word_id
    • synonym_id

The two columns in the SYNONYMS table would together form a composite primary key, which ensures the same pair can't be stored twice. However, it won't stop duplicates in reverse order (the same two words with word_id and synonym_id swapped). It does allow you to map numerous combinations and build a "spider web" relationship between words, while the single-table format would only support a hierarchical relationship.
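A minimal sketch of the two-table design, again with Python's `sqlite3`. The `UNIQUE` constraint on `wordvalue`, the foreign keys, the sample rows, and the helper `synonyms_of` are assumptions added for illustration. The lookup checks both columns, since a pair may be stored in either direction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE words (
        id        INTEGER PRIMARY KEY,
        wordvalue TEXT NOT NULL UNIQUE   -- UNIQUE is an assumption, not in the answer
    );
    CREATE TABLE synonyms (
        word_id    INTEGER NOT NULL REFERENCES words(id),
        synonym_id INTEGER NOT NULL REFERENCES words(id),
        PRIMARY KEY (word_id, synonym_id)  -- composite key: no exact duplicate pairs
    );
""")
conn.executemany(
    "INSERT INTO words (id, wordvalue) VALUES (?, ?)",
    [(1, "One"), (2, "1"), (3, "First"), (4, "Two")],
)
# (3, 1) is the reverse of (1, 3); the primary key alone does not reject it.
conn.executemany(
    "INSERT INTO synonyms (word_id, synonym_id) VALUES (?, ?)",
    [(1, 2), (1, 3), (3, 1)],
)


def synonyms_of(conn, word):
    """Hypothetical helper: synonyms of `word`, looking at both directions."""
    rows = conn.execute(
        """
        SELECT w2.wordvalue FROM words w1
        JOIN synonyms s ON s.word_id = w1.id
        JOIN words w2 ON w2.id = s.synonym_id
        WHERE w1.wordvalue = ?
        UNION
        SELECT w2.wordvalue FROM words w1
        JOIN synonyms s ON s.synonym_id = w1.id
        JOIN words w2 ON w2.id = s.word_id
        WHERE w1.wordvalue = ?
        ORDER BY 1
        """,
        (word, word),
    ).fetchall()
    return [r[0] for r in rows]


print(synonyms_of(conn, "One"))
print(synonyms_of(conn, "First"))
```

The `UNION` removes the repeated result that the reverse pair would otherwise produce. Note that only direct links are returned: "1" and "First" are both linked to "One" but not to each other unless that pair is stored as well.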

OMG Ponies
+1. Also, you might consider a check constraint (for the two-table design) on SYNONYMS: `word_id<synonym_id`. This would prevent "reverse" duplicates such as `word_id=12, synonym_id=45` and `word_id=45, synonym_id=12`. It depends on how you use the data, though; you might actually want the "reverse" duplicates.
KM
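The check constraint from the comment above could look roughly like this in SQLite via Python's `sqlite3`. The helper `add_pair` is hypothetical, and the foreign keys to WORDS are left out to keep the sketch short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE synonyms (
        word_id    INTEGER NOT NULL,
        synonym_id INTEGER NOT NULL,
        PRIMARY KEY (word_id, synonym_id),
        CHECK (word_id < synonym_id)   -- only the "lower id first" ordering is allowed
    )
""")


def add_pair(conn, word_id, synonym_id):
    """Hypothetical helper: store the pair exactly as given; report whether it was accepted."""
    try:
        conn.execute(
            "INSERT INTO synonyms (word_id, synonym_id) VALUES (?, ?)",
            (word_id, synonym_id),
        )
        return True
    except sqlite3.IntegrityError:  # raised for CHECK and primary key violations
        return False


accepted = add_pair(conn, 12, 45)  # lower id first: satisfies the check
reversed_ok = add_pair(conn, 45, 12)  # reverse ordering: violates the check
print(accepted, reversed_ok)
```

With this constraint in place, a lookup still has to search both columns, because a given word may sit on either side of a pair depending on its id.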
@KM: True, but it would also complicate relating a word with a higher id to an already existing one. Say "is" and "as" are synonyms, and "is" was put in first. You couldn't relate "as" to "is" after you add "as" - the check constraint would only allow you to relate "is" to "as". You'd have to display pk values for both words for users to know why they can do it one way but not the other.
OMG Ponies
Going to use Method #2 by OMG Ponies, THANKS FOR THE INFO!
Brad