I'm implementing a small dictionary database in which I'd like to search for words based on their lexical/semantic similarity.
For example, "beer" has "sister words" such as soda, lemonade, wine, and champagne, each "different" in a "different direction" (in this example, the first two are "moderate" versions of the idea of "beer", while the latter two are "more extreme" versions).
I know WordNet has an API, but most of the words (and phrases) in my dictionary are related in more informal ways.
(Another example: "gangster" is related to [nun, orphan, rebel] {criminal, mafia boss, murderer}, where extremity varies from left to right; the words in [] are considered "positive extremities" and the ones in {} are "negative extremities".)
In usage:
1. The user enters search input (a word).
2. The word is matched with its sister words.
3. The user has a chance to "fine-tune" the word by altering extremities in at least 2 directions, as in the examples above.
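To make the matching and fine-tuning steps concrete, here is a rough sketch of the behaviour I have in mind. Python is used only for illustration; the `SISTERS` mapping, the function names, and the numeric extremity values are all made up for this example (negative is assumed to mean "milder", positive "more extreme").

```python
# Hypothetical in-memory data: word -> list of (sister word, signed extremity).
# Negative = milder than the search word, positive = more extreme.
# The numeric values are invented purely for illustration.
SISTERS = {
    "beer": [("soda", -2), ("lemonade", -1), ("wine", 1), ("champagne", 2)],
}


def sister_words(word):
    """Matching step: all sister words, ordered from mildest to most extreme."""
    pairs = sorted(SISTERS.get(word, []), key=lambda pair: pair[1])
    return [sister for sister, _ in pairs]


def fine_tune(word, direction):
    """Fine-tuning step: sister words on one side only (direction is -1 or 1),
    ordered from closest to the search word to furthest away."""
    side = [(s, score) for s, score in SISTERS.get(word, []) if score * direction > 0]
    side.sort(key=lambda pair: abs(pair[1]))
    return [sister for sister, _ in side]
```

With this data, `fine_tune("beer", 1)` would walk outwards through the "more extreme" words, and `fine_tune("beer", -1)` through the "moderate" ones.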
What's the best way to implement such a search -- steps 2 and 3 above?
I'm considering using PHP/MySQL since that is what I am familiar with, but what are better alternatives? Again - keep in mind that this isn't a large dictionary. It's just a selection of common words.
Here's my attempt at answering this - it's very, very basic... improvement suggestions welcome:
MySQL table words:
id (int, primary key, autoincrement),
word (varchar 75),
relatedword (varchar 75),
relationscore (int 11),
direction (tinyint, -1 or 1)
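Given a $word query and $direction (bound as parameters of a prepared statement rather than interpolated into the string, to avoid SQL injection):
"SELECT relatedword FROM words WHERE word = ? AND direction = ? ORDER BY relationscore DESC"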
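The table and query above can be tried out with a short script. This is only a sketch under some assumptions: it uses Python's built-in `sqlite3` module instead of PHP/MySQL so that it runs without a server, the column types are SQLite equivalents of the ones listed, the sample rows and scores are invented, the index and the helper names `build_db` and `related` are my own additions, and I'm assuming direction -1 means "milder" and 1 means "more extreme". The word is passed as a bound parameter, so user input never becomes part of the SQL text.

```python
import sqlite3


def build_db():
    """Create an in-memory database with the `words` table and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE words (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               word TEXT NOT NULL,
               relatedword TEXT NOT NULL,
               relationscore INTEGER NOT NULL,
               direction INTEGER NOT NULL CHECK (direction IN (-1, 1))
           )"""
    )
    # Assumed addition: an index covering the two columns used in the WHERE clause.
    conn.execute("CREATE INDEX idx_word_direction ON words (word, direction)")
    # Invented sample data: (word, relatedword, relationscore, direction).
    rows = [
        ("beer", "lemonade", 80, -1),
        ("beer", "soda", 60, -1),
        ("beer", "wine", 80, 1),
        ("beer", "champagne", 60, 1),
    ]
    conn.executemany(
        "INSERT INTO words (word, relatedword, relationscore, direction) "
        "VALUES (?, ?, ?, ?)",
        rows,
    )
    return conn


def related(conn, word, direction):
    """Return related words in one direction, highest relation score first."""
    cur = conn.execute(
        "SELECT relatedword FROM words "
        "WHERE word = ? AND direction = ? "
        "ORDER BY relationscore DESC",
        (word, direction),
    )
    return [row[0] for row in cur.fetchall()]
```

Calling `related(build_db(), "beer", 1)` returns the "more extreme" sister words of "beer", best match first. The same parameterized query should carry over to PHP through a prepared statement.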