ansaurus

Question

PHP - Words in a db - search via lexical dictionary (semantic similarity)

Answer 1

A:

I'm unclear why you think WordNet is inappropriate. I think what you're calling "positive/negative extremities" and "sister words" are what linguists call hypernyms (more general terms) and hyponyms (more specific terms). WordNet includes a reasonably good model of these.

To use WordNet, you'd find "sister" words by "going up" a few levels using the hypernyms('beer') relation. So if you started with "beer", going up 3 levels would give you "beverage". Then you'd use the hyponyms('beverage') relation to "go down" one or more levels, to get types of beverages at roughly the same level of specificity as beer.

Here is an example of WordNet's interface as accessed through NodeBox Linguistics, a Python library. I believe PHP has an equivalent WordNet interface, although I've never used it.

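>>> import en  # NodeBox Linguistics
>>> noun = 'beer'
>>> generalization_depth = 3
>>> # Go up to a more general term, then list its hyponyms (a list of synonym lists).
>>> sister_words = en.noun.hyponym(en.noun.hypernyms(noun)[generalization_depth][0])
>>> # Flatten the list of synonym lists and print one word per line.
>>> for word in reduce(lambda a,b: a+b, sister_words, []):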
...     print word
... 
milk
wish-wash
potion
alcohol
alcoholic beverage
intoxicant
inebriant
hydromel
oenomel
near beer
ginger beer
mixer
cooler
refresher
smoothie
fizz
cider
cyder
cocoa
chocolate
hot chocolate
drinking chocolate
fruit juice
fruit crush
fruit drink
ade
mate
soft drink
coffee
java
tea
tea-like drink
drinking water
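
The same up-then-down traversal can be sketched without any WordNet library. The following is only an illustration in Python 3: the PARENT table is a hypothetical, hand-written taxonomy (not real WordNet data), and the helper names ancestor, children and sister_words are made up for this example.

```python
# Hypothetical toy taxonomy mapping each word to its hypernym (parent).
# This is illustrative data, not taken from WordNet.
PARENT = {
    "beer": "brew",
    "brew": "alcohol",
    "wine": "alcohol",
    "alcohol": "beverage",
    "milk": "beverage",
    "coffee": "beverage",
    "tea": "beverage",
    "beverage": "food",
}


def ancestor(word, levels):
    """Walk up `levels` hypernym links, stopping early at the root."""
    for _ in range(levels):
        if word not in PARENT:
            break
        word = PARENT[word]
    return word


def children(word):
    """Return the direct hyponyms of `word`, sorted alphabetically."""
    return sorted(child for child, parent in PARENT.items() if parent == word)


def sister_words(word, up=3, down=1):
    """Go up `up` levels, then collect the terms `down` levels below."""
    frontier = [ancestor(word, up)]
    for _ in range(down):
        frontier = [c for w in frontier for c in children(w)]
    return frontier


print(sister_words("beer", up=3, down=1))
# prints ['alcohol', 'coffee', 'milk', 'tea']
```

With a real lexical database the two lookups would be replaced by its hypernym and hyponym queries; the choice of how far to go up and down controls how loosely related the returned words are.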

Chris S 2010-08-04 23:57:05

Well, I guess it also depends on classification. For example, a rebel isn't necessarily "bad", but when it comes to murderer/criminal, there's a sense of something clearly negative. It's not specificity per se, but an actual degree of (in this case) "good person" vs. "bad person" classification. In the milk/beer case, beer would be considered more negative/extreme than the others.

ina 2010-08-05 00:23:46

@ina, I see what you mean. Since that's a highly subjective criterion, I don't think you'll find any existing databases with "good/bad" classifications of words.

Chris S 2010-08-05 12:49:30
