ansaurus

Question

How to get synonyms ordered by their occurrence probability from Wordnet

Answer 1

A:

I think you should add another step (provided that speed is not important).

From the Lucene index, build a second dictionary that maps each word to a small object holding the single synonym whose meaning has the highest probability of appearance, together with that meaning and its probability. For example, given this code:

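import java.util.HashMap;
import java.util.Map;

// The most probable synonym of a word, with the meaning it belongs to.
class Synonym {
    public String name;
    public double probability;
    public String meaning;
}

// Maps each word to its most probable synonym.
Map<String, Synonym> m = new HashMap<String, Synonym>();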

... you just have to fill it from the Lucene index.
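As a rough illustration of that filling step, here is a minimal sketch. It assumes the index has already been read into flat rows of (word, synonym, meaning, probability); the `Row` class, the `build` method, and the sample probabilities are hypothetical and are not part of Lucene or WordNet.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SynonymTable {
    // Same shape as the Synonym class above, with a constructor for convenience.
    static class Synonym {
        final String name;
        final double probability;
        final String meaning;

        Synonym(String name, double probability, String meaning) {
            this.name = name;
            this.probability = probability;
            this.meaning = meaning;
        }
    }

    // Hypothetical row, as it might be read from the index (not a Lucene type).
    static class Row {
        final String word;
        final Synonym synonym;

        Row(String word, Synonym synonym) {
            this.word = word;
            this.synonym = synonym;
        }
    }

    // Keeps, for each word, only the synonym with the highest probability.
    static Map<String, Synonym> build(List<Row> rows) {
        Map<String, Synonym> best = new HashMap<>();
        for (Row r : rows) {
            Synonym current = best.get(r.word);
            if (current == null || r.synonym.probability > current.probability) {
                best.put(r.word, r.synonym);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up probabilities, for illustration only.
        List<Row> rows = List.of(
            new Row("car", new Synonym("automobile", 0.7, "motor vehicle")),
            new Row("car", new Synonym("railcar", 0.2, "railway vehicle")),
            new Row("big", new Synonym("large", 0.6, "above average in size")));
        Map<String, Synonym> m = build(rows);
        System.out.println(m.get("car").name);
    }
}
```

The extra pass costs time up front, which is why this only makes sense if speed is not important, but every later lookup is a single map access.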

Baltasarq 2010-07-13 07:53:32

@Baltasarq, I understand the idea. As you said before, what I need seems quite specific: I know that querying the online WordNet returns the synonyms ordered by their probability, but I do not understand how this probability information is stored inside the Prolog database (which I converted into an index with the Syns2Index you linked before). How do I retrieve that probability information (and is it even there?) and map it into, e.g., the class you proposed? Thanks!

Julia 2010-07-14 05:46:03

Have you browsed this doc? http://wordnet.princeton.edu/wordnet/man/wnsearch.3WN.html

Baltasarq 2010-07-14 11:07:26

@Baltasarq: in case you need it one day: http://lyle.smu.edu/~tspell/jaws/doc/edu/smu/tspell/wordnet/impl/file/ReferenceSynset.html#getTagCount%28java.lang.String%29

Julia 2010-07-27 21:02:40

Answer 2

+1 A:

In case someone stumbles upon this thread, this was the way to go (at least for what I needed):

http://lyle.smu.edu/~tspell/jaws/doc/edu/smu/tspell/wordnet/impl/file/ReferenceSynset.html#getTagCount%28java.lang.String%29

The getTagCount method lets you find the most likely synset for each word. The remaining problem is that the synset with the highest probability can itself contain several words, but I guess there is no way to avoid this.
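The ordering described above could look roughly like the following sketch. To keep it self-contained, it does not call JAWS: the `Sense` class is a hypothetical stand-in for a synset of the queried word, with `tagCount` assumed to hold the value that getTagCount returns for that word, and the sample counts are made up.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class SynonymRanker {
    // Hypothetical stand-in for one synset of the queried word:
    // its word forms plus the tag count of the queried word in that synset.
    static class Sense {
        final List<String> wordForms;
        final int tagCount;

        Sense(List<String> wordForms, int tagCount) {
            this.wordForms = wordForms;
            this.tagCount = tagCount;
        }
    }

    // Returns the synonyms of `word`, taken from the most frequently tagged
    // sense first. Words within one synset keep their original order, because
    // a single tag count per synset cannot rank them against each other.
    static List<String> rank(String word, List<Sense> senses) {
        List<Sense> sorted = new ArrayList<>(senses);
        sorted.sort(Comparator.comparingInt((Sense s) -> s.tagCount).reversed());

        Set<String> ordered = new LinkedHashSet<>(); // drops duplicates, keeps order
        for (Sense s : sorted) {
            for (String form : s.wordForms) {
                if (!form.equalsIgnoreCase(word)) {
                    ordered.add(form);
                }
            }
        }
        return new ArrayList<>(ordered);
    }

    public static void main(String[] args) {
        // Made-up tag counts, for illustration only.
        List<Sense> senses = List.of(
            new Sense(List.of("car", "railcar"), 2),
            new Sense(List.of("car", "auto", "automobile"), 10));
        System.out.println(rank("car", senses));
    }
}
```

Synonyms that share a synset end up next to each other with no ordering between them, which is the limitation mentioned above.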

Julia 2010-07-27 21:01:23
