I'm about to create a "smart" dictionary that can suggest similar words when the word entered by the user is not in the dictionary.

The dictionary starts by reading a file of words; each word is added to both a binary search tree and a hash table. The hash table is used to determine whether the word, or a similar word, is in the dictionary. It acts as a Boolean flag, so we can quickly check whether the binary search tree contains the word. The hash table has to be around ten times the size of the dictionary, because it also stores the similar words. As I am relatively new to Java, I would like tips and suggestions on how to write a hash function that would be ideal for my situation.

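// Returns every variant of word obtained by swapping two adjacent characters.
public String[] similarOne(String word) {
    char[] wordArray = word.toCharArray();

    // Words shorter than two characters have no adjacent pair to swap
    // (an empty word would otherwise request an array of size -1).
    if (wordArray.length < 2) {
        return new String[0];
    }

    String[] words = new String[wordArray.length - 1];
    for (int i = 0; i < wordArray.length - 1; i++) {
        // Swap on a copy so each variant starts from the original word.
        words[i] = swap(i, i + 1, wordArray.clone());
    }
    return words;
}

// Swaps the characters at positions a and b in place and returns the result as a String.
public String swap(int a, int b, char[] word) {
    char tmp = word[a];
    word[a] = word[b];
    word[b] = tmp;

    return new String(word);
}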
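
To make the setup concrete, the loading and lookup steps described above could be sketched roughly as follows. This is only an illustration under assumptions: the class and method names (`SmartDictionary`, `addWord`, `mightMatch`, `suggest`) are hypothetical, and `java.util.TreeSet` and `java.util.HashSet` stand in for the binary search tree and the hash table.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical class name; a sketch of the described design, not finished code.
class SmartDictionary {
    // Stand-in for the binary search tree: real dictionary words only.
    private final TreeSet<String> tree = new TreeSet<>();
    // Stand-in for the Boolean hash table: real words plus their swap variants.
    private final Set<String> known = new HashSet<>();

    void addWord(String word) {
        tree.add(word);
        known.add(word);
        for (String variant : similarOne(word)) {
            known.add(variant);
        }
    }

    // Fast membership test: true if the input is a word or a variant of one.
    boolean mightMatch(String input) {
        return known.contains(input);
    }

    // Returns dictionary words equal to the input or one adjacent swap away from it.
    List<String> suggest(String input) {
        List<String> result = new ArrayList<>();
        if (!mightMatch(input)) {
            return result; // no need to search the tree at all
        }
        if (tree.contains(input)) {
            result.add(input);
        }
        // A swap is its own inverse, so swapping the input leads back to the word.
        for (String candidate : similarOne(input)) {
            if (tree.contains(candidate) && !result.contains(candidate)) {
                result.add(candidate);
            }
        }
        return result;
    }

    // Same adjacent-swap generator as above, repeated so the sketch is self-contained.
    static String[] similarOne(String word) {
        char[] chars = word.toCharArray();
        if (chars.length < 2) {
            return new String[0];
        }
        String[] words = new String[chars.length - 1];
        for (int i = 0; i < chars.length - 1; i++) {
            char[] copy = chars.clone();
            char tmp = copy[i];
            copy[i] = copy[i + 1];
            copy[i + 1] = tmp;
            words[i] = new String(copy);
        }
        return words;
    }

    public static void main(String[] args) {
        SmartDictionary dictionary = new SmartDictionary();
        dictionary.addWord("hello");
        System.out.println(dictionary.suggest("hlelo"));
    }
}
```

In this version `HashSet` relies on `String.hashCode()`, so no hand-written hash function is involved; the open question is what to use if the table is implemented by hand.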
A: 

Google for 'java metaphone' and 'java soundex'. Both are phonetic encodings that map similar-sounding words to the same code.

You could try using the results of a Metaphone encoding as the hash key, for example.
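
As a rough illustration of the idea, here is a minimal sketch that uses a phonetic code as the map key. It assumes a hand-written, simplified Soundex rather than Metaphone (which is considerably more involved), and the names `PhoneticIndex`, `add`, and `soundsLike` are made up for the example. Apache Commons Codec ships ready-made `Soundex` and `Metaphone` encoders if you would rather not write your own.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class: groups words under their Soundex code.
class PhoneticIndex {
    // Soundex digit for each letter a..z; '0' marks letters that are not coded.
    private static final String CODES = "01230120022455012623010202";

    private final Map<String, List<String>> buckets = new HashMap<>();

    // Simplified Soundex: first letter plus up to three digits, zero-padded.
    static String soundex(String word) {
        StringBuilder out = new StringBuilder();
        char last = '0';
        for (char raw : word.toCharArray()) {
            char c = Character.toLowerCase(raw);
            if (c < 'a' || c > 'z') {
                continue; // ignore anything that is not a plain ASCII letter
            }
            char code = CODES.charAt(c - 'a');
            if (out.length() == 0) {
                out.append(Character.toUpperCase(c));
            } else if (code != '0' && code != last) {
                out.append(code);
            }
            // 'h' and 'w' do not separate letters that share a code; vowels do.
            if (c != 'h' && c != 'w') {
                last = code;
            }
            if (out.length() == 4) {
                break;
            }
        }
        while (out.length() > 0 && out.length() < 4) {
            out.append('0');
        }
        return out.toString();
    }

    void add(String word) {
        buckets.computeIfAbsent(soundex(word), k -> new ArrayList<>()).add(word);
    }

    // Returns the stored words that share the input's phonetic code.
    List<String> soundsLike(String input) {
        return buckets.getOrDefault(soundex(input), Collections.<String>emptyList());
    }

    public static void main(String[] args) {
        PhoneticIndex index = new PhoneticIndex();
        index.add("colour");
        System.out.println(index.soundsLike("color"));
    }
}
```

Note that a phonetic key groups words that sound alike, which is not the same thing as words with two letters transposed, so it may or may not fit your definition of "similar".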

Tony Ennis
A: 

I suggest using a trie or a Patricia trie. I don't know what you mean by similar words, but I'm guessing it is something like Google Suggest. I previously wrote a small program that does auto-complete. It depends on patricia-trie, so you will have to include that library. You can use it as a reference.
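
For a feel of how prefix-based completion works, here is a minimal sketch of a plain trie. It is not a Patricia trie and not the program mentioned above; the names `SimpleTrie`, `insert`, and `complete` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical class: an uncompressed trie with prefix completion.
class SimpleTrie {
    private static final class Node {
        // TreeMap keeps children sorted, so completions come out alphabetically.
        final Map<Character, Node> children = new TreeMap<>();
        boolean isWord;
    }

    private final Node root = new Node();

    void insert(String word) {
        Node node = root;
        for (char c : word.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.isWord = true;
    }

    // Returns every stored word that starts with the given prefix.
    List<String> complete(String prefix) {
        List<String> out = new ArrayList<>();
        Node node = root;
        for (char c : prefix.toCharArray()) {
            node = node.children.get(c);
            if (node == null) {
                return out; // no stored word has this prefix
            }
        }
        collect(node, new StringBuilder(prefix), out);
        return out;
    }

    // Depth-first walk that rebuilds each word from the path taken so far.
    private void collect(Node node, StringBuilder path, List<String> out) {
        if (node.isWord) {
            out.add(path.toString());
        }
        for (Map.Entry<Character, Node> entry : node.children.entrySet()) {
            char c = entry.getKey();
            path.append(c);
            collect(entry.getValue(), path, out);
            path.setLength(path.length() - 1);
        }
    }

    public static void main(String[] args) {
        SimpleTrie trie = new SimpleTrie();
        trie.insert("car");
        trie.insert("cart");
        trie.insert("cat");
        System.out.println(trie.complete("ca"));
    }
}
```

A Patricia trie stores the same information but merges chains of single-child nodes, which saves memory on large word lists.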

Emil
Check this link too: http://rmandvikar.blogspot.com/2008/10/trie-examples.html
Emil
http://sujitpal.blogspot.com/2007/02/three-autocomplete-implementations.html
Emil