views:

303

answers:

5

How can I take an input word (or sequence of letters) and output a word from a dictionary that contains exactly those letters?

Does java has an English dictionary class (list of words) that I can use, or are there open source implementations of this?

How can I optimize my code if this needs to be done repeatedly?

+4  A: 

Two words are said to be anagrams if they have the exact same letters, exact same number of times.

The check for anagram is to sort the letters of both the words and check for equality:

sort_letters(word1) == sort_letters(word2)

Now to find all the anagrams of a given dictionary word say word1, I would find all the words in the dictionary for which the above test holds. To optimize the search we can just search for words that are of same length.

If we have to do this repeatedly its better to do some preprocessing. We can build something like a HashMap where in we would map a string to a set of strings which are anagrams. Something like:

Bad ==> Dab
Cat ==> Act, Tac
.....

Now given any word I can look into the hashMap to get all its anagrams.

codaddict
+13  A: 

Convert your dictionary into an anagram dictionary. In an anagram dictionary, the words are indexed by their letters in sorted alphabetical order. To look up anagrams for a certain word, you sort its letters and look up corresponding ones from the anagram dictionary.

Matti Virkkunen
A: 

You can use Anagrams2 example from Sun site as a starting point

For improved performance, you can have a cache of anagrams for frequently used/recently used words.Consider using WeakHashMap for this purpose

diy
A: 

As unicornaddict mentioned, you can fairly easily determine whether or not two words are anagrams by sorting, however this is inefficient, especially if you are doing it repeatedly.

A prepared hash-table would probably be the best solution, by loading up your dictionary into it at the beginning of the program. A fairly easy-to-write algorithm for hashing/comparing would be

uint HashSomeWord(string someWord)
{
   uint hashVal = 0;
   //foreach letter in someword
   {
      //hashVal += letter.ValueAsInteger
   }
   return hashVal;
}

then

bool IsAnagram(string inputWord, string compareTo)
{
    if(inputWord == null
       || compareTo == null
       || inputWord.Length != compareTo.Length
       || HashSomeWord(inputWord) != HashSomeSome(compareTo))
    {
       return false;
    }
    if(sort_letters(inputWord) == sort_letters(compareTo))
    {
        return true;
    }
}

My Java is pretty rusty, but I think that would do it.

A: 

From my POV, the key to this assignment is to find a function (hashFunc) that maps strings to numbers so that 1) two anagrams are mapped to the same number, 2) two non-anagrams are mapped to different numbers. Once the function is found, it can be simply applied to inputs thus avoiding tedious string comparisons:

   if(hashFunc(word1) == hashFunc(word2)) -> word2 is anagram of word1     

Does java has an English dictionary class (list of words) that I can use, or are there open source implementations of this?

On unix systems, you can start with the words file

How can I optimize my code if this needs to be done repeatedly?

Turn the dictionary into a hash table using precalculated hashFunc.

stereofrog