views:

564

answers:

5

I can work out how to create anagrams of a string but I don't know how I can compare them to a dictionary of real words to check if the anagram is a real word. Is there a class in the Java API that contains the entire English dictionary?

+1  A: 

No, you have to use an external library, such as JWNL, which is a wrapper for WordNet -- a machine-readable lexical database organized by meanings, that contains pretty much every English word.

JG
I'm not familiar with JWNL, but wouldn't the "meaning" data add unnecessary bulk to an application that really only needs a simple word list?
Carl Smotricz
+1  A: 

Maybe the English dictionary in jazzy can help you.

Yuval F
+1  A: 

There's no such specialized class in the standard Java library, but you can use any implementation you like of the Set interface and initialize it by loading it up with words of your choosing, picked from any of the innumerable word lists you can find in many places (just check out carefully that the license for the word list you choose is compatible with your intended application, e.g., does it allow commercial use, closed-source apps if that's what you require, and so forth).

Alex Martelli
+3  A: 

No, but you can get a wordlist from various places. From there, you could read the wordlist file into a list:

List<String> lines = new ArrayList<String>();
BufferedReader in = new BufferedReader(new FileReader("wordlist.txt"));
String line = null;
while (null!=(line=in.readLine()))
{
   lines.add(line);
}
in.close();

And finally binary search use lines.contains() for your candidate word.

sblom
No reason to use `O(log N)` binary search when `Set` implementations such as `HashSet` can be so much faster.
Alex Martelli
Super good point Alex Martelli. I have no idea what I was thinking. Need coffee.
sblom
A: 

One method of determining whether a set of characters is an anagram of a word involves using prime numbers. Assign each letter a prime number, for example, a=2, b=3, c=5, d=7. Now precompute the product of primes for each word in your dictionary. For example, 'add' = 2*7*7 = 98, or 'bad' = 3*2*7 = 42.

Now determining if a set of letters is an anagram of any word in a dictionary can be done by computing the value of the set of letters. For example, the letters 'abd'= 2*3*7 = 42 = 'bad'. Just check whether the computed value for the letters exists in your precomputed dictionary. For any anagram, you need only do this computation once versus trying to generate every possible anagram. Note however this method will only work well for relatively small words, otherwise you will run into overflow issues and need to use BigInteger.

Andrew