Does anyone know a good .NET dictionary API? I'm not interested in meanings; rather, I need to be able to query words in a number of different ways: return words of x length, return partial matches, and so on.

Thanks

+2  A: 

You might want to look for a Trie implementation. That will certainly help with "words starting with XYZ" as well as exact matches. You may well want to have all of your data in multiple data structures, each one tuned for the particular task - e.g. one for anagrams, one for "by length" etc. Natural language dictionaries are relatively small compared with RAM these days, so if you really want speedy lookup, that's probably the way to go.
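To make the trie suggestion concrete, here is a minimal sketch of how one might look in C#. The names `TrieNode`, `WordTrie`, `Add`, `Contains` and `StartingWith` are made up for illustration (they are not a framework API), and the code assumes the word list has already been normalised to a single case.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper types for illustration; not part of the .NET framework.
public class TrieNode
{
    // One child node per possible next character.
    public readonly Dictionary<char, TrieNode> Children = new Dictionary<char, TrieNode>();

    // True when the path from the root to this node spells a complete word.
    public bool IsWord;
}

public class WordTrie
{
    private readonly TrieNode root = new TrieNode();

    // Adds a word, creating one node per character as needed.
    public void Add(string word)
    {
        TrieNode node = root;
        foreach (char c in word)
        {
            TrieNode next;
            if (!node.Children.TryGetValue(c, out next))
            {
                next = new TrieNode();
                node.Children[c] = next;
            }
            node = next;
        }
        node.IsWord = true;
    }

    // Exact match: the path must exist and end on a complete word.
    public bool Contains(string word)
    {
        TrieNode node = Find(word);
        return node != null && node.IsWord;
    }

    // Returns every stored word that begins with the given prefix.
    public List<string> StartingWith(string prefix)
    {
        List<string> results = new List<string>();
        TrieNode node = Find(prefix);
        if (node != null)
        {
            Collect(node, prefix, results);
        }
        return results;
    }

    // Walks the trie along the given text; returns null if the path does not exist.
    private TrieNode Find(string text)
    {
        TrieNode node = root;
        foreach (char c in text)
        {
            if (!node.Children.TryGetValue(c, out node))
            {
                return null;
            }
        }
        return node;
    }

    // Depth-first walk that gathers all complete words below a node.
    private static void Collect(TrieNode node, string prefix, List<string> results)
    {
        if (node.IsWord)
        {
            results.Add(prefix);
        }
        foreach (KeyValuePair<char, TrieNode> pair in node.Children)
        {
            Collect(pair.Value, prefix + pair.Key, results);
        }
    }
}
```

After calling `Add` for every word in the list, `StartingWith("foo")` only visits the nodes below the "foo" path rather than scanning the whole list. The order of the returned words follows the dictionary's enumeration order, so sort the result if order matters.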

Jon Skeet
+8  A: 

Grab the flat text file from an open source spellchecker like ASpell (http://aspell.net/) and load it into a List or whatever structure you like.

For example:

// One word per line; ReadAllLines handles both \n and \r\n line endings.
List<string> words = new List<string>(System.IO.File.ReadAllLines("MyWords.txt"));

// C# 3.0 (LINQ) example:

    // get all words of length 5:
    var fiveLetterWords = from word in words where word.Length == 5 select word;

    // get partial matches on "foo"
    var fooWords = from word in words where word.Contains("foo") select word;

// C# 2.0 example:

    // get all words of length 5:
    words.FindAll(delegate(string s) { return s.Length == 5; });

    // get partial matches on "foo"
    words.FindAll(delegate(string s) { return s.Contains("foo"); });
Barry Fandango
I believe that code requires C# 3.0, and either .NET 3.5 or .NET 2.0 with LINQBridge. .NET 3.0 doesn't provide anything useful over .NET 2.0 in this respect.
Jon Skeet
True, I've been working in 3.0 for a while now, so I guess I'm getting pretty used to having LINQ handy when I need it. Edited to contain non-3.0 samples.
Barry Fandango
Why can I only upvote once? ;)
GalacticCowboy
+1  A: 

Depending on how involved your queries are going to be, it might be worth investigating WordNet, which is basically a semantic dictionary. It includes parts of speech, synonyms, and other types of relationships between the words.

rmeador
+1  A: 

NetSpell (http://www.loresoft.com/netspell/) is a spell checker that's written in .NET that has word listings in several languages that you could use.

Mike Hall
+2  A: 

I'm with Barry Fandango on this one, but you can do it without LINQ. .NET 2.0 has some nice filtering methods on the List<T> type. The one I suggest is

List<T> List<T>.FindAll(Predicate<T> match)

This method runs every element in the list through the predicate and returns a new list containing the elements for which the predicate returns true. So, load your words from an open source dictionary into a List<string>, as suggested. To find all words of length 5...

List<string> words = LoadFromDictionary();
List<string> fiveLetterWords = words.FindAll(delegate(string word)
    {
        return word.Length == 5;
    });

Or for all words starting with "abc"...

List<string> words = LoadFromDictionary();
List<string> abcWords = words.FindAll(delegate(string word)
    {
        return word.StartsWith("abc");
    });
Anthony Mastrean
Oh snap, he wrote the C# 2.0 code too... oops. (Note to self: read answers fully.)
Anthony Mastrean