I'm looking for a way to find common misspellings of a string as it would be typed on a keyboard. For example, I would like "house" to return "hoise", "hpuse", "jouse", etc., because the misspelled characters are close to the correct ones on a QWERTY keyboard.

If I could get this to work with numbers only, that would still be a big help. Given "101", return "111", "11", "01", "10", etc. It doesn't have to be perfect, just return some common typos.

Does anyone know of an existing method to accomplish this or perhaps a suggestion on how I might write one myself?

A: 

SpellCheck.NET seems to be a good choice.

http://www.codeproject.com/KB/recipes/spellcheckparser.aspx

Marcus Johansson
The misspelled feature could be helpful. I'll look at it more if there aren't any better recommendations.
MAW74656
+1  A: 

The algorithm itself is not that complicated, but you need a good dictionary to compare against.

Read this SO question for more details.

The algorithm itself is 21 lines of Python, and there is also a C# implementation.
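
For a rough idea of what such an algorithm does, here is a minimal sketch of the usual edit-distance approach. It is not the linked code. It builds every string that is one deletion, transposition, replacement, or insertion away from the input, then keeps only the ones found in a dictionary. The names `edits1` and `known_neighbors` are made up for this illustration.

```python
import string

def edits1(word, alphabet=string.ascii_lowercase):
    """Return every string that is one simple edit away from `word`."""
    # Every way of cutting the word into a left and a right part.
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    # One character dropped.
    deletes = [left + right[1:] for left, right in splits if right]
    # Two adjacent characters swapped.
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    # One character replaced by another from the alphabet.
    replaces = [left + c + right[1:]
                for left, right in splits if right for c in alphabet]
    # One extra character inserted.
    inserts = [left + c + right for left, right in splits for c in alphabet]
    return set(deletes + transposes + replaces + inserts)

def known_neighbors(word, dictionary, alphabet=string.ascii_lowercase):
    """Keep only the one-edit variants that appear in `dictionary`."""
    return sorted((edits1(word, alphabet) & set(dictionary)) - {word})

# Numbers-only example from the question, with a tiny hand-made dictionary.
print(known_neighbors("101", {"111", "11", "01", "10", "999"}, string.digits))
# prints ['01', '10', '11', '111']
```

Without a dictionary the candidate set is large (a few hundred strings for a five-letter word), which is why the dictionary matters more than the algorithm.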

Oded
Wow, I guess I'm fighting way out of my weight class here. Can you give a simple call example? Also, I don't see a "dictionary" as you mentioned.
MAW74656
@Marc - look at the source code: `File.ReadAllText("big.txt")` is where the dictionary ("big.txt") is loaded. The program itself is a command line app, so you can simply run it on the command line.
Oded
Ok, I see. I was hoping to integrate the feature into my application, where the user can select a job number and the program will return possible mistypes for that number (which is actually just a string composed mostly of numbers, but may contain a letter at the end).
MAW74656
Also, big.txt. Is this simply a giant list of words (like in a dictionary)? Does it need to be in a certain format (Comma separated, tab delimited, etc)?
MAW74656
@MAW74656 - I believe it is one word per line.
Oded
Wait a sec, could I just put possible job number permutations in the file? For example, this year they can only be 10xxxx[FRT]. I bet I could find those permutations easily enough. I don't even need a "words" dictionary. Or am I missing something?
MAW74656
@MAW74656 - Looks like a good plan. That _should_ work.
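
A minimal sketch of that plan, under some loud assumptions: the job number format is taken to be `10` followed by four digits and an optional trailing F, R, or T (my reading of "10xxxx[FRT]"), and the `NEIGHBORS` map is a rough, hand-written guess at which keys sit next to each other. Instead of a file, a regular expression plays the role of the dictionary of valid job numbers. `likely_mistypes` is a hypothetical name.

```python
import re

# Assumed format: "10", four digits, optional trailing F, R, or T.
JOB_PATTERN = re.compile(r"10\d{4}[FRT]?")

# Rough guess at neighboring keys: the digit row, plus the letters around F, R, T.
NEIGHBORS = {
    "1": "2", "2": "13", "3": "24", "4": "35", "5": "46",
    "6": "57", "7": "68", "8": "79", "9": "80", "0": "9",
    "F": "DGRTCV", "R": "EDFT", "T": "RFGY",
}

def likely_mistypes(job):
    """Return valid job numbers that are one keyboard slip away from `job`."""
    candidates = set()
    for i, ch in enumerate(job):
        # A neighboring key was hit instead of the intended one.
        for wrong in NEIGHBORS.get(ch, ""):
            candidates.add(job[:i] + wrong + job[i + 1:])
        # A key was skipped.
        candidates.add(job[:i] + job[i + 1:])
        # Two adjacent keys were typed in the wrong order.
        if i + 1 < len(job):
            candidates.add(job[:i] + job[i + 1] + ch + job[i + 2:])
    candidates.discard(job)
    # The pattern acts as the dictionary: anything that is not a valid job number is dropped.
    return sorted(c for c in candidates if JOB_PATTERN.fullmatch(c))

for candidate in likely_mistypes("101234F"):
    print(candidate)
```

If only some job numbers actually exist, swapping the pattern check for a lookup in a set of real job numbers would narrow the results further.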
Oded
@Oded - One last question: What is a StrEnum and how can I use it? I'm used to Lists, are they similar? Can I foreach it?
MAW74656
@MAW74656 - It is an alias to `IEnumerable<string>` - see line 5 in the C# code (`using StrEnum = System.Collections.Generic.IEnumerable<string>;`).
Oded
Ahhh, got it. I see that I can use foreach to iterate through the collection.
MAW74656