In a textbox in the application, I need to validate that a user enters only English-language text. I know some languages, such as Spanish, share the English alphabet. How do I validate text to make sure it is:

  • Only in the English language
  • Only in languages that use the English character set (Spanish, etc.)

Thanks

EDIT: Sorry for not being clear enough. This app is in production, and when I check the SQL database where the text is stored, there are a lot of rows containing "??? ?????". On further investigation, it appears that this happens when non-English text is saved to the database. As an example, go to Google News, select Google Korea from the dropdown, copy some Korean text, and save it to a SQL Server database.

Anyone?

A: 

One way is to use an English-language dictionary / spell checker to test whether each word is a valid English (or Spanish) word.

A very good sample is this

It is as simple as follows:

// Create the NetSpell spell checker instance.
NetSpell.SpellChecker.Spelling spellChecker =
             new NetSpell.SpellChecker.Spelling();

// Hand it the textbox contents and run the check.
spellChecker.Text = MyTextBox.Text;
spellChecker.SpellCheck();

NetSpell Home Page: http://www.loresoft.com/NetSpell

Asad Butt
A: 

You can try checking against an English dictionary (e.g. OpenOffice has a dictionary which you may be able to use for free, though I am not sure about that) to see whether most of the words used are recognized by it.
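
A rough sketch of the "most words are recognized" idea, in case it helps. The class name `DictionaryLanguageCheck`, the tiny inline word list, and the 0.6 threshold are all made up for illustration; a real check would load a full dictionary file instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class DictionaryLanguageCheck
{
    // Hypothetical stand-in for a real dictionary (assumption: replace with a loaded word list).
    private static readonly HashSet<string> KnownWords = new HashSet<string>(
        new[] { "the", "quick", "brown", "fox", "jumps", "over", "lazy", "dog", "is", "a" },
        StringComparer.OrdinalIgnoreCase);

    // Fraction of words in the text that appear in the word list (0.0 if there are no words).
    public static double RecognizedRatio(string text)
    {
        // \p{L}+ matches runs of letters in any script, so words in other
        // scripts are counted as words that the dictionary does not know.
        List<string> words = Regex.Matches(text ?? string.Empty, @"\p{L}+")
                                  .Cast<Match>()
                                  .Select(m => m.Value)
                                  .ToList();
        if (words.Count == 0)
        {
            return 0.0;
        }
        int recognized = words.Count(w => KnownWords.Contains(w));
        return (double)recognized / words.Count;
    }

    // The threshold is an arbitrary illustrative value, not a tuned one.
    public static bool LooksEnglish(string text, double threshold = 0.6)
    {
        return RecognizedRatio(text) >= threshold;
    }
}
```

Using a ratio rather than requiring every word to match leaves room for names and typos, which a strict lookup would reject.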

You could also do some kind of text analysis and check the occurrence of each character or of short sequences such as 'th'. Each language has characteristic character frequencies, and this could help you determine which language the text is written in.
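
To make the sequence idea a bit more concrete, here is a minimal sketch that measures how many adjacent letter pairs are among a handful of bigrams that are frequent in English. The class name `BigramAnalysis` and the ten-bigram list are assumptions for illustration; a serious detector would compare full frequency profiles for several languages.

```csharp
using System;
using System.Collections.Generic;

public static class BigramAnalysis
{
    // A small, illustrative set of bigrams that occur often in English text.
    private static readonly HashSet<string> CommonBigrams = new HashSet<string>
    {
        "th", "he", "in", "er", "an", "re", "on", "at", "en", "nd"
    };

    // Share of adjacent letter pairs that belong to CommonBigrams (0.0 if there are no pairs).
    public static double CommonBigramShare(string text)
    {
        string s = (text ?? string.Empty).ToLowerInvariant();
        int total = 0;
        int hits = 0;
        for (int i = 0; i + 1 < s.Length; i++)
        {
            // Only count pairs where both characters are letters (of any script).
            if (!char.IsLetter(s[i]) || !char.IsLetter(s[i + 1]))
            {
                continue;
            }
            total++;
            if (CommonBigrams.Contains(s.Substring(i, 2)))
            {
                hits++;
            }
        }
        return total == 0 ? 0.0 : (double)hits / total;
    }
}
```

A higher share suggests English-like text, but any cutoff would have to be chosen by testing on real input, and very short strings will give unreliable results.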

I would not prohibit certain characters because at least in names special characters occur quite often.

I hope you got an idea of some possibilities.
Best Regards, Oliver Hanappi

Oliver Hanappi
A: 

If this is for a moderately small amount of text, you could try finding an English dictionary web service and try to look up the words. If lookup fails, you most likely either have a typo or something from another language. I haven't found one that accepts large blocks of text, but there is a web service that operates off of the dict.org database:

DictService

Mike
A: 

By "English character set", I guess you are referring to the ASCII character set.

You can iterate through each character and see whether it lies in the ASCII range.
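
A minimal sketch of that check, assuming C# with LINQ available. The class and method names are made up for illustration. The regex variant tests the whole string in one call, although internally it still looks at every character. Note that accented Spanish letters such as ñ are outside ASCII, so the sketch also includes a looser Latin-1 check (code points up to 0xFF) as one possible way to allow them.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class AsciiValidator
{
    // Matches only strings made up entirely of characters in the range 0x00-0x7F.
    private static readonly Regex AsciiOnly = new Regex(@"\A[\x00-\x7F]*\z");

    // True if every character is within the 7-bit ASCII range.
    public static bool IsAscii(string text)
    {
        return text.All(c => c <= 0x7F);
    }

    // Same check expressed as a single regex match over the whole string.
    public static bool IsAsciiRegex(string text)
    {
        return AsciiOnly.IsMatch(text);
    }

    // Looser variant: also accepts Latin-1 characters such as ñ or é.
    public static bool IsLatin1(string text)
    {
        return text.All(c => c <= 0xFF);
    }
}
```

For example, `AsciiValidator.IsAscii(MyTextBox.Text)` could be called before saving, with `IsLatin1` used instead if accented Western European letters should be accepted.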

Subbu
Yes, you are correct. Any idea how I would validate to make sure that each character is within the ASCII range? Is there a way to validate the entire text block instead of doing it char by char?
I am not aware of any way of doing this for a whole text block.
Subbu