I just developed a simple ASP.NET MVC application that is intended for English only. I want to block any user input in a language other than English. Is it possible to detect when users type in another language in a textbox or editor, so that I can show a popup message?

A: 

Checking for Latin 26

If you wanted to ensure that no non-English letters are submitted, you could simply validate that every character falls inside the A-Z, a-z, 0-9 and normal punctuation ranges. It sounds like you want non-Latin characters to be detected and rejected.

Detecting the user's OS or keyboard settings isn't the best way, as the user could have multiple keyboard layouts installed, or could simply copy and paste text.

UI Validation

At the user interface level, you could create a jQuery method that checks the value of a textbox for characters outside your acceptable range. Perhaps that's A-Z, a-z and numeric. You could do this on the blur event. Remember that you might also want to allow characters such as the apostrophe, comma and period.

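$('#customerName').blur(function() {
    // true when the value contains only a-z, A-Z and 0-9;
    // extend the character class to allow space, apostrophe, comma, etc.
    var isAlphaNumeric = /^[a-zA-Z0-9]*$/.test($(this).val());
    alert(isAlphaNumeric);
});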
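
As a minimal sketch of a check the blur handler could call, the helper below returns the characters that fall outside a whitelist, so the popup can name what was rejected. The function name `findDisallowedChars` and the exact punctuation set are assumptions for illustration, not part of the original answer.

```javascript
// Hypothetical helper: returns the distinct characters in `input`
// that are not in the whitelist (Latin 26, digits, a few punctuation marks).
function findDisallowedChars(input) {
    // Assumed whitelist: letters, digits, space, period, comma, apostrophe, hyphen.
    var disallowed = /[^a-zA-Z0-9 .,'-]/g;
    var matches = input.match(disallowed) || [];
    // Remove duplicates while keeping first-seen order.
    return matches.filter(function (ch, i) {
        return matches.indexOf(ch) === i;
    });
}
```

An empty result means the value passed. Anything else can be shown to the user as the list of characters to remove.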

Controller Validation

If you wanted to ALSO implement this at the controller level, you could run a regex on the incoming values.

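public ActionResult CreateCustomer(string custName)
{
    if (!IsAcceptableRange(custName))
    {
        // reject the input and redisplay the form with an error
        ModelState.AddModelError("custName", "Please use English characters only.");
        return View();
    }

    //continue
    return View();
}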

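public bool IsAcceptableRange(string input)
{
   //whitelist all the valid inputs here. be sure to include 
   //space, period, apostrophe, hyphen, etc.
   //as written, only a-z, A-Z and 0-9 are accepted.
   //requires: using System.Text.RegularExpressions;
   Regex alphaNumericPattern = new Regex("[^a-zA-Z0-9]");
   return input != null && !alphaNumericPattern.IsMatch(input);
}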
p.campbell
It will not identify the language, just that you are entering characters in the English alphabet, which is shared by many other languages as well.
Mikael Svenson
@Mikael: agreed. Sven, Gilles, etc would pass this test. It would only filter out non-Latin 26 characters.
p.campbell
that means I can't enter punctuation? or even space?
Lie Ryan
@Lie: indeed you'll have to include those characters as part of your regex and validation.
p.campbell
+4  A: 

You could limit the input box to Latin characters, but there's no automatic way to see whether the user entered something in, say, English, Finnish or Norwegian. They all mostly use a-z. Any character outside of a-z could give you an indication, but certain accents need to be allowed in English as well, so it's not 100% reliable.
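
One way to tolerate accents is to fold them away before checking against a-z. The sketch below is an illustration under assumptions: the names `foldAccents` and `isLatin26AfterFolding` and the punctuation set are hypothetical, and the approach still cannot tell English from Finnish or Norwegian.

```javascript
// Hypothetical helper: strips combining accents so that "café" is judged as "cafe".
function foldAccents(input) {
    // NFD splits "é" into "e" + U+0301; the regex then drops
    // combining marks in the range U+0300 to U+036F.
    return input.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

// Hypothetical check: true if, after folding accents, only Latin 26 letters,
// digits and a few assumed punctuation marks remain.
function isLatin26AfterFolding(input) {
    return /^[a-zA-Z0-9 .,'-]*$/.test(foldAccents(input));
}
```

Letters with no accent decomposition, such as the Norwegian "ø", are still rejected, which is the kind of "indication" mentioned above rather than proof of the language.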

Google Translate exposes a JavaScript API to detect the language of text.

Mikael Svenson
A: 

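There are two tests you can do. One is to find out which CultureInfo is set on the user's machine:

http://msdn.microsoft.com/en-us/library/system.threading.thread.currentuiculture.aspx

This will give you their current culture setting, which is a start. (In ASP.NET, the thread culture follows the browser's language preference only when culture auto-detection is enabled in configuration; otherwise it is the server's culture.) Of course, you can have your setting as 'English' but still be typing in Russian, and several of the letters look identical to Latin ones.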
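
Lookalike letters are still different code points, so a script check on the text itself catches what the culture setting misses. The following is a rough sketch; the name `containsNonLatinLetters` is hypothetical, and it assumes a JavaScript engine that supports Unicode property escapes (ES2018 or later).

```javascript
// Hypothetical helper: true if `input` contains a letter from any script other than Latin.
function containsNonLatinLetters(input) {
    // \p{L} matches any letter; with the u flag each match is one code point.
    var letters = input.match(/\p{L}/gu) || [];
    return letters.some(function (ch) {
        // \p{Script=Latin} matches Latin-script letters, accented ones included.
        return !/\p{Script=Latin}/u.test(ch);
    });
}
```

For example, the Latin string "cop" and the Cyrillic string made of U+0441, U+043E, U+0440 look alike on screen, but only the second one makes this function return true.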

So the next step is to detect the language using this: http://www.google.com/uds/samples/language/detect.html

It's not the greatest, according to online discussions, but it's a place to start. I'm sure there are better natural language identifiers out there, though.

Oren Mazor
+2  A: 

Use the following code:

<p>Note that this community uses the English language exclusively, so please be
considerate and write your posts in English. Thank you!</p>
Matti Virkkunen
I agree with you. This is the first thing that I am going to do, but unfortunately users never pay attention to that kind of warning.
Hoorayo
@Hoorayo: StackOverflow does not have such a warning (or if it does, it never caught my attention), but most if not all questions and answers are in English. So, it can work without. You just need to figure out how. :-)
dtb
A: 

Google Translate was mentioned in two answers, but I want to add that the Microsoft Word API may also be used to detect language, just as Word does for spell checking.

It is certainly not the best solution, since language detection by Microsoft Office doesn't work very well (IMHO), but it may be an alternative if making web requests to Google or another remote service on every posted message is not an option.

Spell checking through the Microsoft Word API can be useful too. If a message has a huge number of misspelled words when checked as English, it's probably because the message is written in another language (or because the author simply spells very badly).
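
The misspelling heuristic can be sketched without any particular spell checker. The code below is only an illustration: it does not use the Word API, the name `unknownWordRatio` is hypothetical, and `dictionary` is assumed to be a `Set` of lowercase English words supplied by the caller.

```javascript
// Hypothetical sketch: fraction of words in `text` that are missing from `dictionary`.
// `dictionary` is assumed to be a Set of lowercase English words.
function unknownWordRatio(text, dictionary) {
    // Only runs of a-z and apostrophes count as words here.
    var words = text.toLowerCase().match(/[a-z']+/g) || [];
    if (words.length === 0) {
        return 0;
    }
    var unknown = words.filter(function (w) {
        return !dictionary.has(w);
    });
    return unknown.length / words.length;
}
```

A message could be flagged when the ratio exceeds some threshold chosen by experiment. Because only a-z words are counted, text in a non-Latin script yields a ratio of 0, so this check would need to be combined with a character-range test.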

Finally, I completely agree with Matti Virkkunen. The best, and maybe the only way to ensure that messages will be written in English is to ask the users to write in English. Otherwise, it's just as bad as implementing obscenity filters.

MainMa