Hello,

I am trying to match text only if it consists of English letters and numbers, using JavaScript/jQuery.

What is the most efficient way to do this? Since there could be thousands of words, it should be as fast as possible, and I don't want to use regex.

 var names = ['test', 'हिन', 'لعربية'];

 for (var i = 0; i < names.length; i++) {
    if (names[i] == ENGLISHMATCHCODEHERE) {
        // do something here
    }
 }

Thank you for your time.

+3  A: 

A regular expression for this might be:

var english = /^[A-Za-z0-9]*$/;

Now, I don't know whether you'll want to include spaces and stuff like that; the regular expression could be expanded. You'd use it like this:

if (english.test(names[i])) // ...

Also see this: http://stackoverflow.com/questions/150033/regular-expression-to-match-non-english-characters

Edit: my brain filtered out the "I don't want to use a regex" part because it failed the "isSilly()" test. You could always check the character code of each letter in the word, but that's likely to be slower (maybe much slower) than letting the regex matcher do the work. The built-in regular expression engine is really fast.
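
For reference, the character-code check mentioned above could look roughly like this. It is only a sketch: the function name isAsciiAlphanumeric is made up for illustration, and it accepts exactly the same characters as the regex above (A-Z, a-z, 0-9).

```javascript
// Hypothetical helper (not from the original post): returns true if every
// character in str is an ASCII letter or digit, without using a regex.
function isAsciiAlphanumeric(str) {
    for (var i = 0; i < str.length; i++) {
        var code = str.charCodeAt(i);
        var isDigit = code >= 48 && code <= 57;   // '0' to '9'
        var isUpper = code >= 65 && code <= 90;   // 'A' to 'Z'
        var isLower = code >= 97 && code <= 122;  // 'a' to 'z'
        if (!isDigit && !isUpper && !isLower) {
            return false; // stop at the first character outside the set
        }
    }
    // Like the regex with "*", an empty string passes this check.
    return true;
}
```

Usage would be the same as with the regex: if (isAsciiAlphanumeric(names[i])) // ...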

When you're worried about performance, always do some simple tests first before making assumptions about the technology (unless you've got intimate knowledge of the technology already).
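
A simple test of that kind might look like the sketch below. Everything in it is an assumption made for illustration: the names byRegex, byCharCode and timeIt, the sample size, and the number of rounds. Timings depend on the browser and engine, so run it yourself rather than trusting anyone's numbers.

```javascript
var english = /^[A-Za-z0-9]*$/;

// Candidate 1: the regex from this answer.
function byRegex(str) {
    return english.test(str);
}

// Candidate 2: a manual character-code loop (hypothetical alternative).
function byCharCode(str) {
    for (var i = 0; i < str.length; i++) {
        var c = str.charCodeAt(i);
        var ok = (c >= 48 && c <= 57) || (c >= 65 && c <= 90) || (c >= 97 && c <= 122);
        if (!ok) {
            return false;
        }
    }
    return true;
}

// Build a mixed sample of a few thousand words (made-up data).
var sample = [];
for (var n = 0; n < 10000; n++) {
    sample.push(n % 3 === 0 ? 'test' + n : (n % 3 === 1 ? 'हिन' : 'لعربية'));
}

// Run fn over the whole sample several times and report elapsed time.
function timeIt(label, fn) {
    var start = Date.now();
    var matches = 0;
    for (var round = 0; round < 50; round++) {
        for (var k = 0; k < sample.length; k++) {
            if (fn(sample[k])) {
                matches++;
            }
        }
    }
    console.log(label + ': ' + (Date.now() - start) + ' ms, ' + matches + ' matches');
    return matches;
}

var regexMatches = timeIt('regex', byRegex);
var charCodeMatches = timeIt('charCode', byCharCode);
```

Both runs should report the same number of matches; if they don't, the two checks are not equivalent and the timing comparison is meaningless.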

Pointy
He did say he didn't want to use regexes. *Why* he doesn't, I don't know, but he did say...
T.J. Crowder
Yes, spaces and special characters are fine too. Basically I plan to truncate the words, but when the word is, say, हिन and I truncate it, it does not display correctly.
Alec Smart
FYI, the given regexp will allow empty strings too.
Raveren
+1  A: 

If you're dead set against using regexes, you could do something like this:

// Whatever valid characters you want here
var ENGLISH = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

function stringIsEnglish(str) {
    var index;

    for (index = str.length - 1; index >= 0; --index) {
        if (ENGLISH.indexOf(str.charAt(index)) < 0) {
            return false;
        }
    }
    return true;
}

...but a regex would almost certainly be orders of magnitude faster.

T.J. Crowder
+1  A: 

Using a regex is the fastest way to do this, I'm afraid. To my knowledge, this should be the fastest approach:

var names = ['test', 'हिन', 'لعربية'];

//algorithm follows
var r = /^[a-zA-Z0-9]+$/,
    i = names.length;

// i-- (not --i) so that index 0 is visited as well
while (i--) {
    if (r.test(names[i])) {
        // do something here
    }
}
Raveren
A: 

You should also consider words that contain special characters. For example, "it's": isn't that English?
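
One way to allow for that is to accept the whole printable ASCII range instead of only letters and digits. This is just a sketch under that assumption ("English" here means characters U+0020 through U+007E, so spaces, apostrophes and punctuation pass); the name isPrintableAscii is made up, and it uses a regex as the other answers recommend.

```javascript
// Assumption: "English" text means printable ASCII, space (0x20) to tilde (0x7E).
// The "+" means at least one character is required, so '' is rejected.
var printableAscii = /^[\x20-\x7E]+$/;

// Hypothetical helper name, not from the original thread.
function isPrintableAscii(str) {
    return printableAscii.test(str);
}

var names = ["it's", 'test 123', 'हिन', 'لعربية'];
for (var i = 0; i < names.length; i++) {
    if (isPrintableAscii(names[i])) {
        // safe to truncate character by character
        console.log(names[i].substring(0, 3));
    }
}
```

Note that typographic apostrophes and accented letters (as in "café") fall outside this range, so whether it is good enough depends on your data.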

mhd196