Is this JavaScript function (checkValidity) correct?

function checkTextBox(textBox)
{
   if (!checkValidity(textBox.getValue()))
       displayError("Error title", "Error message", textBox);
       textBox.focus();
}

function checkValidity(e) 
{
    var email;
    email = "/^[^@]+@[^@]+.[a-z]{2,}$/i";

    if (!e.match(email)){
            return false;
    else
            return true;
    }
}

EDIT: All the answers appreciated! Thanks!

+1  A: 

No. It assumes that an email address can contain only one @. I would suggest reading this article.

You probably also meant \. instead of . in the pattern.
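As a minimal sketch, here are the asker's two functions with only the syntax problems fixed: the pattern becomes a regex literal instead of a quoted string (match() would turn the string into a regex whose slashes are literal characters), the dot is escaped, and the braces are balanced. It keeps the asker's pattern, so the one-@ assumption is still there. textBox.getValue() and displayError() are taken from the question and assumed to exist on the page.

```javascript
// Sketch only: keeps the asker's pattern, fixing just the syntax problems.
function checkValidity(e) {
    // Regex literal (not a quoted string), with the dot escaped as \.
    var email = /^[^@]+@[^@]+\.[a-z]{2,}$/i;
    return email.test(e);
}

function checkTextBox(textBox) {
    // Braces so that the error and focus() only happen when validation fails.
    if (!checkValidity(textBox.getValue())) {
        displayError("Error title", "Error message", textBox);
        textBox.focus();
    }
}
```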

Mark Byers
Addresses can't contain more than one @ unless they're using an obsolete form.
Thom Smith
@Thom Not true. Read the RFC.
Josh Stodola
Actually, the RFC allows @ inside a quoted-string which can come before the @ that separates the local-part from the domain. So "party@dougs"@example.com is a valid e-mail address.
MtnViewMark
+1  A: 

Try this: I am sure it takes care of all kinds of email validation.

function checkEmail(email)
{   
    if(/^([a-z]([a-z]|\d|\+|-|\.)*):(\/\/(((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:)*@)?((\[(|(v[\da-f]{1,}\.(([a-z]|\d|-|\.|_|~)|[!\$&'\(\)\*\+,;=]|:)+))\])|((\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5]))|(([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=])*)(:\d*)?)(\/(([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)*)*|(\/((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)+(\/(([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)*)*)?)|((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)+(\/(([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)*)*)|((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)){0})(\?((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)|[\uE000-\uF8FF]|\/|\?)*)?(\#((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)|\/|\?)*)?$/i.test(email)) {
      return true;
    } else {
      return false;
    }
}

HTH

Raja
I think that you can safely ignore addresses in certain deprecated formats that haven't been used in decades.
Thom Smith
RFC 5322 § 3.4.1 defines the address specification for e-mail. It doesn't allow the range of characters listed here. Specifically, it doesn't support non-ASCII characters.
MtnViewMark
+3  A: 

Nope, that regex is not fit for the purpose. Have a look at this instead (though I can't guarantee its validity).
Also, regarding the script itself, why not check like this:

function checkEmailValidity(e) {
  return e.match("some validating regex");
}

It seems like a faster, more concise, and more readable solution.

EDIT:
It's worth noting that it's almost impossible to write a regex that can detect every valid email address. Therefore, you might be better off writing a validation algorithm instead of a regex, as valid email addresses may be very, very complex.

Consider this code:

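function validateEmail(email) {
    if (typeof email != "string") return false;
    email = email.split("@");
    // Exactly one @ is required; check this before touching email[1].
    if (email.length !== 2) return false;
    email[1] = email[1].split(".");
    // Fail if there is no dot after the @, or if any piece between dots is empty.
    if (
        email[1].length < 2 ||
        !email[1].hasValues(String)
    ) return false;
    return true;
}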

// Checks whether each element in an array has a value and, optionally, whether it matches a given constructor (object type).
// E.g.: myArray.hasValues(String) // checks whether every element in myArray is a nonempty string.
Array.prototype.hasValues = function(assumedConstructor) {
    for (var i = 0; i < this.length; i++) {
        if (!this[i]) return false;
        if (assumedConstructor && this[i].constructor != assumedConstructor) return false;
    }
    return true;
};

It works this way:

  1. Checks that the string contains one @, and only one.
  2. Checks that the part after the @ has at least one .
  3. Checks that there are some characters between each of the .'s.

It will still be easy to forge a fake email address, but this way we ensure that it's at least somewhat properly formatted. The only gotcha I can think of is @'s inside comments, which should be perfectly legal according to the RFCs, but which this code will treat as an error.
Normal internet users, who have normal email addresses, will not fail this check. So how much it actually matters is up to you to decide ;)
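For anyone who would rather not extend Array.prototype, the same three checks can be sketched as a single function. The name hasPlausibleEmailShape is made up for this example. Like the code above, it does not look at the part before the @ at all, so "@example.com" still passes.

```javascript
// Hypothetical helper name; same three checks as described above.
function hasPlausibleEmailShape(email) {
    if (typeof email !== "string") return false;
    var parts = email.split("@");
    if (parts.length !== 2) return false;      // exactly one @
    var labels = parts[1].split(".");
    if (labels.length < 2) return false;       // at least one . after the @
    return labels.every(function (label) {     // no empty piece between dots
        return label.length > 0;
    });
}
```

For example, "user@example.com" passes, while "user@example", "a@b@c.com" and "user@example..com" are rejected.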

The best solution, if one is available, is to pass the address to some built-in method that checks validity by actually trying to use the email address.

Sune Rasmussen
The regex in the referenced article is wrong - and the author simply claims his definition of e-mail addresses is fine - only it is defined by RFC, and he is wrong.
MtnViewMark
I'm very sorry for the reference. I normally don't do regexes, so I wouldn't know.
Sune Rasmussen
+1  A: 
function isValidEmail($email){
    // eregi() was removed in PHP 7; preg_match() with the /i flag is the equivalent.
    return preg_match('/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$/i', $email) === 1;
};

if(isValidEmail([email protected])) { echo "valid"; } else { echo "aaa"; };

cosy
+4  A: 

E-mail addresses are defined in RFC 5322, § 3.4. The relevant non-terminal is addr-spec. The definition turns out to be somewhat squirrelly, due to both the complications of domain specifications and the support for old, obsolete forms. However, you can do an over-approximation for most forms with:

^[-0-9A-Za-z!#$%&'*+/=?^_`{|}~.]+@[-0-9A-Za-z!#$%&'*+/=?^_`{|}~.]+

Notice that there are a very large number of legal characters. Most regexes get the list wrong. Yes, all those characters are legal in an e-mail address.

This regex will not match some very uncommonly used forms like "noodle soup @ 9"@[what the.example.com] -- which is a legal e-mail address!
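In JavaScript, the over-approximation might be used as sketched below. The names ADDR_CHARS and looksLikeAddrSpec are made up for the example, and the only change to the pattern is escaping the / so it can sit inside a regex literal. Note that the pattern has no end anchor, so it only checks that the input starts with something address-shaped.

```javascript
// The over-approximation from this answer as a JavaScript regex literal.
// "/" is escaped for the literal; otherwise the pattern is unchanged.
var ADDR_CHARS = /^[-0-9A-Za-z!#$%&'*+\/=?^_`{|}~.]+@[-0-9A-Za-z!#$%&'*+\/=?^_`{|}~.]+/;

// Hypothetical wrapper: true if the string starts with chars@chars.
function looksLikeAddrSpec(s) {
    return ADDR_CHARS.test(s);
}
```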

MtnViewMark
+1  A: 

Here's a regex corresponding to the RFC, also excluding the obsolete forms. I've broken it up into components so that it will be easy to read:

IDENT = [a-z0-9](?:[a-z0-9-]*[a-z0-9])?
TEXT = [a-z0-9!#$%&'*+/=?^_`{|}~-]+

EMAIL = TEXT(?:\.TEXT)*@IDENT(?:\.IDENT)+

(Note: case insensitive.)

This won't match email addresses that use the quoted form before the @ or the bracketed form after it, but while those forms are valid, they're hardly ever seen nowadays outside of examples, and they significantly complicate the regex.
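One way to assemble those components in JavaScript is sketched below. The ^ and $ anchors and the function name matchesSimpleEmail are additions for the example, on the assumption that the whole input should be an address; the "i" flag provides the case insensitivity noted above.

```javascript
// Components as given in this answer, kept as strings so they can be composed.
var IDENT = "[a-z0-9](?:[a-z0-9-]*[a-z0-9])?";
var TEXT = "[a-z0-9!#$%&'*+/=?^_`{|}~-]+";

// EMAIL = TEXT(?:\.TEXT)*@IDENT(?:\.IDENT)+, anchored at both ends (assumption).
var EMAIL = new RegExp(
    "^" + TEXT + "(?:\\." + TEXT + ")*@" + IDENT + "(?:\\." + IDENT + ")+$",
    "i"
);

// Hypothetical wrapper name.
function matchesSimpleEmail(s) {
    return EMAIL.test(s);
}
```

With this, "first.last@example.com" matches, while "user@localhost" (no dot in the domain), "user..name@example.com" and the quoted form "party@dougs"@example.com do not.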

Thom Smith
+1  A: 

Validating an email address is very difficult. It's not even worth validating on the client-side, other than very basic checking for @ and . characters.
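A very basic client-side check of that kind could look like the sketch below. The function name hasAtAndDot is made up, and requiring the dot to come after the @ is one interpretation of "checking for @ and . characters", not the only one.

```javascript
// Hypothetical helper: an @ that is not the first character,
// followed somewhere later by a dot. Nothing more is checked.
function hasAtAndDot(value) {
    var at = value.indexOf("@");
    return at > 0 && value.indexOf(".", at + 1) !== -1;
}
```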

Section 3.4.1 of RFC 5322 elaborates on the immense variety of legal characters, and you'll see that creating a bullet-proof regex is going to be nearly impossible.

I ended up giving up on validating because I would get occasional complaints from users saying their crazy email address does not work. So, from now on I just attempt to send the email and hope it gets delivered. If it fails to send, then I tell the user their e-mail address is problematic.

Josh Stodola