Looking at the posts here for email address validation, I am looking to be much more liberal about the client side test I am performing.

The closest I have seen so far is:

^([\w-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([\w-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$

That will not match this#[email protected], which is valid according to the RFC. The local part may contain:

  • Uppercase and lowercase English letters (a-z, A-Z)
  • Digits 0 through 9
  • Characters ! # $ % & ' * + - / = ? ^ _ ` { | } ~
  • Character . (dot, period, full stop) provided that it is not the first or last character, and provided also that it does not appear two or more times consecutively.

I want a pretty simple match:

  • Does not start with .
  • Any character allowed up to the @
  • Any character allowed after the @
  • No consecutive . or @ allowed
  • Part after the last . (tld) must be [a-z0-9-]

I will use the /i flag to make the search case-insensitive. The consecutive characters are where I am getting hung up.

+1  A: 
/^[^.].*@(?:[-a-z0-9]+\.)+[-a-z0-9]+$/
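One possible variation, not part of the answer itself: the pattern above still accepts ".." before the @ and a doubled "@@", because .* matches anything. A minimal sketch of one way to rule those out is a negative lookahead at the start. The names original, noRepeats and isPlausibleEmail are made up for illustration, and the addresses in the comments are invented samples.

```javascript
// Pattern from the answer, with the i flag for case-insensitivity.
var original = /^[^.].*@(?:[-a-z0-9]+\.)+[-a-z0-9]+$/i;

// Variation (an assumption, not from the answer): the leading negative
// lookahead rejects any input that contains ".." or "@@" anywhere.
var noRepeats = /^(?!.*(?:\.\.|@@))[^.].*@(?:[-a-z0-9]+\.)+[-a-z0-9]+$/i;

// Hypothetical wrapper: true when the address passes the liberal check.
function isPlausibleEmail(address) {
  return noRepeats.test(address);
}

// original.test("test@@example.com")          is true  (doubled @ slips through)
// isPlausibleEmail("test@@example.com")       is false
// isPlausibleEmail("te..st@example.com")      is false
// isPlausibleEmail("this#that@example.com")   is true
```

The lookahead scans the whole string once before the main pattern runs, so the rest of the expression stays as liberal as it was.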
Matthew Scharley
Seems near perfect, with one exception:
<pre>[email protected]@[email protected]@example..comt#[email protected]#est@exa#mple.com <-this matches
t#est@exa#mple.c#om</pre>
Anything after the last @ should be only [a-z0-9-] (valid domain chars), then a dot, then another [a-z0-9-].
No idea how to reformat that comment, sorry about that.
In your question, you said only the TLD name should be [-a-z0-9]. Fixing that is trivial.
Matthew Scharley
+1  A: 

If you want to match against the official standard, you can use

(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])

So even when following official standards, there are still trade-offs to be made. Don't blindly copy regular expressions from online libraries or discussion forums. Always test them on your own data and with your own applications.
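As a rough illustration of testing a pattern on your own data, the sketch below runs a candidate expression against a small table of labelled samples and reports the ones it gets wrong. The helper findMismatches, the candidate pattern and the sample addresses are all invented for this example. Swap in whichever expression and addresses matter to your application.

```javascript
// Hypothetical helper: run a pattern against labelled samples and return
// every sample whose result differs from the expected one.
// Pass a pattern without the g flag, since test() keeps state on global regexes.
function findMismatches(pattern, cases) {
  var mismatches = [];
  for (var i = 0; i < cases.length; i++) {
    var input = cases[i][0];
    var expected = cases[i][1];
    var actual = pattern.test(input);
    if (actual !== expected) {
      mismatches.push({ input: input, expected: expected, actual: actual });
    }
  }
  return mismatches;
}

// Candidate under test (an assumption for illustration only).
var candidate = /^[^@\s]+@[^@\s]+\.[a-z0-9-]+$/i;

// Each entry: [address, should it be accepted?]
var samples = [
  ["user@example.com", true],
  ["first.last@sub.example.org", true],
  ["user@example", false],
  ["user name@example.com", false],
  ["a..b@example.com", false]
];

var report = findMismatches(candidate, samples);
// report holds one entry: the candidate accepts "a..b@example.com"
// even though the table says it should be rejected.
```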

voyager
+1  A: 
function validator(email) {
   // Returns true when the address is BAD, false when it passes.
   var bademail = false;
   bademail = (email.indexOf(".") == 0) ? true : bademail;    // starts with "."
   bademail = (email.indexOf("..") != -1) ? true : bademail;  // consecutive dots
   bademail = (email.indexOf("@@") != -1) ? true : bademail;  // consecutive @
   if(!bademail) {
      // Anchored so that every character of the TLD must be a letter, digit or hyphen
      var tldTest = new RegExp("^[a-z0-9-]+$", "i");
      var lastperiodpos = email.lastIndexOf(".");
      var tldstr = email.slice(lastperiodpos + 1);
      bademail = (!(tldTest.test(tldstr))) ? true : bademail;
   }
   return bademail;
}
Anthony
+1 because some boor gave you a -1 without leaving a comment. I hate that!
TrueWill
Thanks. I just figured I'd actually keep it simple, as requested, rather than involve regex where it's not needed. Wish I could have thought of a way to use it at the end that wasn't convoluted.
Anthony
A: 

It depends on who is using your applications. For internal applications, a bare username is often a valid email address. Much of the RFC-822 email spec describes additional fields which may be present in an email address. For example, Allen Town [email protected] is a pretty standard email address which you might type into your favorite mail client. For an application, however, you may want to be the one adding the name to the email address when you send email, and not want it to be part of the user's address.

The most liberal way of validating an email address is to just attempt to send an email to whatever address the user gives. If they receive the email, and can confirm it, then it's a valid address.
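A minimal sketch of that confirm-by-sending flow, assuming a Node.js server and an in-memory store. The names pending, sendMail, requestConfirmation and confirmAddress are hypothetical, and sendMail is only a stand-in for whatever actually delivers the message.

```javascript
const crypto = require("crypto");

// Pending confirmations: token -> address.
// Assumption: a real application would persist these and expire old ones.
const pending = new Map();

// Hypothetical stand-in for the real mail transport.
function sendMail(address, body) {
  return { to: address, body: body };
}

// Start confirmation: remember a random token and mail it to the address.
function requestConfirmation(address) {
  const token = crypto.randomBytes(16).toString("hex");
  pending.set(token, address);
  sendMail(address, "Your confirmation code: " + token);
  return token;
}

// Finish confirmation: the address counts as valid only if its token
// comes back. Each token can be used once.
function confirmAddress(token) {
  if (!pending.has(token)) {
    return null;
  }
  const address = pending.get(token);
  pending.delete(token);
  return address;
}
```

Nothing here inspects the address format at all. The only test is whether the recipient can return the token.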

brianegge
I understand this, but I would like some up-front validation, just so the AOL users cannot make mistakes :) There will be no local delivery, so the email must be in the format of [email protected]