views:

822

answers:

5

I know there are a lot of questions on here about email validation and specific regexes. I'd like to know what the best practices are for validating emails given Gmail's plus-addressing trick, where a + and a tag are appended to the handle (details here). My current RegExp for JavaScript validation is as follows, but it doesn't support the extra + in the handle:

/^([a-zA-Z0-9_.-])+@(([a-zA-Z0-9-])+\.)+([a-zA-Z0-9]{2,4})+$/

Are there any other services that support the extra +? Should I allow a + in the address or should I alter the RegEx to only allow it for an email with gmail.com or googlemail.com as the domain? If so, what would be the altered RegEx?

UPDATE: Thanks to everyone for pointing out that + is valid per the spec. I didn't know that and now do for the future. For those of you saying that it's bad to even use a RegEx to validate it, my reason is based entirely on a creative design I'm building to. Our client's design places a green check or a red X next to the email address input when it loses focus (on blur). That icon indicates whether or not it's a valid email address, so I must use some JS to validate it.

+8  A: 

+ is a valid character in an email address, whatever the domain. It doesn't matter whether the domain is gmail.com or googlemail.com.

Regexes aren't actually a very good way of validating emails, but if you just want to modify your regex to handle the plus, change it to the following:

/^([a-zA-Z0-9_.\-\+])+@(([a-zA-Z0-9-])+\.)+([a-zA-Z0-9]{2,4})+$/

As an example of how this regex doesn't validate against the spec: The email [email protected] is valid according to it.
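To make the point concrete, here is a small sketch that runs a few addresses through a variant of the pattern above. It assumes the hyphen in the character class and the dot in the domain part are escaped, and the sample addresses are my own illustrations, not taken from the original post.

```javascript
// Variant of the pattern above; the escaped hyphen and domain dot are assumptions.
const pattern = /^([a-zA-Z0-9_.\-\+])+@(([a-zA-Z0-9-])+\.)+([a-zA-Z0-9]{2,4})+$/;

// Plus-addressed handles are accepted.
const acceptsPlus = pattern.test("user+tag@example.com");

// Consecutive dots in the local part are not allowed by the dot-atom
// syntax in RFC 5322, yet the pattern accepts them.
const acceptsDoubleDot = pattern.test("a..b@example.com");

// A domain label may not start with a hyphen, yet this passes too.
const acceptsBadLabel = pattern.test("user@-example.com");

console.log({ acceptsPlus, acceptsDoubleDot, acceptsBadLabel });
```

All three checks come back true, so the pattern fixes the + case but still lets through addresses that the spec rejects.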

Ben S
I would be curious to know what the best way to handle email address validation is.
Zoidberg
I suppose he means it's better to use a proven, tested library rather than rolling your own regex. A Java example of such a library is Apache Commons.
Don
See http://stackoverflow.com/questions/3232/how-far-should-one-take-e-mail-address-validation
Ben S
Does he mean the format of the email? Or the validity of the e-mail?
Zoidberg
@Ben S Thanks for the updated RegExp. I had to add another `\` before your added `\+` for it to work. I wasn't aware of `+` being part of the spec, so thanks for pointing it out. I now know for the future.
Mark Ursino
I'd go further than "aren't a very good way": it's *mathematically impossible* to validate (all possible) email addresses with regexes. The grammar defining the format of email addresses is a Type 2 Chomsky Grammar, and regexes are only able to deal with Type 3 Chomsky Grammars: http://en.wikipedia.org/wiki/Chomsky_grammar#The_hierarchy
NickFitz
+4  A: 

If you need to validate emails via regexp, then read the standard or at least this article.

Translating the standard's address syntax into a regexp gives something like this:

(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])

If that doesn't scare you, it should :)

Aaron Digulla
From the article you link to: "Don't blindly copy regular expressions from online libraries or discussion forums." Also, this regex _still_ doesn't fully validate to the spec.
Ben S
@Aaron Thanks, that's very scary. I'm going to pretend I've never seen that before -- especially the encoded characters!
Mark Ursino
+3  A: 

I would tend to go with something along the lines of /.+@.+\..+/ to check for simple mistakes. Then I would send an email to the address to verify that it actually exists, since most typos will still result in syntactically valid email addresses.
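As a rough sketch of that first step, the loose check could be wrapped like this. The function names `looksLikeEmail` and `iconFor` are hypothetical, and the second one only assumes the check-or-X icon described in the question.

```javascript
// Loose syntactic check: something, "@", something, ".", something.
function looksLikeEmail(address) {
  return /.+@.+\..+/.test(address);
}

// Hypothetical helper: picks the icon to show next to the input on blur.
function iconFor(address) {
  return looksLikeEmail(address) ? "check" : "x";
}

console.log(iconFor("user+tag@example.com"));   // plus addressing passes
console.log(iconFor("user@example"));           // no dot after the @
console.log(iconFor("no-at-sign.example.com")); // no @ at all
console.log(iconFor("user@gmial.com"));         // a typo that is still well-formed
```

The last address shows why the verification email still matters: a misspelled domain passes any purely syntactic check.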

tloach
+1  A: 

The specs allow for some really crazy ugly email addresses. I'm often very annoyed by websites even complaining about perfectly normal, valid email addresses, so please, try not to reject valid email addresses. It's better to accept some illegal addresses than to reject legal ones.

Like others have suggested, I'd go with a simple regexp like /.+@.+\..+/ and then send a verification email. If it's important enough to validate, it's important enough to verify, because a legal email address can still belong to someone other than your visitor, or contain an unintended but fatal typo.
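A quick comparison illustrates the trade-off. This sketch assumes the strict pattern from the question with its domain dot escaped, and the two sample addresses are my own; both use only characters that RFC 5322 allows in the local part.

```javascript
// Strict pattern from the question (escaped domain dot is an assumption)
// next to the loose pattern suggested here.
const strict = /^([a-zA-Z0-9_.-])+@(([a-zA-Z0-9-])+\.)+([a-zA-Z0-9]{2,4})+$/;
const loose = /.+@.+\..+/;

// Illustrative addresses: the apostrophe and the plus sign are both legal.
const legal = ["o'brien@example.com", "user+tag@example.com"];

const results = legal.map((address) => ({
  address,
  strict: strict.test(address),
  loose: loose.test(address),
}));

console.log(results);
```

The strict pattern rejects both legal addresses, while the loose one accepts them and leaves the real check to the verification email.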

mcv
There is no reason why the government of Tonga can't add an `MX` entry to the `to` ccTLD, making an address such as `mcv@to` an actual, working e-mail address. They already have a webserver running on `http://to` (for a URI shortener service), so it's certainly not unrealistic.
Jörg W Mittag
A: 

A very good article on this subject: "I Knew How To Validate An Email Address Until I Read The RFC".

Moshe