Hi,

I am using the following regex to validate an email address:

"^[-a-zA-Z0-9][-.a-zA-Z0-9]*@[-.a-zA-Z0-9]+(\.[-.a-zA-Z0-9]+)*\.(com|edu|info|gov|int|mil|net|org|biz|name|museum|coop|aero|pro|[a-zA-Z]{2})$"

Unfortunately, this does not allow email addresses with underscores. For example:

[email protected]

How can I modify this to allow underscores?

A: 
"^[-_a-zA-Z0-9][-_.a-zA-Z0-9]*@[-_.a-zA-Z0-9]+(\.[-_.a-zA-Z0-9]+)*\.(com|edu|info|gov|int|mil|net|org|biz|name|museum|coop|aero|pro|[a-zA-Z]{2})$"

Possibly?
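
One thing worth checking when adding characters to a class like these: an unescaped hyphen sitting between two characters is read as a range, so its position matters. The sketch below uses Python's `re` module purely as an illustration (the question does not name a regex engine, and other engines may handle this differently); the helper name `compiles` is made up for this example.

```python
import re

def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Hyphen first: treated as a literal '-', so the class is valid.
print(compiles(r"[-_.a-zA-Z0-9]+"))

# Hyphen between '_' and '.': read as the range '_' to '.', which is
# reversed ('_' sorts after '.'), so Python rejects the pattern.
print(compiles(r"[_-.a-zA-Z0-9]+"))
```

Keeping the hyphen as the first (or last) character of the class, or escaping it, avoids the problem.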

Meep3D
A: 
^[-a-zA-Z0-9_][-.a-zA-Z0-9_]*@[-.a-zA-Z0-9]+(\.[-.a-zA-Z0-9]+)*\.(com|edu|info|gov|int|mil|net|org|biz|name|museum|coop|aero|pro|[a-zA-Z]{2})$

I added "_" to your two character classes.

FrustratedWithFormsDesigner
+2  A: 

_ is not a hyphen; it is an underscore. A hyphen is -

If it is okay to start an email address with an underscore, add _ to both of the character classes that appear before @

^[-a-zA-Z0-9_][-.a-zA-Z0-9_]*@...

If the email id cannot start with an _, add it only to the second character class:

^[-a-zA-Z0-9][-.a-zA-Z0-9_]*@...
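
A quick way to see the difference between the two variants is to try both against sample addresses. This is only a sketch in Python (an assumption, since the question does not say which engine is in use); it checks the local part up to the @ only, and the addresses and constant names are invented for the example.

```python
import re

# Only the local part (up to '@') is checked here; the domain part of the
# question's regex is left out to keep the sketch short.
ALLOW_LEADING_UNDERSCORE = re.compile(r"^[-a-zA-Z0-9_][-.a-zA-Z0-9_]*@")
NO_LEADING_UNDERSCORE = re.compile(r"^[-a-zA-Z0-9][-.a-zA-Z0-9_]*@")

for address in ("first_last@example.com", "_first@example.com"):
    print(
        address,
        bool(ALLOW_LEADING_UNDERSCORE.match(address)),
        bool(NO_LEADING_UNDERSCORE.match(address)),
    )
```

Both variants accept an underscore in the middle of the local part; only the first accepts one in the first position.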

That said, your regex has a couple of issues:

  1. It accepts email addresses starting with a hyphen; is this intended? If not, remove the - from the first character class to make it [a-zA-Z0-9]
  2. It accepts consecutive periods after the first character, thereby making [email protected] a valid id. Is this by design?
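
Both points can be reproduced with a few made-up addresses. The sketch below is in Python (an assumption about the engine). `LOOSE` is the question's regex with _ added to the local part. `STRICT` is a hypothetical tighter local part written for this example, not something taken from the question. Whether such tightening is wanted depends on the answers to the two questions above.

```python
import re

# Domain part copied from the question's regex.
DOMAIN = (
    r"[-.a-zA-Z0-9]+(\.[-.a-zA-Z0-9]+)*\."
    r"(com|edu|info|gov|int|mil|net|org|biz|name|museum|coop|aero|pro|[a-zA-Z]{2})$"
)

# Local part from the question with '_' added.
LOOSE = re.compile(r"^[-a-zA-Z0-9_][-.a-zA-Z0-9_]*@" + DOMAIN)

# Hypothetical stricter local part: must start with a letter or digit, and
# every '.' must be followed by at least one non-'.' character.
STRICT = re.compile(r"^[a-zA-Z0-9][-_a-zA-Z0-9]*(?:\.[-_a-zA-Z0-9]+)*@" + DOMAIN)

for address in ("-abc@example.com", "a..b@example.com", "first.last@example.com"):
    print(address, bool(LOOSE.match(address)), bool(STRICT.match(address)))
```

Note that `STRICT` also rejects a period directly before the @, and that the domain part is left unchanged in both patterns.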

The RFC specification for email addresses is quite complicated. See these threads for more information. Also, don't forget to check the one and only perfect and official regex for validating email addresses (be warned that you might find it a little longer than sanity would suggest).

Amarghosh
There is no perfect regex because emails can't be described with a regular grammar - they need a context-sensitive grammar.
Lothar
You could also try using http://code.iamcal.com/php/rfc822/full_regexp.txt to validate email addresses.
FrustratedWithFormsDesigner
+1  A: 

Regular-expressions.info has a very good discussion of e-mail address validation by regex, including the author's preferred regex for "99% of all e-mail addresses in use today", and another to match e-mail addresses as defined by RFC 2822.

I won't do the author a disservice by copying his work here. But I do think it's worth a read since it's directly related to your question.

JMD
A: 

There is also an interesting blog post about email validation on Larry Osterman's website.

This is a followup post to the original post in which he attempts to generate a regular expression to validate an email address. His RegExp is:

string strRegex = @"^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}" +
                  @"\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))" +
                  @"([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$";

His notes:

The key thing to note in this grammar is that the local part is almost free-form. There are characters like !, *, $, etc. that are totally legal in the local part according to RFC 2822 but that this regex does not allow.
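
The note about the local part can be checked directly. The sketch below is a Python port of the C# pattern quoted above, assuming the pattern carries over to Python's `re` syntax unchanged; the sample addresses are invented for the example.

```python
import re

# Python port of the C# pattern quoted above (string pieces joined).
PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}"
    r"\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))"
    r"([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$"
)

samples = (
    "first_last@example.com",  # underscore in the local part
    "user@[192.168.0.1]",      # bracketed IP address as the domain
    "user!name@example.com",   # '!' in the local part
)

for address in samples:
    print(address, bool(PATTERN.match(address)))
```

The first two samples match; the third does not, because ! is missing from the local-part character class.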

and ...

Adi Oltean pointed out that V2 of the .NET Framework contains the System.Net.Mail.MailAddress class, which contains a built-in validator.

It looks like the System.Net.Mail.MailAddress constructor validates email addresses and you can catch a FormatException to ensure that the email is valid.

Brandan