views:

1622

answers:

10

I have written the regex below for a really simple email validation. I plan to send a confirmation link.

/.*@[a-z0-9.-]*/i

I would, however, like to enhance it from the current state because a string like this does not yield the desired result:

test ,[email protected], test

The "test ," portion is undesirably included in the match. I experimented with word boundaries unsuccessfully.

  1. How should I modify it?
  2. Even though I've kept this simple, are there any valid email formats it would exclude?

THANKS!

+12  A: 

Don't use regular expressions to validate e-mail addresses.

alxp
+1 for the link I would have posted (wish I could give +10!)
David Zaslavsky
I think this is a good candidate for a Jeff and Joel podcast rant about what isn't an awesome answer. Sometimes you may just want a heuristic to do something and 98 percent of the time, people with dumb email addresses can go spit.
Peter Turner
"people with weird e-mail addresses can go spit" You're fired.
alxp
Seriously, I'm not sure it was a good idea, but I made a box on a response form where someone could enter a phone number or an email address. If they entered something I thought was an email address (using regex), it would put that address in the Reply-To field in the header. Semi-handy!
Peter Turner
Sometimes I tell websites with stupid regex checks to go spit, by never using them again. I need my Gmail + syntax!
Chase Seibert
A: 

Jeffrey Friedl gives a regex for validating email addresses in his Mastering Regular Expressions book. It's huge, but it works well.

John D. Cook
A: 

Google is your friend.

http://www.google.com/search?hl=en&q=regex+email+validation&btnG=Google+Search&aq=f&oq=

There are multiple examples of different methods within the first 5 results.

JustinT
+9  A: 

It's a lot more complicated! See Mail::RFC822::Address and be scared... very scared.

an0nym0usc0ward
The first time I saw this regex it scared me a lot. I showed it to a friend, and at first he didn't believe it was THE EMAIL REGEX; then he was also horrified. Good memories.
Ioxp
A: 

A smaller two-step regex provides good results:

/**
 * Check to see if email address is in a valid format.
 * Leading character of mailbox must be alpha
 * remaining characters alphanumeric plus -_ and dot
 * domain base must be at least 2 characters
 * domain extension must be at least 2, not more than 4 alpha
 * Subdomains are permitted.
 * @version 050208 added apostrophe as valid char
 * @version 04/25/07 single letter email address and single
 * letter domain names are permitted.
 */
public static boolean isValidEmailAddress(String address) {
    String sRegExp;

    // 050208 using the literal that was actually in place
    // 050719 tweaked
    // 050907 tweaked, for spaces next to @ sign, two letter email left of @ ok
    // 042507 changed to allow single letter email addresses and single letter domain names
    // 080612 added trap and unit test for two adjacent @ signs
    sRegExp = "[a-z0-9#$%&]"             // don't lead with dot
            + "[a-z0-9#$%&'\\.\\-_]*"    // more stuff, dots OK
            + "@[^\\.\\s@]"              // no dots or space or another @ sign next to @ sign
            + "[a-z0-9_\\.\\-_]*"        // may or may not have more characters
            + "\\.[a-z]{2,4}";           // ending with top level domain: .com, .biz, .de, etc.

    boolean bTestOne = java.util.regex.Pattern.compile(sRegExp,
            java.util.regex.Pattern.CASE_INSENSITIVE).matcher(address).matches();

    // should this work ?
    boolean bTwoDots = java.util.regex.Pattern.compile("\\.\\.",  // no adjacent dots
            java.util.regex.Pattern.CASE_INSENSITIVE).matcher(address).find();

    boolean bDotBefore = java.util.regex.Pattern.compile("[\\.\\s]@",  // no dots or spaces before @
            java.util.regex.Pattern.CASE_INSENSITIVE).matcher(address).find();

    return bTestOne && !bTwoDots && !bDotBefore;
}   // end isValidEmailAddress
A: 

Might be worth using a tried and tested regex. The first link suggests regexes for the most common cases:

http://www.regular-expressions.info/email.html

But to find out whether an address is fully RFC 822 compliant, see:

http://instantbadger.blogspot.com/2006/08/regex-to-fully-validate-rfc822-email.html

Kev
A: 

This comes from RegexBuddy (definitely a must-buy program!):

\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b
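
To see how this pattern behaves on the kind of input from the question, here is a minimal Python sketch. The address user@example.com is a made-up placeholder (the original sample is obfuscated), and find_emails is a hypothetical helper name. The pattern only lists A-Z, so the sketch assumes it is meant to run case-insensitively.

```python
import re

# The pattern quoted above. IGNORECASE is an assumption: the character
# classes only list A-Z, so lowercase input needs the flag to match.
EMAIL_IN_TEXT = re.compile(
    r"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b",
    re.IGNORECASE,
)

def find_emails(text):
    """Return every substring of text that the pattern accepts."""
    return EMAIL_IN_TEXT.findall(text)

# Placeholder input modelled on the question's example string.
print(find_emails("test ,user@example.com, test"))
# A long TLD, to show the effect of the {2,4} limit.
print(find_emails("curator@example.museum"))
```

The word boundaries keep the surrounding "test ," out of the first result. The second call finds nothing, because {2,4} rejects any TLD longer than four letters.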
Keng
+4  A: 

Almost nothing you use that is short enough to make sense when you look at it will TRULY validate an email address. With that being said, here is what I typically use:

^\w+([-+.']\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*$

It's actually the built-in regex for ASP.NET's regular expression validator for email addresses.

NOTE: many of the regexes given in this thread MAY have worked in the '90s, but TLDs can be longer than 4 characters in today's web environment. For example, [email protected] IS a valid email address because .museum is one of those new, long TLDs.
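
For anyone who wants to try the pattern outside ASP.NET, here is a rough Python sketch. It assumes Python's \w is close enough to .NET's for this purpose, and is_plausible_email and the sample addresses are hypothetical names chosen for illustration.

```python
import re

# Same pattern as above, compiled with Python's re module.
ASPNET_EMAIL = re.compile(r"^\w+([-+.']\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*$")

def is_plausible_email(candidate):
    """True if the whole string matches the pattern.

    fullmatch is used so that a trailing newline is rejected too
    ($ alone would tolerate one in Python).
    """
    return ASPNET_EMAIL.fullmatch(candidate) is not None

for sample in ("user@example.com", "curator@example.museum",
               "user+tag@example.com", "test ,user@example.com", "user@localhost"):
    print(sample, is_plausible_email(sample))
```

Because the TLD is matched with \w+ rather than a fixed length range, long TLDs such as .museum pass, while input with stray text around the address or no dot in the domain is rejected.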

Rick
+2  A: 

Instead of . try matching every character except \s (whitespace):

/[^\s]*@[a-z0-9.-]*/i
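
A quick Python sketch of what this change does to the example from the question. The address user@example.com is a made-up placeholder, first_match is a hypothetical helper, and the third pattern (which also excludes commas) is my own variation, not part of the suggestion above.

```python
import re

ORIGINAL = re.compile(r".*@[a-z0-9.-]*", re.IGNORECASE)           # regex from the question
NO_WHITESPACE = re.compile(r"[^\s]*@[a-z0-9.-]*", re.IGNORECASE)  # suggestion above
# Assumed variant: additionally treat a comma as a separator.
NO_WHITESPACE_OR_COMMA = re.compile(r"[^\s,]*@[a-z0-9.-]*", re.IGNORECASE)

def first_match(pattern, text):
    """Return the first substring matched by pattern, or None."""
    m = pattern.search(text)
    return m.group(0) if m else None

sample = "test ,user@example.com, test"
for pattern in (ORIGINAL, NO_WHITESPACE, NO_WHITESPACE_OR_COMMA):
    print(repr(first_match(pattern, sample)))
```

Excluding whitespace drops the leading "test " but still keeps the comma that touches the address, so depending on the input you may want to exclude other separators as well.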
Martin Brown
+1  A: 

I found that instead of matching the whole email-address against a regular expression, it is much more practical to just split the string at the @ and:

  • First check for existing MX or A records of the domain part via a DNS-library.
  • Then check the localpart (the part on the left hand side of the @) against a simpler regex.

The reason to do the DNS checking is that email addresses that are unreachable, albeit RFC-compliant, are worth nothing. The reason for additionally checking the A record is that it is used to determine where to deliver mail when no MX record is found (see RFC 2821, section 3.6).
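
The split-then-check idea could be sketched in Python roughly like this. Everything here is an assumption for illustration: the function names, the deliberately loose local-part pattern, and the default domain check, which only uses the standard library to see whether the name resolves at all. A real MX lookup needs a DNS resolver library, which you would plug in through the domain_check parameter.

```python
import re
import socket
from typing import Callable

# Loose local-part check (an assumption, not the RFC grammar):
# letters, digits and the common punctuation, no whitespace or '@'.
LOCAL_PART = re.compile(r"[A-Za-z0-9.!#$%&'*+/=?^_`{|}~-]+")

def resolves_via_address_lookup(domain: str) -> bool:
    """Standard-library fallback: does the domain resolve to any address?

    This does not look at MX records; swap in a resolver-library check for that.
    """
    try:
        socket.getaddrinfo(domain, 25)
        return True
    except (socket.gaierror, UnicodeError):
        return False

def looks_deliverable(
    address: str,
    domain_check: Callable[[str], bool] = resolves_via_address_lookup,
) -> bool:
    """Split at the last '@', test the local part, then ask DNS about the domain."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        return False
    if not LOCAL_PART.fullmatch(local):
        return False
    return domain_check(domain)

# Usage with a stand-in check, so the example runs without network access.
known_domains = {"example.com"}
print(looks_deliverable("user@example.com", lambda d: d in known_domains))
print(looks_deliverable("user@nowhere.invalid", lambda d: d in known_domains))
```

Passing the DNS check in as a callable keeps the regex part testable offline and makes it easy to replace the fallback with a proper MX/A lookup.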

Further tips:

  • Use a robust DNS resolver library, do not roll your own. Test it against large companies. These sometimes have a huge number of mailservers, which can lead to problems. I've seen a buggy library crap out on bmw.com. Just saying. :)
pi