I need to validate my users' email addresses. Unfortunately, writing a validator that conforms to the standards is hard.

Here is an example of a regex that tries to conform to the standard.

Is there a PHP library (preferably open-source) that validates email addresses?

+4  A: 

I found a library on Google Code: http://code.google.com/p/php-email-address-validation/

Are there any others?

MrValdez
I use its predecessor in several projects and haven't had any issues so far. I say go for it.
_Lasar
+5  A: 

Have you looked at PHP's filter_ functions? They're not perfect, but they do a fairly decent job in my experience.

Example usage (returns the validated address on success, or false on failure):

filter_var($someEmail, FILTER_VALIDATE_EMAIL);
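If a strict true/false result is wanted, the call above can be wrapped in a small helper. This is only a minimal sketch: the function name is_plausible_email is made up for illustration, and it relies on filter_var() returning the filtered string on success and false on failure.

```php
<?php
// Hypothetical helper name, not part of PHP itself.
function is_plausible_email(string $email): bool
{
    // filter_var() returns the filtered string on success and false on
    // failure, so compare strictly against false to get a real boolean.
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

var_dump(is_plausible_email('user@example.com'));
var_dump(is_plausible_email('not-an-email'));
```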

mrhahn
I'm coding up a simple (non-Enterprise) app that needs a basic validator. While FILTER_VALIDATE_EMAIL doesn't seem to fully implement the spec -- for example, it fails some of the tests at http://code.iamcal.com/php/rfc822/rfc822.phps -- it seems to be "good enough" for my current needs. Thanks!
Jon Schneider
+8  A: 

AFAIK, the only good way to validate an e-mail address is to send an e-mail to it and see if the user comes back to the site using a link in that e-mail. That's what a lot of sites do.

As you point out with the link to the well-known mammoth regex, validating all forms of e-mail address is hard, close to impossible. It is so easy to get it wrong, even for trivial addresses (I have found too many sites rejecting caps in e-mail addresses! And most old regexes reject TLDs of more than 4 letters!).

AFAIK, "Jean-Luc B. O'Grady"@example.com and e=m.c^2@[82.128.45.117] are both valid addresses... While [email protected] is likely to be invalid.

So I would just check that we have something, a single @, and something else, and go with it: that catches most user errors (like an empty field or a user name instead of an e-mail address).
If a user wants to give a fake address, they will just enter something random that looks correct ([email protected] or [email protected]). And no validator will catch typos ([email protected] instead of [email protected]).
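A minimal sketch of that "something, a single @, something else" check could look like the following. The function name looks_like_email is hypothetical, and the check is deliberately loose: it will also reject the rare valid address whose quoted local part contains an @.

```php
<?php
// Hypothetical helper: accepts anything shaped like "something@something".
function looks_like_email(string $input): bool
{
    $input = trim($input);

    // Require exactly one @ in the whole string.
    if (substr_count($input, '@') !== 1) {
        return false;
    }

    // Both sides of the @ must be non-empty.
    [$local, $domain] = explode('@', $input);
    return $local !== '' && $domain !== '';
}
```

This catches an empty field or a bare user name, and leaves the real verification to the confirmation e-mail.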

If one really wants to validate e-mails against the full RFC, I would advise using regexes to split around the @, then checking the local name and the domain name separately. Handle a local name starting with " separately from the other cases, and likewise a domain name starting with [. Split the problem into smaller, specific domains, and use regexes only on well-defined, simpler cases.
This advice can be applied to a lot of regex uses, of course...
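As a rough sketch of that divide-and-conquer idea: split on the last @, then dispatch on the first character of each part. All function names here are made up for illustration, and the patterns are simplified assumptions rather than a full implementation of the RFC grammar (no comments, no folding whitespace, IPv4 literals only).

```php
<?php
// Split on the LAST @, since a quoted local part may itself contain one.
function split_address(string $address): ?array
{
    $at = strrpos($address, '@');
    if ($at === false || $at === 0 || $at === strlen($address) - 1) {
        return null; // no @, or an empty local or domain part
    }
    return [substr($address, 0, $at), substr($address, $at + 1)];
}

function check_local(string $local): bool
{
    if ($local === '') {
        return false;
    }
    if ($local[0] === '"') {
        // Quoted case: any characters between the quotes, with
        // backslash-escaped pairs allowed.
        return preg_match('/^"(?:[^"\\\\]|\\\\.)*"$/D', $local) === 1;
    }
    // Unquoted case: dot-separated atoms, no leading, trailing or double dots.
    $atom = '[A-Za-z0-9!#$%&\'*+\/=?^_`{|}~-]+';
    return preg_match('/^' . $atom . '(?:\.' . $atom . ')*$/D', $local) === 1;
}

function check_domain(string $domain): bool
{
    if ($domain === '') {
        return false;
    }
    if ($domain[0] === '[') {
        // Address literal case: this sketch only handles IPv4.
        if (substr($domain, -1) !== ']') {
            return false;
        }
        $ip = substr($domain, 1, -1);
        return filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) !== false;
    }
    // Host name case: dot-separated labels of letters, digits and hyphens.
    $label = '[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?';
    return preg_match('/^' . $label . '(?:\.' . $label . ')*$/D', $domain) === 1;
}

function is_valid_address(string $address): bool
{
    $parts = split_address($address);
    if ($parts === null) {
        return false;
    }
    [$local, $domain] = $parts;
    return check_local($local) && check_domain($domain);
}
```

Each regex only has to handle one narrow case, which makes it much easier to read and to test than a single monolithic pattern.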

PhiLho
That is a valid answer and what I would normally do. But for this particular client, I needed a way to check if an email is valid. The alternative to email validation is asking the user to input their email twice, which I strongly disapprove of. Hopefully, I can convince them not to do this.
MrValdez
What's wrong with double-entry? It's a quick, cheap, and easy way to ensure that the user didn't fat-finger their email address while typing it in. If you *really* need to verify that an email address has a valid format, Dominic's answer (and linked site) seems to contain the most comprehensive information I've seen yet.
afrazier
@afrazier: "What's wrong with double-entry?" Well, if people do as I do, they just copy/paste the first entry into the second one, so the benefit is nil...
PhiLho
+1  A: 

Zend_Validate includes an email validator.

There are plenty of regular expressions around for validating - everything from very basic to very advanced. You really should pick something that matches the importance of a valid email in your application.

Rexxars
A: 

I'd recommend looking at the source code of Zend_Validate_EmailAddress [source].

Once you have your dependencies fixed, you can simply do the following:

$mail_validator = new Zend_Validate_EmailAddress();
$mail_validator->isValid($address);   // returns true or false

The best approach would be to pull the full Zend library into your project via an svn external and point the include path to it...

But you can also just download the necessary files (1,2,3,4,5,6) and include them all (remove the require_once calls).

Pierre Spring
+8  A: 

Cal Henderson (of Flickr) wrote an RFC822 compliant email address matcher, with an explanation of the RFC and code utilizing the RFC to match email addresses. I've been using it for quite some time now with no complaints.

RFC822 (published in 1982) defines, amongst other things, the format for internet text message (email) addresses. You can find the RFCs by googling - there are many, many copies of them online. They're a little terse and weirdly formatted, but with a little effort we can see what they're getting at.

... Update ...

As Porges pointed out in the comments, the library on the link is outdated, but that page has a link to an updated version.

enobrev
Nice, thanks for the link.
Philip Morton
Worth noting that RFC822 is ancient, as the quote identifies. In fact, the RFC that obsoletes 822 (2822) is *also* obsolete, which shows you how out-of-date it is :) The current RFC for email addresses is 5322, published the month of this answer!
Porges
+7  A: 

I've now collated test cases from Cal Henderson, Dave Child, Phil Haack, Doug Lovell and RFC 3696. 158 test addresses in all.

I ran all these tests against all the validators I could find. The comparison is here: http://www.dominicsayers.com/isemail

I'll try to keep this page up-to-date as people enhance their validators. Thanks to Cal, Dave and Phil for their help and co-operation in compiling these tests and constructive criticism of my own validator.

People should be aware of the errata against RFC 3696 in particular. Three of the canonical examples are in fact invalid addresses. And the maximum length of an address is 254 or 256 characters, not 320.
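The length limit is cheap to check before any pattern matching. A minimal sketch, assuming the stricter of the two figures above (254) and a made-up function name:

```php
<?php
// Assumption: use the stricter of the two limits mentioned above.
const MAX_ADDRESS_LENGTH = 254;

// Hypothetical helper; the limit can be overridden if 256 is preferred.
function within_length_limit(string $address, int $limit = MAX_ADDRESS_LENGTH): bool
{
    // strlen() counts bytes, which equals characters for ASCII addresses.
    return strlen($address) <= $limit;
}
```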

Dominic Sayers
Thanks for doing this; it's awesome to have real data to work off of, instead of just speculation. Can you also include the other libraries mentioned on this page?
Anirvan