I am coding a site in PHP and am currently working on the contact-us page. What is the best way to validate an email address?

  1. By sending a validation link to their email?
  2. Regex
  3. Any other method?

Could you also tell me why, and point me to a guide for achieving it? I don't want someone to write the code for me, because that's no fun and I won't learn; I'm just looking for guidance on the techniques behind the methods above.

I am also going to use these methods to implement a subscribe button on my webpage. Is this the best way to do it? Are there any other methods I should consider?

+15  A: 

I usually go through these steps

  1. Regex
  2. Send an activation code to the email

If the first step fails, it never reaches the second step. If sending the email fails because the address doesn't exist, I delete the account or handle it some other way.

--edit

3 - If for some reason the activation email cannot be sent, the address is not deleted right away. It stays unapproved for 7 days (or whatever period you configure), and sending is retried every 2-3 hours. If it still has not succeeded after that period, the address is deleted.

4 - If the email was sent successfully but the account was never activated, it stays unapproved, but it can be activated at any time by generating a new activation code.
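
The activation-code step could be sketched roughly as follows. This is a minimal illustration, not the answerer's actual code: the function names `createActivationToken` and `isActivationTokenValid` are hypothetical, and storing only a hash of the code (rather than the code itself) is an assumption made here, not something stated above.

```php
<?php
// Hypothetical helpers; the names and the "store a hash" choice are assumptions.

/**
 * Create a random activation token.
 * Returns [token to email to the user, hash to store alongside the account].
 */
function createActivationToken(): array
{
    $token = bin2hex(random_bytes(16)); // 32 hex characters
    $hash  = hash('sha256', $token);    // store this, not the raw token
    return [$token, $hash];
}

/**
 * Check a token submitted through the activation link against the stored hash.
 */
function isActivationTokenValid(string $submittedToken, string $storedHash): bool
{
    // hash_equals() compares the two strings in constant time.
    return hash_equals($storedHash, hash('sha256', $submittedToken));
}

list($token, $hash) = createActivationToken();
var_dump(isActivationTokenValid($token, $hash));       // bool(true)
var_dump(isActivationTokenValid('wrong-code', $hash)); // bool(false)
```

Generating a new activation code, as in step 4, would then just mean calling `createActivationToken()` again and replacing the stored hash.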

Flakron Bytyqi
I would also like to add that, for user convenience, assign the non-validated user some sort of 'pre-approved' status so that when mail delivery is slow the user can still make use of your services. And also, don't be too strict in your regex, just make sure that it 'kinda sorta looks like an email address', you don't want too many false negatives.
Dennis Haarbrink
@Dennis: I don't think that entering a plausible but potentially false email address is sufficient reason to give the user any more access.
Steven Sudit
Please don't use a regex to validate an email address. It isn't possible to write a regex that matches the spec from RFC 2822 exactly, so you will end up with both false positives and false negatives. False negatives are a big problem because they stop valid email addresses from getting through. Jeffrey Friedl developed an email-matching regex in "Mastering Regular Expressions". It was something like 7000 characters long, and matched 98% of valid address formats. It's better to just use a library that uses this.
bluesmoon
+3  A: 

That depends on whether or not the user actually wants to receive a response.

If the user asks a question, he'll want a response and probably give his valid e-mail address. In this case, I'd use a very loose regex check to catch typos or a missing address. (Something like .+@.+.)

If the user does not want to be contacted, but you want to know their address, you'll need to work with a validation link. There is no other way to ensure that the e-mail address is valid and belongs to the user.
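
A very loose check of the kind described above might look like this. It is only a sketch: the helper name `looksLikeEmail` is hypothetical, and the exact pattern (something, an @, something, no whitespace) is an assumption rather than the pattern the answer had in mind.

```php
<?php
// Hypothetical helper: a deliberately loose "looks like an email" check.
// It only catches obvious typos and missing addresses; it does not prove validity.
function looksLikeEmail(string $email): bool
{
    // At least one non-space character on each side of an @.
    return preg_match('/^\S+@\S+$/', trim($email)) === 1;
}

var_dump(looksLikeEmail('user@example.com')); // bool(true)
var_dump(looksLikeEmail('user.example.com')); // bool(false)
var_dump(looksLikeEmail(''));                 // bool(false)
```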

Jens
+11  A: 

I think the best is a combination of 3. and 1.

In an initial phase, you check the e-mail address syntactically (to catch typos):

filter_var($email, FILTER_VALIDATE_EMAIL)

In a second phase, you send an e-mail containing a confirmation link (to catch both errors and deliberately wrong information).
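
One detail worth knowing when using the first phase: `filter_var()` returns the filtered string on success and `false` on failure, so it is safest to compare against `false` explicitly. A small sketch, where the wrapper name `isSyntacticallyValid` is hypothetical:

```php
<?php
// Hypothetical wrapper around filter_var() that always returns a boolean.
function isSyntacticallyValid(string $email): bool
{
    // filter_var() yields the address itself when valid, false otherwise.
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

var_dump(isSyntacticallyValid('user@example.com'));  // bool(true)
var_dump(isSyntacticallyValid('user@@example.com')); // bool(false)
var_dump(isSyntacticallyValid('no-at-sign'));        // bool(false)
```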

Artefacto
+1 People need to know about `filter_var()`, it is so awesome.
BoltClock
A: 
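 function checkEmail($email) {
   // Strict pattern: only letters, digits, dot, underscore and hyphen are accepted.
   if (preg_match('/^([a-zA-Z0-9])+([a-zA-Z0-9\._-])*@([a-zA-Z0-9_-])+([a-zA-Z0-9\._-]+)+$/',
                $email)) {
     // split() was removed in PHP 7; explode() does the same job here.
     list($username, $domain) = explode('@', $email);
     // Reject domains that publish no MX record.
     if (!checkdnsrr($domain, 'MX')) {
       return false;
     }
     return true;
   }
   return false;
 }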
Your regex rejects many valid email addresses. For example *@example.com, "Hello world"@example.com, someone@[127.0.0.1], someone@[2001:1234:1234::1.2.3.4]
John Burton
While I'd let you off with the ipv6 mismatch, the rest of the regex is just too poor to consider.
symcbean
+3  A: 

The only way to really know whether an email address is valid is to send an email to it. If you really have to, use one of these. Technically, there don't even have to be any periods after the @ for local domains. All that's necessary is that a domain follows the @.

Leafy
Wait until people start putting IPV6 colon-delimited values as their domain. :-)
Steven Sudit
+1  A: 

Depends upon your objective. If you must have a valid and active email, then you must send an email that requires verification of receipt. In this case, there is no need for regex validations except as a convenience to your user.

But if your desire is to help the user avoid typos while minimizing user annoyance, validate with regex.

kingjeffrey
+2  A: 

A regex is not really suitable for determining the validity of email address syntax, and the FILTER_VALIDATE_EMAIL option for the filter_var function is rather unreliable too. I use the EmailAddressValidator Class to test email address syntax.

I have put together a few examples of incorrect results returned by filter_var (PHP Version 5.3.2-1ubuntu4.2). There are probably more. Some are admittedly a little extreme, but still worth noting:

RFC 1035 2.3.1. Preferred name syntax
http://tools.ietf.org/search/rfc1035
Summarised as: a domain consists of labels separated by dot separators (not necessarily true for local domains though).

echo filter_var('name@example', FILTER_VALIDATE_EMAIL);
// name@example

RFC 1035 2.3.1. Preferred name syntax
The labels must follow the rules for ARPANET host names. They must start with a letter, end with a letter or digit, and have as interior characters only letters, digits, and hyphen.

echo filter_var('name@1example', FILTER_VALIDATE_EMAIL);
// name@1example

RFC 2822 3.2.5. Quoted strings
http://tools.ietf.org/html/rfc2822#section-3.2.5
This is valid (although it is rejected by many mail servers):

echo filter_var('name"quoted"string@example', FILTER_VALIDATE_EMAIL);
// FALSE

RFC 5321 4.5.3.1.1. Local-part
http://tools.ietf.org/html/rfc5321#section-4.5.3.1.1
The maximum total length of a user name or other local-part is 64 octets.
Test with 70 characters:

echo filter_var('AbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghij@example.com', FILTER_VALIDATE_EMAIL);
// AbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghij@example.com

RFC 5321 4.5.3.1.2. Domain
http://tools.ietf.org/html/rfc5321#section-4.5.3.1.2
The maximum total length of a domain name or number is 255 octets.
Test with 260 characters:

echo filter_var('name@AbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghij.com', FILTER_VALIDATE_EMAIL);
// name@AbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghijAbcdefghij.com

Have a look at Validate an E-Mail Address with PHP, the Right Way for more information.
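
If the RFC 5321 length limits quoted above matter for your application, they can be checked separately from whatever `filter_var()` does in a given PHP version. A minimal sketch, assuming a hypothetical helper named `withinRfc5321Lengths`; it checks lengths only and says nothing about the rest of the syntax:

```php
<?php
// Hypothetical helper: enforce the 64-octet local-part and 255-octet domain limits.
function withinRfc5321Lengths(string $email): bool
{
    $at = strrpos($email, '@'); // the domain starts after the last @
    if ($at === false) {
        return false;
    }
    $local  = substr($email, 0, $at);
    // Cast to string because substr() can return false on older PHP versions.
    $domain = (string) substr($email, $at + 1);

    // strlen() counts bytes, which is what the RFC means by octets.
    return $local !== '' && $domain !== ''
        && strlen($local) <= 64
        && strlen($domain) <= 255;
}

var_dump(withinRfc5321Lengths(str_repeat('a', 64) . '@example.com')); // bool(true)
var_dump(withinRfc5321Lengths(str_repeat('a', 70) . '@example.com')); // bool(false)
```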

Mike
+2  A: 

Before sending off a validation email you could also use checkdnsrr() to verify that the domain exists and does have MX records set up. This will detect addresses that use bogus domains.

function validateEmail($email, $field, $msg = '')
{
    if (!filter_var($email, FILTER_VALIDATE_EMAIL))
    {
        return false;
    }
    list($user, $domain) = explode('@', $email);
    if (function_exists('checkdnsrr') && !checkdnsrr($domain, 'MX'))
    {
        return false;
    }
    return true;
}

We need to use function_exists() to verify checkdnsrr() is available to us because it was not available on Windows before PHP 5.3.
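
Because mail servers fall back to a domain's address records when it publishes no MX record, a softer variant of this check could accept a domain that has any of MX, A or AAAA. The sketch below is an assumption-laden illustration, not part of the function above: the name `domainMayAcceptMail` is hypothetical, and the lookup is passed in as a callable purely so the logic can be tried without network access.

```php
<?php
/**
 * Hypothetical helper: does the domain publish any record mail could be delivered to?
 * $lookup defaults to PHP's checkdnsrr(); a stub can be injected for testing.
 */
function domainMayAcceptMail(string $domain, ?callable $lookup = null): bool
{
    $lookup = $lookup ?? 'checkdnsrr';
    // Try MX first, then fall back to the address records.
    foreach (['MX', 'A', 'AAAA'] as $type) {
        if ($lookup($domain, $type)) {
            return true;
        }
    }
    return false;
}

// Stubbed lookup: this pretend domain has no MX record but does have an A record.
$stub = function (string $domain, string $type): bool {
    return $type === 'A';
};
var_dump(domainMayAcceptMail('example.test', $stub)); // bool(true)
```

Even this softer check only shows that the domain resolves; it says nothing about whether the mailbox exists.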

John Conde
Why do you think that MX records are required for email to be delivered? "If no MX records were present, the server falls back to A, that is to say, it makes a request for the A record of the same domain."
sanmai
@sanmai While that may be true in theory you rarely, if ever, see that happen in practice. Plus when it comes to validating email address, with the exception of sending an email to the address and awaiting a response, no automated process is going to be perfect. This method included. But if bad email addresses being provided is a problem this will help to mitigate that.
John Conde
@john-conde I've seen this happen at least a couple of times. Imagine explaining to a manager why your valuable client can't register using his working (they checked) email address.
sanmai
@John: I've seen the same thing sanmai has. It's not common, but it's real. Ultimately, checking for an MX record doesn't buy you much, anyhow, so I wouldn't bother.
Steven Sudit
Hmmm. Perhaps then this would best be used as one of several factors in determining an email address's probability of being legit? Used only in conjunction with other tests.
John Conde
+3  A: 

The best way to do it is to send an email with a validation link in it. If you don't want activation emails, at the very least validate the email address. The best email validation function is the RFC-compliant email address validator by Dominic Sayers.

Simply include the PHP file in your project and use it like this:

if (is_email($email, $checkDNS, $diagnose)) //$checkDNS and $diagnose are false by default
    echo 'Email valid';
else
    echo 'Email invalid';
  • If $checkDNS is set to true, it will also check that the domain exists. If the domain doesn't exist, the function returns false even if the address is syntactically valid.
  • If $diagnose is set to true, the function returns a code instead of a boolean; the code tells you why the address is invalid (or is 0 if it is valid).
AlexV