I have a large number of email addresses to validate. Initially I parse them with a regexp to throw out the completely crazy ones. I'm left with the ones that look sensible but still might contain errors.

I want to find which addresses have valid domains, so given an address at abcxyz.com, I want to know whether that domain can receive email at all.

I want to check whether the domain has a valid A or MX record. Is there an easy way to do this using only the Python standard library? I'd rather not add a dependency to my project just to support this feature.

+8  A: 

There is no general DNS interface in the standard library (the socket module can resolve a hostname to an address, but it cannot query MX records), so you will either have to roll your own or use a third-party library.

This is not a fast-changing concept though, so the external libraries are stable and well tested.
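For the address-record half of the question, the standard library does get you partway: socket.getaddrinfo() will tell you whether a name resolves at all, though it cannot see MX records. A minimal sketch, assuming a hypothetical helper name domain_resolves and an injectable resolver argument (my addition, so the function can be exercised without network access):

```python
import socket


def domain_resolves(domain, resolver=socket.getaddrinfo):
    """Return True if `domain` resolves to at least one address.

    This only covers address (A/AAAA) lookups; it says nothing about MX.
    `resolver` is a hypothetical hook that defaults to the real lookup.
    """
    try:
        resolver(domain, None)
    except (socket.gaierror, UnicodeError):
        # gaierror: the name did not resolve.
        # UnicodeError: the name could not be encoded as a valid hostname.
        return False
    return True
```

A domain can legitimately have an MX record without an address record of its own, so a False result here does not prove the domain cannot receive mail.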

The one I've used successfully for the same task as in your question is PyDNS.

A very rough sketch of my code is something like this:

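import socket
import smtplib

import DNS  # PyDNS, a third-party package

hostname = 'abcxyz.com'  # the domain part of the address being checked

DNS.DiscoverNameServers()
# mxlookup returns a list of MX entries; mx[1] below is the host name
mx_hosts = DNS.mxlookup(hostname)

# Just doing the mxlookup might be enough for you,
# but do something like this to test for an SMTP server
for mx in mx_hosts:
    smtp = smtplib.SMTP()
    # If connect() doesn't raise an exception, this is a reachable MX host
    try:
        smtp.connect(mx[1])
    except (smtplib.SMTPConnectError, socket.error):
        continue  # try the next MX server in the list
    smtp.quit()
    break  # one reachable MX host is enough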
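If you want the SMTP probe as a reusable function, something along these lines might work. It is only a sketch: the name first_reachable_mx, the probe hook, and the 10-second timeout are my own assumptions, and it expects the MX entries as (preference, host) tuples.

```python
import smtplib


def first_reachable_mx(mx_hosts, probe=None, timeout=10):
    """Return the first MX host that accepts an SMTP connection, else None.

    mx_hosts: iterable of (preference, host) tuples.
    probe: hypothetical hook taking a host name; it should raise on failure.
    """
    def smtp_probe(host):
        # Passing a host to SMTP() connects immediately; the timeout keeps
        # an unresponsive server from stalling the whole batch.
        with smtplib.SMTP(host, 25, timeout=timeout) as smtp:
            smtp.noop()

    if probe is None:
        probe = smtp_probe

    # Lower preference values are tried first.
    for _preference, host in sorted(mx_hosts):
        try:
            probe(host)
        except (smtplib.SMTPException, OSError):
            continue  # try the next MX host
        return host
    return None
```

Bear in mind that many networks block outbound connections on port 25, so a failed probe is weaker evidence than a failed MX lookup.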

Another library that might be better or faster than PyDNS is dnsmodule, although it looks like it hasn't had any activity since 2002, compared to PyDNS's last update in August 2008.

Edit: I would also like to point out that email addresses can't be easily parsed with a regexp. You are better off using the parseaddr() function in the standard library email.utils module (see my answer to this question for example).
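As a rough illustration of the parseaddr() approach, here is a small sketch that pulls the domain out of an address before any DNS check. The helper name domain_of is hypothetical, and lower-casing the domain is my own choice:

```python
from email.utils import parseaddr


def domain_of(raw_address):
    """Return the lower-cased domain of an email address, or None."""
    # parseaddr splits 'Display Name <user@host>' into (name, address).
    _name, address = parseaddr(raw_address)
    if '@' not in address:
        return None
    # Take everything after the last '@' as the domain part.
    domain = address.rpartition('@')[2]
    return domain.lower() or None
```

The returned domain is what you would then pass to the MX lookup.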

Van Gale