Apologizing in advance for yet another email pattern matching query. Here is what I have so far:

$text = strtolower($intext);
$lines = preg_split("/[\s]*[\n][\s]*/", $text);
$pattern = '/[A-Za-z0-9_-]+@[A-Za-z0-9_-]+\.([A-Za-z0-9_-][A-Za-z0-9_]+)/';
$pattern1 = '/^[^@]+@[a-zA-Z0-9._-]+\.[a-zA-Z]+$/';
foreach ($lines as $email) {
    preg_match($pattern, $email, $goodies);
    $goodies[0] = filter_var($goodies[0], FILTER_SANITIZE_EMAIL);
    if (filter_var($goodies[0], FILTER_VALIDATE_EMAIL)) {
        array_push($good, $goodies[0]);
    }
}

$pattern works fine, but .rr.com addresses (and, I am sure, others) are stripped of the .com.

$pattern1 only grabs emails that are on a line by themselves.

I am pasting in a whole page of miscellaneous text into a textarea that contains some emails from an old data file I am trying to recover.

Everything works great except for the emails with more than one "." either before or after the "@".

I am sure there must be more issues as well.

I have tried several patterns I have found, as well as some I tried to write.

Can someone show me the light here before I pull my remaining hair out?

A: 

How about this?

/((?:\w+[.]*)*(?:\+[^@ \t]*)?@(?:\w+[.])+\w+)/

Explanation: (?:\w+[.]*)* matches zero or more runs of word characters (alphanumerics plus _), each optionally followed by one or more periods. Next, (?:\+[^@ \t]*)? optionally matches a plus sign followed by zero or more characters that are not an at sign, a space, or a tab. Then comes the @ sign, and finally (?:\w+[.])+\w+, which matches one or more word-character strings, each followed by a period, and ending in a word-character string (i.e., [subdomain.]domain.topleveldomain).
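For pulling every address out of a pasted block of text, something along these lines might be a starting point. It is only a sketch: the function name extract_emails is made up for illustration, and the pattern is a simplified variant (not the exact one above) that allows dots in the local part and any number of dot-separated domain labels. It scans the whole text with preg_match_all instead of one preg_match per line, and starts from an empty array before collecting results.

```php
<?php
// Hypothetical helper (name is an assumption, not from the original post).
// Returns every substring of $text that looks like an email address and
// also passes PHP's own FILTER_VALIDATE_EMAIL check.
function extract_emails(string $text): array
{
    // Simplified illustrative pattern:
    //   local part: word characters, dots, plus signs, hyphens
    //   domain:     one or more "label." groups, then a TLD of 2+ letters
    $pattern = '/[\w.+-]+@(?:[\w-]+\.)+[a-z]{2,}/i';

    $good = [];  // initialize before collecting results

    // preg_match_all finds every match in the text, not just the first.
    if (preg_match_all($pattern, $text, $matches)) {
        foreach ($matches[0] as $candidate) {
            // filter_var returns the address on success, false on failure.
            if (filter_var($candidate, FILTER_VALIDATE_EMAIL) !== false) {
                $good[] = $candidate;
            }
        }
    }

    return $good;
}

$sample = "Contact: jim.bo@mail.rr.com, or a_b@example.org.\nnothing here";
print_r(extract_emails($sample));
```

If $good is not set to an array before the loop in the question's code, array_push cannot append to it, which would be one possible reason for ending up with nothing. It may also help to check the return value of preg_match before reading $goodies[0], since lines without a match leave that index unset.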

Aidan Brumsickle
Nope, returns nothing in $good. Thanks though.
Jim_Bo