I'd like a regular expression that can take a block of text and find the strings matching this format:

<a href="mailto:[email protected]">....</a>

For every string that matches this format, it should extract the email address that follows the mailto:. Any thoughts?

This is needed for an internal app and not for any spammer purposes!

+1  A: 

There are plenty of different options on regexp.info

One example would be:

\b[A-Z0-9._%+-]+@(?:[A-Z0-9-]+\.)+[A-Z]{2,4}\b

The "mailto:" is trivial to prepend to that.
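A minimal sketch of that prepending in PHP, assuming the text is held in a variable called $html (a hypothetical name) and using made-up addresses. The literal mailto: goes in front, the address pattern from above is wrapped in a capture group, and the trailing i modifier makes the match case-insensitive:

```php
<?php
// Hypothetical sample input; the addresses are placeholders.
$html = '<a href="mailto:alice@example.com">Alice</a> and <a href="MAILTO:Bob@Example.org">Bob</a>';

// "mailto:" is prepended literally; the address pattern sits in group 1.
// The trailing "i" modifier makes the whole pattern case-insensitive.
$pattern = '/mailto:([A-Z0-9._%+-]+@(?:[A-Z0-9-]+\.)+[A-Z]{2,4})\b/i';

preg_match_all($pattern, $html, $matches);

// $matches[1] holds the contents of group 1 for every match.
$emails = $matches[1];
print_r($emails);
```

Note that the {2,4} limit on the last label means addresses under longer top-level domains would not be matched; widening that range may be worth considering.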

exhuma
This regular expression matches only capital letters, so make sure to use a case-insensitivity flag with it. Alternatively, consider adding the lowercase letters to the character classes.
Asaph
Can you prepend the mailto: and add the case-insensitivity switch Asaph brought up? I'm not familiar with regexp syntax, so I can't fix it myself even if it's trivial.
Click Upvote
+1  A: 
/(mailto:)(.+)(\")/

The second matching group will be the email address.
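As a rough sketch of how this could be called from PHP: preg_match stops at the first match, so preg_match_all is probably the better fit when the text contains several links. One caveat is that .+ is greedy, so with more than one double quote on a line it runs through to the last one. The variant below swaps in [^"]+ so that group 2 stops at the first closing quote. The $html variable and the addresses are made up for illustration:

```php
<?php
// Hypothetical input with two links on one line.
$html = '<a href="mailto:alice@example.com">A</a> <a href="mailto:bob@example.org">B</a>';

// Same three groups as above, but group 2 stops at the first closing quote.
preg_match_all('/(mailto:)([^"]+)(")/', $html, $matches);

// Group 2 of every match.
$emails = $matches[2];
print_r($emails);
```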

Doug Hays
Which function should I use with this, preg_match?
Click Upvote
A: 

You can work with the internal PHP filter http://us3.php.net/manual/en/book.filter.php

(It includes FILTER_VALIDATE_EMAIL for validating an email address and FILTER_SANITIZE_EMAIL for sanitizing one.)
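The filter checks a string that is already isolated; it does not find addresses inside markup, so it would have to be paired with an extraction step. A sketch under that assumption, using a deliberately loose, hypothetical pattern and made-up input:

```php
<?php
// Hypothetical input: one plausible address and one malformed one.
$html = '<a href="mailto:alice@example.com">A</a> <a href="mailto:not-an-email">B</a>';

// Step 1: pull out whatever follows "mailto:" up to the closing quote.
preg_match_all('/mailto:([^"]+)"/', $html, $matches);

// Step 2: keep only the candidates that filter_var() accepts as email addresses.
$valid = [];
foreach ($matches[1] as $candidate) {
    if (filter_var($candidate, FILTER_VALIDATE_EMAIL) !== false) {
        $valid[] = $candidate;
    }
}
print_r($valid);
```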

Greets

How would that allow the email to be extracted out?
Click Upvote
+1  A: 

If you want to match the whole link, from the opening <a to the closing </a>:

$r = '`\<a([^>]+)href\=\"mailto\:([^">]+)\"([^>]*)\>(.*?)\<\/a\>`ism';
preg_match_all($r,$html, $matches, PREG_SET_ORDER);

To make it faster and shorter, match only the opening tag:

$r = '`\<a([^>]+)href\=\"mailto\:([^">]+)\"([^>]*)\>`ism';
preg_match_all($r,$html, $matches, PREG_SET_ORDER);

In both patterns, the second capture group holds the email address.

Example:

$html ='<div><a href="mailto:[email protected]">test</a></div>';

$r = '`\<a([^>]+)href\=\"mailto\:([^">]+)\"([^>]*)\>(.*?)\<\/a\>`ism';
preg_match_all($r,$html, $matches, PREG_SET_ORDER);
var_dump($matches);

Output:

array(1) {
  [0]=>
  array(5) {
    [0]=>
    string(39) "<a href="mailto:[email protected]">test</a>"
    [1]=>
    string(1) " "
    [2]=>
    string(13) "[email protected]"
    [3]=>
    string(0) ""
    [4]=>
    string(4) "test"
  }
}
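
With PREG_SET_ORDER, each entry of $matches describes one link, so the addresses alone can be collected by taking index 2 of every entry. A small sketch using array_column, with made-up input and the shorter pattern:

```php
<?php
// Hypothetical input with two mailto links.
$html = '<div><a href="mailto:alice@example.com">A</a> <a class="x" href="mailto:bob@example.org">B</a></div>';

$r = '`\<a([^>]+)href\=\"mailto\:([^">]+)\"([^>]*)\>`ism';
preg_match_all($r, $html, $matches, PREG_SET_ORDER);

// Each $matches[n] is [full match, group 1, group 2, group 3]; keep group 2.
$emails = array_column($matches, 2);
print_r($emails);
```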
thephpdeveloper