Hi, I need to remove all email addresses and links from a string and replace them with "[removed]", and I'm a bit lost on how to do it.
Can someone help me on this?
Thanks.

+1  A: 

You can use preg_replace to do it.

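For emails:

$pattern = "/[^@\s]*@[^@\s]*\.[^@\s]*/";
$replacement = "[removed]";
// preg_replace returns the modified string, so assign the result
$string = preg_replace($pattern, $replacement, $string);

For URLs:

$pattern = "/[a-zA-Z]*[:\/\/]*[A-Za-z0-9\-_]+\.+[A-Za-z0-9\.\/%&=\?\-_]+/i";
$replacement = "[removed]";
// preg_replace returns the modified string, so assign the result
$string = preg_replace($pattern, $replacement, $string);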
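
To see the two replacements working together, here is a minimal sketch. The helper name strip_contacts and the sample string are made up for illustration; the patterns are the ones from this answer. The email pattern runs first because the URL pattern would otherwise consume the domain part of an address and leave the local part behind.

```php
<?php
// Hypothetical helper name; both patterns are copied from the answer above.
function strip_contacts(string $text): string
{
    $replacement = '[removed]';
    $emailPattern = '/[^@\s]*@[^@\s]*\.[^@\s]*/';
    $urlPattern = '/[a-zA-Z]*[:\/\/]*[A-Za-z0-9\-_]+\.+[A-Za-z0-9\.\/%&=\?\-_]+/i';

    // Emails first: the URL pattern would otherwise match "example.com"
    // inside "joe@example.com" and leave "joe@" in the text.
    $text = preg_replace($emailPattern, $replacement, $text);

    return preg_replace($urlPattern, $replacement, $text);
}

// Made-up sample input.
echo strip_contacts('Mail me at joe@example.com or visit http://example.org/page?id=1 today'), PHP_EOL;
```

Keep in mind that the URL pattern is loose: it matches any run shaped like word.word, so ordinary text such as "3.14" would be replaced as well.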

Resources

PHP manual entry: http://php.net/manual/en/function.preg-replace.php

Credit where credit is due: the email regex is taken from the preg_match manual page, and the URL regex from http://www.weberdev.com/get_example-4227.html

Josiah
Hmm, I just tried it and it removes the whole text I had...
JEagle
Can you post a small sample of the text?
Josiah
It was just some random text I had. Nothing specific, just some email addresses and some links.
JEagle
Actually, I found the problem. Try my edited code :)
Josiah
It works. Thanks.
JEagle
Glad to help :)
Josiah
+1  A: 

Try this:

$patterns = array('<[\w.]+@[\w.]+>', '<\w{3,6}:(?:(?://)|(?:\\\\))[^\s]+>');
$matches = array('[email removed]', '[link removed]');
$newString = preg_replace($patterns, $matches, $stringToBeMatched);

Note: preg_replace accepts an array of patterns and an array of replacements (here $matches), so you don't need to run it twice.

treeface
Thanks. It worked, but it doesn't work with www.site.com
JEagle
www.site.com isn't a link. You want to remove URLs as well?
Fosco
oops... yes, please
JEagle
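
For the www.site.com case raised in the comments, one possible sketch is to add a third pattern to the array. The first two patterns are the ones from this answer; the third pattern and the sample string are assumptions added for illustration, not something the answerer posted. preg_replace applies the patterns in order, so a full http:// link is already gone before the www. pattern runs.

```php
<?php
// The first two patterns are copied from the answer above; the third
// (bare "www." hosts) is an assumed extension, not part of the original.
$patterns = array(
    '<[\w.]+@[\w.]+>',
    '<\w{3,6}:(?:(?://)|(?:\\\\))[^\s]+>',
    '<\bwww\.[^\s]+>i',
);
$replacements = array('[email removed]', '[link removed]', '[link removed]');

// Made-up sample input.
$stringToBeMatched = 'Write to joe@example.com, see http://example.org/a or www.site.com now';
$newString = preg_replace($patterns, $replacements, $stringToBeMatched);

echo $newString, PHP_EOL;
```

This only catches hosts that start with "www."; a bare name like site.com would still get through and would need a broader (and more error-prone) pattern.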
A: 

The answer I was going to upvote was deleted. It linked to a Linux Journal article, "Validate an E-Mail Address with PHP, the Right Way", which points out what's wrong with almost every email regex anyone proposes.

The range of valid forms of an email address is much broader than most people think.
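
One way to lean less on a hand-written email regex is to find rough candidates and let PHP's built-in filter_var with FILTER_VALIDATE_EMAIL decide. This is only a sketch: the function name remove_emails and the candidate pattern are assumptions for illustration, and filter_var is itself not a complete implementation of the email RFCs, just a better-tested check than most ad hoc patterns.

```php
<?php
// Hypothetical helper: replaces anything that PHP's own validator
// accepts as an email address; everything else is left untouched.
function remove_emails(string $text, string $replacement = '[removed]'): string
{
    return preg_replace_callback(
        '/\S+@\S+/', // rough candidate: non-space characters around an "@"
        function (array $m) use ($replacement): string {
            // Split off trailing punctuation such as "," or "." so it
            // is not validated as part of the address.
            $candidate = rtrim($m[0], '.,;:!?)');
            $tail = substr($m[0], strlen($candidate));

            return filter_var($candidate, FILTER_VALIDATE_EMAIL) !== false
                ? $replacement . $tail
                : $m[0];
        },
        $text
    );
}

// Made-up sample input.
echo remove_emails('Mail joe.bloggs+news@example.com, or ping me @ noon'), PHP_EOL;
```

The trade-off is that anything the validator rejects stays in the text, so this errs on the side of leaving odd-looking strings alone rather than removing too much.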

Stephen P