Hi,

How can I look for links in HTML and remove them?

$html = '<p><a href="javascript:doThis(\'Test Title 1\')">Test Title 1</a></p>';
$html .= '<p><a href="javascript:doThis(\'Test Title 2\')">Test Title 2</a></p>';
$html .= '<p><a href="javascript:doThis(\'Test Title 3\')">Test Title 3</a></p>';

$match = '<a href="javascript:doThis(\'Test Title 2\')">';

I want to remove the anchor tags but still display the link text, as shown below.

Test Title 1

Test Title 2

Test Title 3

I've never used regular expressions before, but maybe I can avoid them altogether. Let me know if I'm not being clear.

Thanks

Mark

EDIT: It's not a client-side thing, so I can't use JavaScript for this. I have a custom CMS and want to edit HTML stored in a database.

A: 
Franz
It's not a client-side thing. I have a custom CMS and want to edit HTML stored in a database.
madphp
Whoops, sorry. In that case you should prefer PHP's DOM or XML extensions over RegEx...
Franz
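
A minimal sketch of the DOM route suggested in the comment above, assuming PHP's bundled DOM extension is available and the stored HTML is a fragment rather than a full page. The function name `unwrapAnchors` and the temporary `<div>` wrapper are illustrative choices, not something from the thread. It has only been thought through for ASCII input like the question's; non-ASCII content would need an explicit encoding hint, which is left out here.

```php
<?php
// Sketch: replace every <a> element with its own child nodes.
function unwrapAnchors(string $html): string
{
    libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
    $doc = new DOMDocument();
    // Wrap the fragment in a <div> so its contents can be read back afterwards.
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();

    // Copy the node list into an array first: the list is live and would
    // shift underneath the loop while nodes are being removed.
    foreach (iterator_to_array($doc->getElementsByTagName('a')) as $a) {
        // Move the link's children (text, <b>, ...) in front of the link ...
        while ($a->firstChild !== null) {
            $a->parentNode->insertBefore($a->firstChild, $a);
        }
        // ... then drop the now-empty <a> element.
        $a->parentNode->removeChild($a);
    }

    // Serialize only the children of the wrapper, not the wrapper itself.
    $out = '';
    foreach ($doc->getElementsByTagName('div')->item(0)->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo unwrapAnchors('<p><a href="#top">Test Title 1</a></p>');
```

The serializer may add line breaks between block elements, so the result is best compared as markup rather than byte for byte.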
+1  A: 

You may try the simplest thing:

echo strip_tags($html, '<p>');

This strips all tags except <p>, so the link text is kept but any other markup inside the paragraphs is lost as well.

If you really like regexp:

echo preg_replace('=</?a(\s[^>]*)?>=ims', '', $html);

EDIT:

To delete the <a> tag AND its surrounding tag (the code gets messy and doesn't work with broken (X)HTML):

echo preg_replace('=<([a-z]+)[^>]*>\s*<a(\s[^>]*)?>(.*?)</a>\s*</\\1>=ims', '$3', $html);

However, if your problem is that complicated, I recommend that you try XPath.
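
As a rough illustration of the XPath suggestion, the sketch below uses PHP's `DOMXPath` to find links and unwraps only the ones whose `href` equals a given value. The function name `unwrapLinksByHref` and the temporary `<div>` wrapper are assumptions made for the example. The `href` comparison is done in PHP rather than inside the expression because XPath 1.0 string literals have no escaping, which gets awkward when the value contains quotes.

```php
<?php
// Sketch: unwrap only the <a> elements whose href equals $href.
function unwrapLinksByHref(string $html, string $href): string
{
    libxml_use_internal_errors(true);   // tolerate imperfect markup quietly
    $doc = new DOMDocument();
    $doc->loadHTML('<div>' . $html . '</div>');   // <div> is a temporary wrapper
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    foreach (iterator_to_array($xpath->query('//a[@href]')) as $a) {
        if ($a->getAttribute('href') !== $href) {
            continue;                   // leave every other link in place
        }
        // Keep the link's content, then remove the empty <a> element.
        while ($a->firstChild !== null) {
            $a->parentNode->insertBefore($a->firstChild, $a);
        }
        $a->parentNode->removeChild($a);
    }

    // Return the wrapper's contents without the wrapper itself.
    $out = '';
    foreach ($doc->getElementsByTagName('div')->item(0)->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$html  = '<p><a href="javascript:doThis(\'Test Title 1\')">Test Title 1</a></p>';
$html .= '<p><a href="javascript:doThis(\'Test Title 2\')">Test Title 2</a></p>';
echo unwrapLinksByHref($html, "javascript:doThis('Test Title 2')");
```

One caveat: the whole fragment is re-serialized, so the markup of the links that stay may come back slightly different (attribute values can be re-encoded). It is worth diffing the result before writing it back to the database.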

hegemon
That will strip all links, but how can I search for one particular link using the $match variable, so that it removes just that link's opening tag and its closing tag?
madphp
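
One possible way to do what this comment asks, sketched with `preg_quote()` so that the literal `$match` string can be dropped into a pattern safely. The function name `unwrapMatchedLink` is made up for the example, and the sketch assumes the stored opening tag matches `$match` exactly apart from letter case (the `i` flag) and that the link contains no nested `</a>`.

```php
<?php
// Sketch: remove one specific opening <a ...> tag (given literally in $match)
// together with the first </a> that follows it, keeping the link text.
function unwrapMatchedLink(string $html, string $match): string
{
    // preg_quote() escapes regex metacharacters in $match; '~' is the delimiter.
    $pattern = '~' . preg_quote($match, '~') . '(.*?)</a>~is';
    $result  = preg_replace($pattern, '$1', $html);

    // preg_replace() returns null on a PCRE error; fall back to the input.
    return $result ?? $html;
}

$html  = '<p><a href="javascript:doThis(\'Test Title 1\')">Test Title 1</a></p>';
$html .= '<p><a href="javascript:doThis(\'Test Title 2\')">Test Title 2</a></p>';
$html .= '<p><a href="javascript:doThis(\'Test Title 3\')">Test Title 3</a></p>';

$match = '<a href="javascript:doThis(\'Test Title 2\')">';

// The second paragraph becomes <p>Test Title 2</p>; the other two links stay.
echo unwrapMatchedLink($html, $match);
```

Because only the matched span is rewritten, the rest of the stored HTML is left byte for byte as it was.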
A: 

You might have some joy with Beautiful Soup - http://www.crummy.com/software/BeautifulSoup/ (a Python HTML parsing / manipulation API).

Kragen
A: 

sed -i -E 's#</?a([[:space:]][^>]*)?>##g' filename.html

Note that using regular expressions for hacking HTML is a... dubious proposition, but it might just work in practice ;-)

Jonas Kölker
Just to warn you... you will get voted down for this one by some community members...
Franz
Are you sure, Franz? I keep reading that it's OK to use it if it's a small portion of HTML.
madphp
Yeah, I know. But you can almost certainly always figure out a way to make your RegEx not work...
Franz
"[make regex break]" -- I agree. That's why I said using regexes for HTML may be a dubious proposition ||| "[downmod for even suggesting it]" -- well, so be it :( If the HTML is laid out right, which the OP might be in control of, regexes might actually be the best solution: it works and it's easy/fast to hack up. Not the cleanest, sure, but sometimes you just need something that works on the data you have (and not the data you don't have).
Jonas Kölker
+2  A: 

You could see if Simple HTML DOM does the trick.

Yacoby