Hi Guys

I have a WYSIWYG editor in my back end, and it is tripping up the first regular expression I wrote. This is in PHP4, using preg_replace(). I'm capturing the URI and the linked text.

@<a\shref=\"http[s]?://([^\"]*)\"[]>(.*)<\/a>@siU

The client wanted all external links to open in a new window, so that's the expression I was using to find all (hopefully) external links while leaving internal links, page anchor links, etc. untouched.

I realised the WYSIWYG editor also adds style="font-weight: bold" if the user selects bold on the link, and the extra attribute stops my pattern from matching. I've only recently started learning regular expressions, so I'm unsure how to go about this problem.

Any help would be much appreciated!!

+5  A: 

This should match it:

/<a\s+([^>]*)href="https?:\/\/([^"]*)"(.*?)>(.*?)<\/a>/

The useful thing here is the lazy match, *?. It matches only as much as it absolutely needs to, as opposed to the regular match, *, which is greedy.
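For the original problem, here is a rough sketch of how that pattern might be plugged into preg_replace() to add the new-window attribute. A few things here are my own assumptions rather than part of the pattern above: the scheme is moved inside the URL capture group so that http and https both survive the replacement, the s and i flags are carried over from the question, target="_blank" is taken to be what "open in a new window" means, and the sample HTML is made up.

```php
<?php
// Assumption: scheme moved inside group 2 so it is kept in the output.
// Assumption: "s" and "i" flags carried over from the question's pattern.
$pattern = '/<a\s+([^>]*)href="(https?:\/\/[^"]*)"(.*?)>(.*?)<\/a>/si';

// $1 = attributes before href, $2 = full URL,
// $3 = attributes after href,  $4 = link text.
$replacement = '<a $1href="$2"$3 target="_blank">$4</a>';

// Made-up sample: one internal link, one external link with a style attribute.
$html = '<a href="/about">About</a> '
      . '<a href="http://example.com/" style="font-weight: bold">Example</a>';

$result = preg_replace($pattern, $replacement, $html);

echo $result, "\n";
// <a href="/about">About</a> <a href="http://example.com/" style="font-weight: bold" target="_blank">Example</a>
```

The internal link is left alone because its href does not start with http:// or https://. Note that a link which already has a target attribute would end up with two, so that case would need separate handling.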

To demonstrate the difference between greedy and lazy matching, take this text:

a b c d a b c d

these regexes will have different results:

/a.*c/    selects: "a b c d a b c"
/a.*?c/   selects: "a b c"
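The same comparison can be checked in PHP with preg_match(). This is just a small sketch; the variable names are arbitrary. The third call is an addition of mine to show the U flag used in the question, which flips the default so that a plain * behaves lazily.

```php
<?php
$text = 'a b c d a b c d';

preg_match('/a.*c/', $text, $greedy);     // greedy: runs to the last "c"
preg_match('/a.*?c/', $text, $lazy);      // lazy: stops at the first "c"
preg_match('/a.*c/U', $text, $ungreedy);  // U flag: plain * becomes lazy

echo $greedy[0], "\n";   // a b c d a b c
echo $lazy[0], "\n";     // a b c
echo $ungreedy[0], "\n"; // a b c
```

Because U inverts greediness, combining it with *? makes that quantifier greedy again, so it is usually clearer to pick one style and stick with it.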
nickf