Hi guys,

I have an API call that essentially returns the HTML of a hosted wiki application page. I'm then doing some substr, str_replace and preg_replace kung-fu to format it as per my site's style guides.

I do one set of calls to format my left nav (changing a link to pageX to my wikiParse?page=pageX type of thing). I can safely do this on the left nav. In the body text, however, I cannot safely assume a link is a link to an internal page. It could very well be a link to an external resource. So I need to do a preg_replace that matches href= that is not followed by http://.

Here is my stab at it:

$result = preg_replace('href\=\"(?!http\:\/\/)','href="bla?id=',$result);

This seems to strip out the entire contents on the page. Anyone see where I slipped up? I don't think I'm too far off, just can't see where to go next.

Cheers

+1  A: 

The preg_* functions expect Perl-Compatible Regular Expressions (PCRE). The structural difference from normal regular expressions is that the expression itself is wrapped in delimiters that separate the expression from possible modifiers. The classic delimiter is the / but PHP allows any other non-alphanumeric, non-whitespace character except the backslash. See also Introduction to PCRE in PHP.

So try this:

$result = preg_replace('/href="(?!http:\/\/)/', 'href="bla?id=', $result);

Here href="(?!http://) is the regular expression. But as we use / as delimiters, the occurrences of / inside the regular expression must be escaped using backslashes.
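Since any non-alphanumeric, non-backslash character can act as the delimiter, picking one that does not occur in the pattern avoids the escaping altogether. A minimal sketch, assuming # as the delimiter and a made-up two-link input string (the sample HTML and example.com are illustrative, not from the question):

```php
<?php
// Hypothetical sample input: one internal link, one external link.
$html = '<a href="pageX">internal</a> <a href="http://example.com/">external</a>';

// With # as the delimiter, the slashes in http:// need no backslashes.
// The negative lookahead (?!http://) rejects any href=" that is
// immediately followed by http://, so only the internal link is rewritten.
$result = preg_replace('#href="(?!http://)#', 'href="bla?id=', $html);

echo $result, "\n";
// <a href="bla?id=pageX">internal</a> <a href="http://example.com/">external</a>
```

Note that this lookahead only excludes http://. Links starting with https://, mailto: or # would still be rewritten, so the lookahead may need widening depending on what the wiki actually emits.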

Gumbo
Good eye! I knew it'd be something obvious...Thanks.
Cory Dee
+1  A: 

Your regexp is missing starting and ending delimiters (conventionally '/'):

$result = preg_replace('/href\=\"(?!http\:\/\/)/','href="bla?id=',$result);
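This also explains why the whole page disappeared: when the pattern is invalid, preg_replace emits a warning and returns NULL, and assigning that back to $result wipes the content. A small sketch of checking for that case, assuming a made-up one-link input (the fallback to the original string is just one possible way to handle it):

```php
<?php
// Hypothetical sample input, not from the question.
$html = '<a href="pageX">internal</a>';

// The pattern from the question has no delimiters, so PCRE rejects it.
// The @ only silences the warning for this demonstration.
$broken = @preg_replace('href\=\"(?!http\:\/\/)', 'href="bla?id=', $html);
var_dump($broken); // NULL

// The delimited pattern works; still guard against NULL before reusing it.
$fixed = preg_replace('/href="(?!http:\/\/)/', 'href="bla?id=', $html);
if ($fixed === null) {
    // On a regex error, keep the original markup instead of an empty page.
    $fixed = $html;
}

echo $fixed, "\n";
// <a href="bla?id=pageX">internal</a>
```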
vartec