I would like to be able to switch this...

My sample [a id="keyword" href="someURLkeyword"] test keyword test[/a] link this keyword here.

To...

My sample [a id="keyword" href="someURLkeyword"] test keyword test[/a] link this [a href="url"]keyword[/a] here.

I can't simply replace all instances of "keyword", because some of them occur inside an existing anchor element, either in its attributes or in its link text.

Note: Using PHP5 preg_replace on Linux.

+2  A: 

You can't do this with regular expressions alone. A regular expression has no notion of context: it simply matches a pattern, without regard to the surroundings. To do what you want, you need to parse the source into an abstract representation and then transform it into your target output.

troelskn
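A minimal sketch of the parse-then-transform idea, not taken from the answer itself. It assumes PHP 5.3.6 or later with the DOM extension, input that is an ASCII HTML fragment, and a hypothetical helper name link_keyword_dom. The fragment is parsed into a tree, only text nodes that have no [a] ancestor are touched, and the tree is serialized again.

```php
<?php
// Hypothetical helper (not from the thread): links $keyword in text outside <a> elements.
function link_keyword_dom($html, $keyword, $url)
{
    if ($keyword === '') {
        return $html; // nothing to link
    }

    $doc = new DOMDocument();
    // Silence libxml warnings about fragments, then restore the previous setting.
    $prev = libxml_use_internal_errors(true);
    // Wrap the fragment in a container so its children can be found again later.
    $doc->loadHTML('<div id="wrap">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($prev);

    $xpath = new DOMXPath($doc);
    $wrap = $xpath->query('//div[@id="wrap"]')->item(0);

    // Text nodes inside the wrapper that are not inside any <a> element.
    // Copy them to an array first, because the tree is modified in the loop.
    $textNodes = iterator_to_array($xpath->query('.//text()[not(ancestor::a)]', $wrap));

    foreach ($textNodes as $node) {
        $parts = explode($keyword, $node->nodeValue);
        if (count($parts) < 2) {
            continue; // keyword does not occur in this text node
        }
        // Rebuild the text node as: text, <a>keyword</a>, text, ...
        $frag = $doc->createDocumentFragment();
        foreach ($parts as $i => $part) {
            if ($i > 0) {
                $a = $doc->createElement('a');
                $a->setAttribute('href', $url);
                $a->appendChild($doc->createTextNode($keyword));
                $frag->appendChild($a);
            }
            if ($part !== '') {
                $frag->appendChild($doc->createTextNode($part));
            }
        }
        $node->parentNode->replaceChild($frag, $node);
    }

    // Serialize only the wrapper's children, dropping the html/body scaffolding.
    $out = '';
    foreach ($wrap->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Because attribute values are never text nodes, something like [p class="keyword"] is left alone without any extra handling. The serializer may normalize the markup slightly, so the output is not guaranteed to be byte-for-byte identical to the input outside the inserted links.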
Can't "lookahead" and "lookbehind" functionality be used to account for the matched pattern's surroundings?
Joe
You might be able to use lookarounds, but it would be ridiculously difficult. I suggest you look into preg_replace_callback. Search for either a complete anchor element, or the keyword. If you match an anchor element, plug it back in; if you match a bare keyword, add the tags.
Alan Moore
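The preg_replace_callback suggestion above can be sketched roughly as follows. This is an illustration under assumptions, not code from the comment: the function name link_keyword_callback is hypothetical, closures require PHP 5.3 or later, and the middle alternative (skip any other tag) is an addition so that keywords inside attribute values are also left alone.

```php
<?php
// Hypothetical helper (not from the thread): one pass over the string with an alternation.
function link_keyword_callback($html, $keyword, $url)
{
    // Alternatives are tried left to right at each position:
    //   1. a complete <a ...>...</a> element (case-insensitive, may span lines)
    //   2. any other single tag (an addition to the comment's idea)
    //   3. the bare keyword, captured as group 1
    $pattern = '#(?i:<a\b[^>]*>.*?</a>)|<[^>]+>|(' . preg_quote($keyword, '#') . ')#s';
    $href = htmlspecialchars($url);

    return preg_replace_callback(
        $pattern,
        function ($m) use ($href) {
            // Group 1 is only present when the bare-keyword alternative matched.
            if (isset($m[1]) && $m[1] !== '') {
                return '<a href="' . $href . '">' . $m[1] . '</a>';
            }
            // An anchor element or another tag: plug it back in unchanged.
            return $m[0];
        },
        $html
    );
}

echo link_keyword_callback(
    'My sample <a id="keyword" href="someURLkeyword"> test keyword test</a> link this keyword here.',
    'keyword',
    'url'
), "\n";
```

The anchor alternative comes first so that a keyword inside a link is consumed together with the link before the keyword alternative gets a chance to match it. Like any regex-based approach, this assumes reasonably well-formed markup: nested anchors or a ">" inside an attribute value would confuse it.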
+1  A: 

Using regular expressions may not be the best way to solve this problem, but here is a quick solution:

function link_keywords($str, $keyword, $url) {
    $keyword = preg_quote($keyword, '/');
    $url = htmlspecialchars($url);

    // Split the string on all <a> elements, keeping the matched delimiters.
    // The "s" modifier lets "." match newlines, so anchors spanning several lines stay whole.
    $split_str = preg_split('#(<a\s.*?</a>)#is', $str, -1, PREG_SPLIT_DELIM_CAPTURE);

    // Loop through the results and process the sections between <a> elements
    $result = '';
    foreach ($split_str as $sub_str) {
        if (preg_match('#^<a\s.*?</a>$#is', $sub_str)) {
            // existing link: copy it through unchanged
            $result .= $sub_str;
        } else {
            // split on all remaining tags
            $split_sub_str = preg_split('/(<.+?>)/s', $sub_str, -1, PREG_SPLIT_DELIM_CAPTURE);
            foreach ($split_sub_str as $sub_sub_str) {
                if (preg_match('/^<.+>$/s', $sub_sub_str)) {
                    // a tag: copy it through so attribute values are never modified
                    $result .= $sub_sub_str;
                } else {
                    // plain text: wrap each occurrence of the keyword in a link
                    $result .= preg_replace('/'.$keyword.'/', '<a href="'.$url.'">$0</a>', $sub_sub_str);
                }
            }
        }
    }
    return $result;
}

The general idea is to split the string into links and everything else. Then split everything outside of a link tag into tags and plain text and insert links into the plain text. That will prevent [p class="keyword"] from being expanded to [p class="[a href="url"]keyword[/a]"].

Again, I would try to find a simpler solution that does not involve regular expressions.

mcrumley