I have a text ($text) and an array of words ($tags). Each of these words should be replaced in the text with a link to another page, without breaking the links that already exist in the text. CakePHP's TextHelper has a method for this, but it is buggy and breaks existing HTML links in the text. The method is supposed to work like this:

$text=Text->highlight($text,$tags,'<a href="/tags/\1">\1</a>',1);

Below is the existing code in CakePHP's TextHelper:

function highlight($text, $phrase, $highlighter = '<span class="highlight">\1</span>', $considerHtml = false) {
  if (empty($phrase)) {
    return $text;
  }

  if (is_array($phrase)) {
    $replace = array();
    $with = array();

    foreach ($phrase as $key => $value) {
      $key = $value;
      $value = $highlighter;
      $key = '(' . $key . ')';
      if ($considerHtml) {
        $key = '(?![^<]+>)' . $key . '(?![^<]+>)';
      }
      $replace[] = '|' . $key . '|ix';
      $with[] = empty($value) ? $highlighter : $value;
    }
    return preg_replace($replace, $with, $text);
  } else {
    $phrase = '(' . $phrase . ')';
    if ($considerHtml) {
      $phrase = '(?![^<]+>)' . $phrase . '(?![^<]+>)';
    }

    return preg_replace('|'.$phrase.'|i', $highlighter, $text);
  }
}
A: 

This code works just fine. What you may need to do is check the CSS for <span class="highlight"> and make sure it sets a color that lets you see that the text is highlighted.

.highlight { background-color: #FFE900; }
cdburgess
I am not asking about highlighting. Please read the question.
Amorphous
Will you provide a sample `$text` to look at? The code works just fine when I test it. It may be that the `$tags` you want to replace already sit inside an existing `<a href` tag. This code is not designed to swap out the href, but to wrap each word defined by `$tags` in a new link.
cdburgess
A: 

Amorphous - I noticed Gert edited your post. Are the two code fragments exactly as you posted them?

So even though the original code was designed for highlighting, I understand you're trying to repurpose it for generating links - it should, and does work fine for that (tested as posted).

HOWEVER escaping in the first code fragment could be an issue.

$text=Text->highlight($text,$tags,'<a href="/tags/\1">\1</a>',1);

Works fine... but if you use double quotes rather than single quotes, PHP treats the backslashes as escape characters, so you need to escape them. If you don't, "\1" becomes the control character 0x01 and you get %01 links.

The correct way with double quotes is:

$text=Text->highlight($text,$tags,"<a href=\"/tags/\\1\">\\1</a>",1);

(Notice the use of \\1 instead of \1)
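
To see the difference between the quoting styles, here is a small sketch. The variable names and the sample sentence are made up for illustration, and it calls preg_replace() directly rather than going through TextHelper.

```php
<?php
// Single quotes: the backslash is kept, so preg_replace() sees the backreference \1.
$single = '<a href="/tags/\1">\1</a>';

// Double quotes, unescaped: PHP reads \1 as an octal escape, i.e. the byte chr(1).
$broken = "<a href=\"/tags/\1\">\1</a>";

// Double quotes, escaped: \\1 yields a literal backslash followed by 1.
$escaped = "<a href=\"/tags/\\1\">\\1</a>";

var_dump($single === $escaped);               // bool(true)
var_dump(strpos($broken, chr(1)) !== false);  // bool(true)

// With the intact backreference, the matched word ends up in the link.
echo preg_replace('|(cake)|i', $single, 'I like Cake'), "\n";
// prints: I like <a href="/tags/Cake">Cake</a>
```

The chr(1) byte in the unescaped version is what shows up URL-encoded as %01 in the generated href.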

Rudu
+1  A: 

Replacing text in HTML is fundamentally different from replacing plain text. To determine whether a piece of text is part of an HTML tag, you first have to find all the tags so that you can exclude them. Regex is not really the tool for this.

I would attempt one of the following solutions:

  • Find the positions of all the words. Working from last to first, determine if each is part of a tag. If not, add the anchor.
  • Split the string into blocks. Each block is either a tag or plain text. Run your replacement(s) on the plain text blocks, and re-assemble.

I think the first one is probably a bit more efficient, but more prone to programmer error, so I'll leave it up to you.

If you want to know why I'm not approaching this problem directly, look at all the questions on the site about regex and HTML, and how regex is not a parser.
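
A minimal sketch of the second option (split into blocks, replace only in the plain-text blocks, re-assemble) might look like the following. The function name linkTags() and its parameters are hypothetical and not part of CakePHP. It assumes reasonably well-formed HTML with no ">" inside attribute values, and tags made of word characters so that \b word boundaries apply. It still uses a regex to find the tags, so it is a tokenizer at best, not an HTML parser.

```php
<?php
/**
 * Hypothetical helper (not part of CakePHP): link each tag word in $html,
 * touching only text outside tags and outside existing <a> elements.
 */
function linkTags(string $html, array $tags, string $template = '<a href="/tags/\1">\1</a>'): string
{
    if (empty($tags)) {
        return $html;
    }

    // One alternation for all tags, longest first so "php5" wins over "php".
    usort($tags, function ($a, $b) { return strlen($b) - strlen($a); });
    $quoted = array_map(function ($t) { return preg_quote($t, '/'); }, $tags);
    $pattern = '/\b(' . implode('|', $quoted) . ')\b/i';

    // Split into blocks; the capture group keeps the tags in the result,
    // so even indexes are plain text and odd indexes are tags.
    $blocks = preg_split('/(<[^>]*>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE);

    $insideLink = false;
    foreach ($blocks as $i => $block) {
        if ($i % 2 === 1) {
            // Tag block: track whether we are inside an existing link.
            if (preg_match('/^<a[\s>]/i', $block)) {
                $insideLink = true;
            } elseif (preg_match('/^<\/a\s*>/i', $block)) {
                $insideLink = false;
            }
        } elseif (!$insideLink && $block !== '') {
            // Plain-text block outside any link: safe to add anchors.
            $blocks[$i] = preg_replace($pattern, $template, $block);
        }
    }

    return implode('', $blocks);
}

echo linkTags('Learn <a href="/php">php basics</a> and php, see <img alt="php">', ['php']), "\n";
// prints: Learn <a href="/php">php basics</a> and <a href="/tags/php">php</a>, see <img alt="php">
```

Matching all tags in a single pass, instead of one preg_replace() per tag, also keeps a later tag from matching inside a link that an earlier tag just inserted.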

Ben Doom
+1  A: 

You can see (and run) this algorithm here:

http://www.exorithm.com/algorithm/view/highlight

It can be made a little better and simpler with a few changes, but it still isn't perfect. Though less efficient, I'd recommend one of Ben Doom's solutions.

Mike C