ansaurus

Question

Highlight Search Terms in PHP without breaking anchor tags using regex

Answer 1

A:

I think regex lookaround assertions are what you're looking for.
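To make that concrete, here is a minimal sketch of how a negative lookahead could keep the highlight out of tag attributes such as href. The function name highlight_outside_tags and the sample markup are made up for illustration, and the pattern assumes reasonably well-formed HTML with no stray angle brackets in the text.

```php
<?php
// Hypothetical helper: highlight $term in $html, skipping matches inside a tag.
function highlight_outside_tags(string $html, string $term): string
{
    // Escape regex metacharacters in the user-supplied term.
    $quoted = preg_quote($term, '/');

    // (?![^<>]*>) rejects a match when the next angle bracket after it is '>',
    // which means the match sits inside a tag (for example in an href value).
    $pattern = '/\b(' . $quoted . ')\b(?![^<>]*>)/i';

    return preg_replace($pattern, '<span class="highlight">$1</span>', $html);
}

// Sample input (an assumption, not taken from the question).
$html = '<a href="/kitty.html">Hello Kitty</a>';
echo highlight_outside_tags($html, 'kitty'), "\n";
```

The link text gets wrapped while the href is left alone. This is still a regex working on HTML, so it will misfire on input such as comments or scripts that contain angle brackets.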

Ed G 2010-02-18 21:41:30

A little more detail would be nice. Actually, make that *a lot* more detail.

Alan Moore 2010-02-19 04:16:03

Answer 2

+2 A:

DO NOT try to parse HTML with regular expressions:
http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags

Try something like PHP Simple HTML DOM.

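<?php
// PHP Simple HTML DOM is a third-party library; adjust the path to wherever you saved it
require_once 'simple_html_dom.php';

// get DOM
$html = file_get_html('http://www.google.com/search?q=hello+kitty');

// $term is assumed to hold the user's search term; escape it before echoing it into the page
$term = htmlspecialchars(trim($term));

// highlight $term in all <div class="result">...</div> elements
// (str_replace is case-sensitive; ->plaintext is the element's text with the tags removed)
foreach ($html->find('div.result') as $e) {
    echo str_replace($term, '<span class="highlight">' . $term . '</span>', $e->plaintext);
}
?>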

Note: this is not an exact solution because I don't know what your HTML looks like, but this should put you pretty close to being on track.
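If adding a third-party parser is not an option, the same idea can probably be sketched with PHP's bundled DOM extension: walk the text nodes only, so attributes such as href are never touched. The function name highlight_text_nodes, the wrapper id "wrap", and the sample markup are assumptions for illustration, and this has not been tuned for non-ASCII input or for speed.

```php
<?php
// Hypothetical helper: wrap $term in highlight spans, touching text nodes only.
function highlight_text_nodes(string $html, string $term): string
{
    $doc = new DOMDocument();
    // Wrap the fragment in a known container; @ hides warnings about sloppy markup.
    @$doc->loadHTML('<div id="wrap">' . $html . '</div>');
    $xpath = new DOMXPath($doc);
    $pattern = '/(' . preg_quote($term, '/') . ')/i';

    // Copy the node list to an array first, because the loop below edits the tree.
    $nodes = iterator_to_array($xpath->query('//div[@id="wrap"]//text()'));
    foreach ($nodes as $node) {
        // Odd-numbered parts are the matched terms, even-numbered parts the text between.
        $parts = preg_split($pattern, $node->nodeValue, -1, PREG_SPLIT_DELIM_CAPTURE);
        if (count($parts) < 2) {
            continue; // no match in this text node
        }
        $frag = $doc->createDocumentFragment();
        foreach ($parts as $i => $part) {
            if ($i % 2 === 1) {
                $span = $doc->createElement('span');
                $span->setAttribute('class', 'highlight');
                $span->appendChild($doc->createTextNode($part));
                $frag->appendChild($span);
            } elseif ($part !== '') {
                $frag->appendChild($doc->createTextNode($part));
            }
        }
        $node->parentNode->replaceChild($frag, $node);
    }

    // Serialize only the children of the wrapper, not the html/body scaffolding.
    $out = '';
    foreach ($xpath->query('//div[@id="wrap"]')->item(0)->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo highlight_text_nodes('<a href="/kitty.html">Hello Kitty</a>', 'kitty'), "\n";
```

Because the match runs against text nodes rather than raw markup, the anchor's href stays intact even though it contains the search term.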

macek 2010-02-18 22:40:39

+1. A regex might do the trick or it might not, but this way is simpler, and much easier to maintain.

Alan Moore 2010-02-18 23:16:17

Agreed. Regex just isn't suited for parsing HTML; it was never designed for that.

macek 2010-02-18 23:38:36

I also agree that regex isn't suited for parsing HTML, but after implementing this solution, I might instead strip the HTML tags before applying the regex and then output a plain-text version of the search results. The page took considerably longer to load with the DOM approach than it did with the regex.

Tim Schoffelman 2010-02-19 20:35:06

Answer 3

A:

I ended up going this route, which so far works well for this specific situation.

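<?php

// Escape regex metacharacters in the search term ('|' is the delimiter used below)
$quoted = preg_quote($term, '|');

if(preg_match('|\b(' . $quoted . ')\b|i', $str_content))
{
    // Remove all tags first, so a highlight can never land inside an anchor
    $str_content = strip_tags($str_content);
    // Wrap each match, skipping any that is immediately followed by a quote character
    $str_content = preg_replace('|\b(' . $quoted . ')(?!["\'])|i', "<span class=\"highlight\">$1</span>", $str_content);
    // Replace each newline, together with the text after it up to the next '<', with a paragraph break
    $str_content = preg_replace('|\n[^<]+|', '</p><p>', $str_content);
    break; // assumes this snippet runs inside a loop that is not shown here
}

?>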

It's still HTML-encoded, but it's easier to parse now that the HTML tags are gone.

Tim Schoffelman 2010-02-19 21:29:27
