Hi, I know quite a few term-highlighting questions have been asked, but as far as I know none of them answers mine. The search terms are put into an array:

$keyarray = array("DE", "ABCD");
$string = "ABCDEF";
foreach ($keyarray as $value) {
   $string = str_ireplace($value, "<b>{$value}</b>", $string);
}

The result will obviously be ABC<b>DE</b>F rather than <b>ABCDE</b>F. So is there any way that I can highlight both terms with a bold tag, extremely fast, using PHP?

A: 

Maybe use a regex like /de|abcd/ (with the i flag so that it matches regardless of case). A regex searches from left to right and does not go back, so there will be no situation like <b>1111<b></b>112222</b>
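
A minimal sketch of this suggestion, assuming the terms are plain text and need escaping before they go into the pattern. The function name highlightWithRegex is hypothetical, and the sketch assumes PHP 7.4 or later for arrow functions. Because matches never overlap, this avoids nested tags, but for the question's input it wraps only ABCD and leaves the trailing E unhighlighted.

```php
<?php
// Hypothetical helper: joins all terms into one case-insensitive alternation.
function highlightWithRegex(string $text, array $terms): string
{
    // Ignore empty terms; an empty alternative would match everywhere.
    $terms = array_filter($terms, fn($t) => $t !== '');
    if (!$terms) {
        return $text;
    }
    // Longest first, so the longer term wins when two terms start at the same position.
    usort($terms, fn($a, $b) => strlen($b) <=> strlen($a));
    // Escape regex metacharacters and the delimiter in every term.
    $quoted = array_map(fn($t) => preg_quote($t, '/'), $terms);
    // $0 is the matched text, so the original casing is preserved.
    return preg_replace('/' . implode('|', $quoted) . '/i', '<b>$0</b>', $text);
}

echo highlightWithRegex("ABCDEF", ["DE", "ABCD"]); // <b>ABCD</b>EF
```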

dfens
A: 

You may try this:

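$keyarray = array("DE", "ABCD");
$string = "ABCDEF";
// Split on spaces and wrap every word that contains at least one search term.
$words = explode(' ', $string);
foreach ($words as &$w) {
    foreach ($keyarray as $criteria) {
        if (stripos($w, $criteria) !== false) {
            $w = "<b>$w</b>";
            break; // avoid duplicate wrapping
        }
    }
}
unset($w); // drop the reference left behind by the foreach
$string = implode(' ', $words);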
jerjer
+1  A: 

"Extremely fast" is a relative term. That aside, you have a couple of options:

  • Regular Expressions: if you are very good with regexes, this is a valid use for them. You can also look forward/behind with them, which allows for a fair amount of flexibility.

http://www.regular-expressions.info/lookaround.html

  • Character by character parse: this is often the best way of doing string manipulation (and fastest), but can be the most time-consuming to create.

  • String replaces: fast, but there's usually an edge case that doesn't work correctly (speaking from experience).

In all of these scenarios, you can benefit from pre-optimizing your list of terms by sorting/grouping/filtering them appropriately. For example, sorting biggest to smallest length would ensure that you didn't split up a long string (and miss a match) by bolding a shorter string within it.
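
As a rough illustration of that preparation step, the sketch below drops empty terms, removes case-insensitive duplicates, and sorts the rest longest first. The function name prepareTerms is hypothetical, and the sketch assumes PHP 7.4 or later. The order of terms with equal length is not guaranteed before PHP 8.0, since usort was not stable until then.

```php
<?php
// Hypothetical helper: cleans up and orders the search terms before any replacing.
function prepareTerms(array $terms): array
{
    $seen = [];
    foreach ($terms as $term) {
        if ($term === '') {
            continue; // an empty term can never be a useful match
        }
        $key = strtolower($term);
        if (!isset($seen[$key])) {
            $seen[$key] = $term; // keep the first spelling of each term
        }
    }
    $result = array_values($seen);
    // Longest first, so a short term cannot split a longer one.
    usort($result, fn($a, $b) => strlen($b) <=> strlen($a));
    return $result;
}

print_r(prepareTerms(["DE", "ABCD", "de", "", "ABCDEFG"]));
// ABCDEFG, ABCD, DE
```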

You could also predetermine optimal regex(es) by examining all the search terms before beginning the replace. Again, this would assume you are pretty savvy with regexes.

Tim
Regex is out of the question; the algorithm has to be as fast as a string replace or faster. Sorting from longer strings to shorter strings helps, but cases like the one described above still don't work. (It did not impact speed much.) Could you elaborate on the character-by-character parsing method?
vener
@vener - Character-by-character parsing often means doing a forward read through the string, one character at a time, into one or more buffers that you perform inexpensive tests on. As I said in my answer, this is MUCH more difficult to write than just a find/replace, but it's very powerful because you have full control over peeking vs. advancing, match conditions, skipping ahead, etc. I recently wrote an XHTML parser in C# that can chew through a sizable page and turn it into an object model in under 5ms. A PHP string is a char array, but I'm not sure how it compares to C++/C#/Java for speed.
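
One simpler variant of this idea, not necessarily the buffer-based parser described above, is to mark every byte covered by any term and then emit the tags in a single forward pass. The sketch below assumes plain text that needs no HTML escaping; the function name highlightTerms is hypothetical. Overlapping or adjacent matches merge into one bold run, which handles the case from the question.

```php
<?php
// Hypothetical helper: marks matched bytes first, then writes tags in one pass.
function highlightTerms(string $text, array $terms): string
{
    $len = strlen($text);
    // $mark[$i] becomes true when byte $i belongs to at least one match.
    $mark = array_fill(0, $len, false);
    foreach ($terms as $term) {
        $termLen = strlen($term);
        if ($termLen === 0) {
            continue;
        }
        $pos = 0;
        while (($pos = stripos($text, $term, $pos)) !== false) {
            for ($i = $pos; $i < $pos + $termLen; $i++) {
                $mark[$i] = true;
            }
            $pos++; // step by one so overlapping occurrences are also found
        }
    }
    // Forward pass: open a tag when a marked run starts, close it when the run ends.
    $out = '';
    $open = false;
    for ($i = 0; $i < $len; $i++) {
        if ($mark[$i] && !$open) {
            $out .= '<b>';
            $open = true;
        } elseif (!$mark[$i] && $open) {
            $out .= '</b>';
            $open = false;
        }
        $out .= $text[$i];
    }
    if ($open) {
        $out .= '</b>';
    }
    return $out;
}

echo highlightTerms("ABCDEF", ["DE", "ABCD"]); // <b>ABCDE</b>F
```

Whether this beats a plain str_ireplace loop depends on the input sizes, so it is worth measuring on real data before relying on it.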
Tim