ansaurus

Question

Answer 1

A:

Maybe use a regexp like /de|acbd/. A regex searches from left to right and does not go back over text it has already matched, so there will be no situation like <b>1111<b></b>112222</b>.
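A minimal sketch of that idea, assuming the terms DE and ABCD from the sample in this thread. The function name and the input string are made up for illustration. Because each match is consumed before the search resumes, the tags cannot nest, but a term that overlaps an earlier match (the DE in ABCDEF) is skipped.

```php
<?php
// Hypothetical helper: wraps every match of a fixed alternation in <b> tags.
function highlight_alternation(string $text): string
{
    // One left-to-right pass; $0 is the text that actually matched,
    // so the original casing is preserved.
    return preg_replace('/de|abcd/i', '<b>$0</b>', $text);
}

echo highlight_alternation('ABCDEF abcd de'), "\n";
// prints: <b>ABCD</b>EF <b>abcd</b> <b>de</b>
```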

dfens 2010-10-15 08:06:16

Answer 2

A:

You may try this:

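// Terms to highlight and the text to search.
$keyarray = array("DE", "ABCD");
$string = "ABCDEF";
$words = explode(' ', $string);
foreach ($words as &$w) {
    foreach ($keyarray as $criteria) {
        // Case-insensitive test: does this word contain the term?
        if (stripos($w, $criteria) !== false) {
            $w = "<b>$w</b>"; // wraps the whole word, not just the matching part
            break; // avoid duplicate wrapping
        }
    }
}
unset($w); // drop the reference left over from the foreach
$string = implode(' ', $words);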

jerjer 2010-10-15 08:23:06

Answer 3

+1 A:

"Extremely fast" is a relative term. That aside, you have a couple of options:

Regular Expressions: if you are very good with regexes, this is a valid use for them. You can also look forward/behind with them, which allows for a fair amount of flexibility.

http://www.regular-expressions.info/lookaround.html

Character by character parse: this is often the best (and fastest) way of doing string manipulation, but it can be the most time-consuming to write.
String replaces: fast, but there is usually an edge case that doesn't work correctly (speaking from experience).

In all of these scenarios, you can benefit from pre-optimizing your list of terms by sorting/grouping/filtering them appropriately. For example, sorting biggest to smallest length would ensure that you didn't split up a long string (and miss a match) by bolding a shorter string within it.
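As a rough sketch of the sorting idea combined with plain string replaces (the function name is hypothetical, and DE and ABCD are the sample terms from this thread): longest-first ordering protects the long term, but a shorter term that overlaps it is still missed once the tags are in place.

```php
<?php
// Hypothetical helper: sort terms longest first, then apply plain replaces in order.
function highlight_sorted(string $text, array $terms): string
{
    // Longest first, so a long term is wrapped before any shorter term inside it.
    usort($terms, fn($a, $b) => strlen($b) <=> strlen($a));
    foreach ($terms as $term) {
        // Note: the replacement uses the term's own casing, and later terms
        // can also match inside tags inserted earlier (e.g. a term "b").
        $text = str_ireplace($term, "<b>$term</b>", $text);
    }
    return $text;
}

echo highlight_sorted('ABCD DE', ['DE', 'ABCD']), "\n"; // <b>ABCD</b> <b>DE</b>
echo highlight_sorted('ABCDEF', ['DE', 'ABCD']), "\n";  // <b>ABCD</b>EF (the overlapping DE is missed)
```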

You could also predetermine optimal regex(es) by examining all the search terms before beginning the replace. Again, this would assume you are pretty savvy with regexes.
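One way to predetermine such a regex, sketched under the assumption that the terms are plain literals to be matched case-insensitively. The helper name and the sample terms are illustrative only. The terms are escaped with preg_quote and joined longest first, since PCRE tries alternatives in the order they are written.

```php
<?php
// Hypothetical helper: builds one case-insensitive alternation from literal terms.
function build_highlight_pattern(array $terms): string
{
    // Longest first, so a long term wins over a shorter one sharing its prefix.
    usort($terms, fn($a, $b) => strlen($b) <=> strlen($a));
    // Escape regex metacharacters and the "/" delimiter in each term.
    $escaped = array_map(fn($t) => preg_quote($t, '/'), $terms);
    return '/' . implode('|', $escaped) . '/i';
}

$pattern = build_highlight_pattern(['DE', 'ABCD', 'a.b']);
echo $pattern, "\n"; // /ABCD|a\.b|DE/i
echo preg_replace($pattern, '<b>$0</b>', 'ABCDEF a.b axb'), "\n";
// prints: <b>ABCD</b>EF <b>a.b</b> axb
```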

Tim 2010-10-15 08:46:46

Regex is out of the question; the algorithm has to be as fast as a string replace or faster. Sorting from longer strings to shorter strings helps (it did not impact speed much), but there are still cases, like the one described above, that don't work. Could you elaborate on the character by character parsing method?

vener 2010-10-15 15:15:53

@vener - Character by character parsing often means doing a forward read through the string, one character at a time, into one or more buffers that you perform inexpensive tests on. As I said in my answer, this is MUCH more difficult to write than a simple find/replace, but it's very powerful because you have full control over peeking vs. advancing, match conditions, skipping ahead, etc. I recently wrote an XHTML parser in C# that can chew through a sizable page and turn it into an object model in under 5ms. A PHP string is a char array, but I'm not sure how it compares to C++/C#/Java for speed.
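A minimal sketch of such a forward scan in PHP. This is not Tim's parser. The function name is hypothetical, and it assumes that overlapping matches should be merged into a single bold run, which the thread does not confirm. At each position it runs a cheap prefix test per term and tracks how far the bold region must extend.

```php
<?php
// Hypothetical helper: single forward scan that merges overlapping matches.
function highlight_scan(string $text, array $terms): string
{
    $out = '';
    $boldUntil = 0;  // index just past the last character that must be bold
    $open = false;   // whether a <b> tag is currently open
    $len = strlen($text);
    for ($i = 0; $i < $len; $i++) {
        foreach ($terms as $term) {
            $n = strlen($term);
            // Cheap test: does $term start at position $i (case-insensitive)?
            if ($n > 0 && $i + $n <= $len
                && substr_compare($text, $term, $i, $n, true) === 0) {
                $boldUntil = max($boldUntil, $i + $n);
            }
        }
        if (!$open && $i < $boldUntil) { $out .= '<b>'; $open = true; }
        if ($open && $i >= $boldUntil) { $out .= '</b>'; $open = false; }
        $out .= $text[$i];
    }
    if ($open) { $out .= '</b>'; }
    return $out;
}

echo highlight_scan('ABCDEF', ['DE', 'ABCD']), "\n"; // <b>ABCDE</b>F
```

The scan works on bytes, so multi-byte text would need extra care, and it does not escape or skip HTML already present in the input.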

Tim 2010-10-15 18:35:26


term highlight algorithm (HTML)
