Hey,

I'm having a small problem with a quick "Search and Highlight" script that I'm working on. I'm using regular expressions because I'd like to do all the searching on the client side, after the document has loaded. My search/highlight function looks like this:

function highlight(word, colour, container) {
    var regex = new RegExp("(>[^<]*?)(" + word + ")", "ig");
    var replace = "$1<span name='searchTerm' style='background-color: " + colour + "'>$2</span>";

    if (regex.exec(container.innerHTML)) {
        container.innerHTML = container.innerHTML.replace(regex, replace);
        return true;
    }
    return false;
}

word is the word to search for, colour is the colour to highlight it with, and container is the element to search in.

Consider an element that contained this:

<ul>
    <li>Set the setting to the correct setting.</li>
</ul>

Say I passed the word "set" to the highlight function. In its current state, it only finds the first instance of "set" due to lazy repetition.

So what if I change the regex to this:

var regex = new RegExp("(>[^<]*?)?(" + word + ")", "ig");

This now works great: it highlights all instances of the string "set". But if I pass the search word "li", it also replaces the text inside the tags!

Is there a quick fix for this regular expression to get the behaviour I want? I need it to replace all instances of the search string but not those found as part of a tag. I'd like to keep it client-side using regex.

Thanks!

+5  A: 

You shouldn't be using regex to parse HTML. Walk the DOM tree properly and do the search and replace on the text nodes only, so that tag names and attributes are never touched.

By the way, there's a jQuery plugin that does what you want; you could use it or look at it to get an idea of how to do it:
http://johannburkard.de/blog/programming/javascript/highlight-javascript-text-higlighting-jquery-plugin.html
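
If you'd rather not pull in jQuery, here is a minimal sketch of the DOM-walking approach in plain JavaScript. It is not the plugin's code. The names escapeRegExp, splitMatches and highlightTextNodes are made up for illustration, and it assumes a browser that supports document.createTreeWalker. The search word is escaped so that it is matched literally, and only text nodes are ever modified, so a search for "li" cannot touch the tags.

```javascript
// Hypothetical helper: escape regex metacharacters so the word matches literally.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: split plain text into pieces, flagging the pieces
// that match the search word (case-insensitively).
function splitMatches(text, word) {
    var pieces = [];
    if (!word) {
        // An empty pattern would match everywhere, so treat it as "no match".
        if (text) {
            pieces.push({ text: text, match: false });
        }
        return pieces;
    }
    var regex = new RegExp(escapeRegExp(word), "ig");
    var last = 0;
    var m;
    while ((m = regex.exec(text)) !== null) {
        if (m.index > last) {
            pieces.push({ text: text.slice(last, m.index), match: false });
        }
        pieces.push({ text: m[0], match: true });
        last = m.index + m[0].length;
    }
    if (last < text.length) {
        pieces.push({ text: text.slice(last), match: false });
    }
    return pieces;
}

// Hypothetical replacement for highlight(): visits only the text nodes under
// container and wraps each match in a <span>. Returns true if anything matched.
function highlightTextNodes(word, colour, container) {
    var walker = document.createTreeWalker(container, NodeFilter.SHOW_TEXT, null);
    var textNodes = [];
    // Collect the nodes first, because replacing them while walking
    // would disturb the traversal.
    while (walker.nextNode()) {
        textNodes.push(walker.currentNode);
    }
    var found = false;
    textNodes.forEach(function (node) {
        var pieces = splitMatches(node.nodeValue, word);
        var hasMatch = pieces.some(function (p) { return p.match; });
        if (!hasMatch) {
            return;
        }
        found = true;
        var fragment = document.createDocumentFragment();
        pieces.forEach(function (p) {
            if (p.match) {
                var span = document.createElement("span");
                span.setAttribute("name", "searchTerm");
                span.style.backgroundColor = colour;
                span.textContent = p.text;
                fragment.appendChild(span);
            } else {
                fragment.appendChild(document.createTextNode(p.text));
            }
        });
        node.parentNode.replaceChild(fragment, node);
    });
    return found;
}
```

You would call it the same way as your original function, e.g. highlightTextNodes("set", "yellow", container). Note that this sketch also visits text inside script and style elements if the container has any, so you may want to skip those.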

NullUserException
+1 good advice.
LarsH