I'm using the following two methods to highlight search keywords. They work, but they also match partial words.

For example:

Text: "This is .net Programming" Search keyword: "is"

It highlights the partial match in "Th*is*" as well as the whole word "is".

Please let me know the correct regular expression to highlight only whole-word matches.

private string HighlightSearchKeyWords(string searchKeyWord, string text)
{
    Regex exp = new Regex(@", ?");
    searchKeyWord = "(\b" + exp.Replace(searchKeyWord, @"|") + "\b)";
    exp = new Regex(searchKeyWord, RegexOptions.Singleline | RegexOptions.IgnoreCase);
    return exp.Replace(text, new MatchEvaluator(MatchEval));
}

private string MatchEval(Match match)
{
    if (match.Groups[1].Success)
    {
        return "<span class='search-highlight'>" + match.ToString() + "</span>";
    }

    return ""; //no match
}
+1  A: 

Try this fixed line:

searchKeyWord = @"(\b" + exp.Replace(searchKeyWord, @"|") + @"\b)";
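Why the @ matters can be seen in a small sketch. The class and member names below are made up for illustration; only the two string literals come from the question. In a regular C# string literal "\b" is the backspace character, so the regex engine never sees a word-boundary token.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of the original code.
static class VerbatimStringDemo
{
    // Regular literal: each \b becomes the backspace character U+0008.
    public const string Regular = "(\bis\b)";

    // Verbatim literal: each \b stays as backslash + 'b', the regex word boundary.
    public const string Verbatim = @"(\bis\b)";

    // Counts the case-insensitive matches of pattern in text.
    public static int CountMatches(string pattern, string text)
    {
        return Regex.Matches(text, pattern, RegexOptions.IgnoreCase).Count;
    }

    static void Main()
    {
        const string text = "This is .net Programming";

        Console.WriteLine("\b".Length);                  // 1
        Console.WriteLine(@"\b".Length);                 // 2
        Console.WriteLine(CountMatches(Regular, text));  // 0 (text contains no backspace)
        Console.WriteLine(CountMatches(Verbatim, text)); // 1 (only the whole word "is")
    }
}
```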
slugster
Thanks a lot. Working fine.
stackuser1
+2  A: 

You really just need @ before your "(\b" and "\b)": in a regular C# string literal, "\b" is the backspace escape (a single character), not the two-character regex word-boundary token \b that you expect. I have also made another version that uses a replacement pattern instead of a full-blown MatchEvaluator method.

How about this one:

private string keywordPattern(string searchKeyword)
{
    var keywords = searchKeyword.Split(',').Select(k => k.Trim()).Where(k => k != "").Select(k => Regex.Escape(k));

    return @"\b(" + string.Join("|", keywords) + @")\b";
}

private string HighlightSearchKeyWords(string searchKeyword, string text)
{
    var pattern = keywordPattern(searchKeyword);
    Regex exp = new Regex(pattern, RegexOptions.IgnoreCase | RegexOptions.Singleline);
    return exp.Replace(text, @"<span class=""search-highlight"">$0</span>");
}

Usage:

var res = HighlightSearchKeyWords("is,this", "Is this programming? This is .net Programming.");

Result:

<span class="search-highlight">Is</span> <span class="search-highlight">this</span> programming? <span class="search-highlight">This</span> <span class="search-highlight">is</span> .net Programming.

Updated to use \b and a simplified replace pattern. (The old one used (^|\s) instead of the first \b and ($|\s) instead of the last \b, so it would also work on search terms that include non-word characters.)

Updated to your comma notation for search terms.

Updated: I forgot Regex.Escape, added now. Otherwise searches for "\w" would blow up the thing :)

Updated due to a comment ;)
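The effect of the Regex.Escape update can be illustrated with a short sketch. The helper name CountMatches and its escape flag are hypothetical; the pattern shape \b(...)\b is the one used in this answer. Without escaping, a keyword such as \w+ is interpreted as a pattern and matches every word; with escaping it is searched for literally.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of the original answer.
static class EscapeDemo
{
    // Builds \b(keyword)\b, optionally escaping the keyword, and counts matches.
    public static int CountMatches(string keyword, string text, bool escape)
    {
        string body = escape ? Regex.Escape(keyword) : keyword;
        string pattern = @"\b(" + body + @")\b";
        return Regex.Matches(text, pattern, RegexOptions.IgnoreCase).Count;
    }

    static void Main()
    {
        const string text = "Is this programming?";

        // Unescaped: \w+ acts as a regex and matches all three words.
        Console.WriteLine(CountMatches(@"\w+", text, escape: false)); // 3

        // Escaped: looks for the literal characters \w+, which the text lacks.
        Console.WriteLine(CountMatches(@"\w+", text, escape: true));  // 0
    }
}
```

Unescaped input can also make the Regex constructor throw, for example on an unbalanced "(", which is another reason to escape user-supplied keywords.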

lasseespeholt
Thanks.. it is working fine.
stackuser1
Instead of what? He uses a comma notation, so the keywords should be split up like: "\bthis|is\b".
lasseespeholt
stackuser1 -> :) But see my last update. Escaping input data is really important to do otherwise your users can break the thing :/
lasseespeholt
It will match false positives if you search with multiple keywords. `HighlightSearchKeyWords("is, blah", "This is .net Programming.")` will match both Th*is* and *is*!
Jaroslav Jandek
@Jaroslav I just tried HighlightSearchKeyWords("is,blah", "This is .net Programming."). It works fine?
lasseespeholt
@lasseespeholt: switch it up: "blah,is". Also, you won't be able to separate the values with comma and whitespace.
Jaroslav Jandek
True, but actually it has to be "blah,is" ;)
lasseespeholt
@lasseespeholt: Yeah, I edited the comment after I tried it with a compiler (can't be sure when you compile in your head :-D). Btw, the OP specifically asked that it should not highlight partial keywords.
Jaroslav Jandek
@Jaroslav hehe, it did not most of the time :D I have introduced another method to do the keyword pattern now. But the one you have has better performance.
lasseespeholt
+1  A: 

You need to enclose the keywords in a non-capturing group, otherwise you will get false positives (if you are using multiple keywords separated by commas, as indicated in the sample)!

private string EscapeKeyWords(string searchKeyWord)
{
    string[] keyWords = searchKeyWord.Split(',');
    for (int i = 0; i < keyWords.Length; i++) keyWords[i] = Regex.Escape(keyWords[i].Trim());

    return String.Join("|", keyWords);
}

private string HighlightSearchKeyWords(string searchKeyWord, string text)
{
    searchKeyWord = @"(\b(?:" + EscapeKeyWords(searchKeyWord) + @")\b)";
    Regex exp = new Regex(searchKeyWord, RegexOptions.Singleline | RegexOptions.IgnoreCase);
    return exp.Replace(text, @"<span class=""search-highlight"">$0</span>");
}
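The reason for the group is operator precedence: alternation binds more loosely than \b, so an ungrouped \bblah|is\b is read as "\bblah" or "is\b". A minimal sketch of the difference, using a hypothetical helper named Matches and the sample text from the comments:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of the original answer.
static class AlternationPrecedenceDemo
{
    // Returns every substring of text matched by pattern (case-insensitive).
    public static string[] Matches(string pattern, string text)
    {
        var found = new List<string>();
        foreach (Match m in Regex.Matches(text, pattern, RegexOptions.IgnoreCase))
        {
            found.Add(m.Value);
        }
        return found.ToArray();
    }

    static void Main()
    {
        const string text = "This is .net Programming.";

        // Ungrouped: "is" only needs a boundary after it, so Th*is* matches too.
        Console.WriteLine(string.Join(",", Matches(@"\bblah|is\b", text)));     // is,is

        // Grouped: both boundaries apply to every alternative.
        Console.WriteLine(string.Join(",", Matches(@"\b(?:blah|is)\b", text))); // is
    }
}
```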
Jaroslav Jandek
Try this one: HighlightSearchKeyWords(" blah ,, is ", "This is .net Programming.") ;) lots of spans. You need to remove empty entries.
lasseespeholt
+1 Thanks for highlighting some issues in my code which you show a solution to.
lasseespeholt
@lasseespeholt: You are right. Whitespaces should not be allowed. The solution is trivial.
Jaroslav Jandek
A: 

Hi, I'm using the following method to highlight the keywords in a given text.

private string HighlightSearchKeyWords(string searchKeyWord, string text)
{
    Regex keywordExp = new Regex(@" ?, ?");
    var pattern = @"\b(" + keywordExp.Replace(Regex.Escape(searchKeyWord), @"|") + @")\b";
    Regex exp = new Regex(pattern, RegexOptions.IgnoreCase | RegexOptions.Singleline);
    return exp.Replace(text, @"<span class=""search-highlight"">$0</span>");
}

Sample text: "What is .net Programming? Pl suggest few e-books"

Keyword: ".net"

When I search with the keyword ".net", .net is not highlighted in the sample text.

When I search with the keyword "e-books", e-books is highlighted in the sample text.

What is the problem? Can anyone please let me know exactly what I need to modify?
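A likely explanation: \b only matches between a word character and a non-word character. In "is .net" the keyword starts with ".", which is preceded by a space, so both neighbours are non-word characters and the leading \b can never match. "e-books" starts and ends with letters, so its boundaries work. One possible workaround, sketched here with a hypothetical class name and not taken from any of the answers, is to replace \b with lookarounds that only require that no word character touches the keyword:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical sketch, not code from the thread.
static class LookaroundHighlighter
{
    public static string Highlight(string searchKeyWord, string text)
    {
        // Split on commas, trim, drop empty entries, escape each keyword.
        string[] keywords = searchKeyWord
            .Split(',')
            .Select(k => k.Trim())
            .Where(k => k.Length > 0)
            .Select(k => Regex.Escape(k))
            .ToArray();

        if (keywords.Length == 0) return text; // nothing to highlight

        // (?<!\w) and (?!\w): no word character directly before or after.
        string pattern = @"(?<!\w)(?:" + string.Join("|", keywords) + @")(?!\w)";
        return Regex.Replace(text, pattern,
            @"<span class=""search-highlight"">$0</span>", RegexOptions.IgnoreCase);
    }

    static void Main()
    {
        Console.WriteLine(Highlight(".net", "What is .net Programming? Pl suggest few e-books"));
        // What is <span class="search-highlight">.net</span> Programming? Pl suggest few e-books
    }
}
```

With this pattern "is" still does not match inside "This", because the lookbehind sees the word character "h" directly before it.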

stackuser1