Hi,

I currently have a program that finds and edits HTML files based on finding a tag with a matching id.

I would like to extend it to find a tag whose InnerHtml matches a given string, disregarding capitalization and whitespace.

What is a good way to do this with Html Agility Pack? I would like to use it because the rest of the program already does.

Thanks.

+1  A: 

This is rough and untested, but you should be able to do something like this:

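            // Needs: using System.Text.RegularExpressions;
            HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("YOUR_TAG_SELECTOR");

            // SelectNodes returns null (not an empty collection) when nothing matches
            if (nodes != null)
            {
                foreach (HtmlNode node in nodes)
                {
                    // Strip all whitespace and lowercase; "YOUR_MATCH" must itself be
                    // lowercase with no whitespace for the comparison to succeed
                    string normalized = Regex.Replace(node.InnerHtml, @"\s+", "").ToLowerInvariant();
                    if (normalized == "YOUR_MATCH")
                    {
                        //success routine
                        break;
                    }
                }
            }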
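If the comparison should ignore case and whitespace on both sides, it can help to run the node's InnerHtml and the target string through the same normalization. The following is only a minimal sketch: `InnerHtmlComparer`, `Normalize`, and `Matches` are hypothetical names made up for illustration and are not part of Html Agility Pack.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InnerHtmlComparer
{
    // Hypothetical helper (not an Html Agility Pack API).
    // Removes every whitespace character, then lowercases with the invariant culture.
    public static string Normalize(string html)
    {
        if (html == null)
        {
            return string.Empty;
        }
        return Regex.Replace(html, @"\s+", "").ToLowerInvariant();
    }

    // True when both strings are equal after normalization.
    public static bool Matches(string innerHtml, string expected)
    {
        return string.Equals(Normalize(innerHtml), Normalize(expected), StringComparison.Ordinal);
    }
}
```

Inside the loop you would then call something like `InnerHtmlComparer.Matches(node.InnerHtml, "YOUR_MATCH")`. Note that removing all whitespace also makes "a b" equal to "ab"; if that is too loose, collapsing runs of whitespace to a single space is an alternative.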
Pat
I think this should be node.InnerHtml, not node.InnerText :)
Alex Baranosky
Ah yes, my apologies; I read "matching text" in the original question. Corrected.
Pat
+1  A: 

We've done this using Regular Expressions. Something like this works for us:

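// Needs: using System.Collections.Generic; using System.Text.RegularExpressions; using HtmlAgilityPack;
private static List<HtmlNode> GetMatchingNodes(string xPath, string pattern, HtmlDocument htmlDocument)
{
    List<HtmlNode> matchingNodes = new List<HtmlNode>();
    HtmlNodeCollection nodes = htmlDocument.DocumentNode.SelectNodes(xPath);
    // SelectNodes returns null when the XPath matches nothing
    if (nodes == null)
    {
        return matchingNodes;
    }
    foreach (HtmlNode node in nodes)
    {
        // IgnoreCase covers capitalization; whitespace tolerance has to come from the pattern (e.g. \s*)
        if (Regex.IsMatch(node.InnerHtml, pattern, RegexOptions.IgnoreCase))
        {
            matchingNodes.Add(node);
        }
    }
    return matchingNodes;
}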
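For the `pattern` argument, one possible approach is to build the regular expression from the literal text you are looking for, so that case and the amount of whitespace do not matter. This is a sketch under assumptions: `InnerHtmlPattern.Build` is a hypothetical helper invented here, and it treats any run of whitespace in the target as "one or more whitespace characters" rather than removing whitespace entirely.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class InnerHtmlPattern
{
    // Hypothetical helper (not part of Html Agility Pack or .NET).
    // Builds a pattern that matches `expected` as a whole string,
    // ignoring case and differences in the amount of whitespace.
    public static string Build(string expected)
    {
        // Passing a null separator array splits on whitespace characters
        string[] tokens = expected.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);

        // Escape each token so regex metacharacters in the HTML are matched literally
        string body = string.Join(@"\s+", tokens.Select(Regex.Escape));

        // (?i) turns on case-insensitive matching; leading and trailing whitespace is optional
        return @"(?i)^\s*" + body + @"\s*$";
    }
}
```

A call might then look like `GetMatchingNodes("//p", InnerHtmlPattern.Build("<b>hello world</b>"), htmlDocument)`, where the XPath and the target string are placeholders.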

Hope this helps. :)

Scott Ferguson