I need to parse an HTML document and extract every H1 tag together with all the HTML between one H1 tag and the next. I have been experimenting with HtmlAgilityPack and have had some success. I can select all the H1 tags using:
foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//h1"))
But how do I extract all the HTML that follows each H1 tag, up to the next H1 tag? That HTML could contain anything found on a page (tables, images, links, and so on) except another H1 tag.
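To make the goal concrete, here is a rough sketch of the kind of result I am after. It walks each H1's following siblings until it reaches the next H1. It assumes all H1 tags share the same parent element, which may not hold for real pages, and the sample markup is made up purely for illustration.

```csharp
using System;
using System.Text;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Hypothetical sample input, only for illustration.
        var doc = new HtmlDocument();
        doc.LoadHtml("<body><h1>One</h1><p>a</p><table><tr><td>x</td></tr></table>" +
                     "<h1>Two</h1><p>b</p></body>");

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection headings = doc.DocumentNode.SelectNodes("//h1");
        if (headings == null) return;

        foreach (HtmlNode h1 in headings)
        {
            var section = new StringBuilder();

            // Collect every sibling after this H1 until the next H1 (or the end).
            // Assumption: the H1 tags are siblings under one parent.
            for (HtmlNode sibling = h1.NextSibling;
                 sibling != null &&
                 !sibling.Name.Equals("h1", StringComparison.OrdinalIgnoreCase);
                 sibling = sibling.NextSibling)
            {
                section.Append(sibling.OuterHtml);
            }

            Console.WriteLine(h1.InnerText + ": " + section);
        }
    }
}
```

This breaks down as soon as the H1 tags are nested at different depths (for example, each inside its own div), which is the general case I would like to handle.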
Thanks in advance.