I have a string full of HTML, which reads:

Dim strHml As String = "<html><head><title></title></head><body><div class=""normal"">Dog</div>" & _
    "<div class=""normal"">Cat</div><div class=""normal"">Elephant</div><div class=""normal"">Giraffe</div><div class=""normal""><div><p>Random Div</p></div>Lion</div><div>Wolf</div>" & _
    "<div>Tiger</div></body></html>"

I want to pull out all the div tags with their content and put each one into an array. I have looked at the Split function and regular expressions, but no clear and easy solution has presented itself yet.

I have amended this slightly to include nested div tags. A div that contains nested tags should still be returned whole, in this format:

<div class="normal"><div><p>Random Div</p></div>Lion</div>
+2  A: 

I tested this in VB.NET using a regex.

Is that what you needed?

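' Requires: Imports System.Text.RegularExpressions
' [^>]* lets the opening tag carry attributes such as class="normal".
Dim reg As New Regex("<div[^>]*>(.*?)</div>")

Dim matches As MatchCollection = reg.Matches(strHml)

For Each mat As Match In matches
    ' mat.Value is the whole match, including the surrounding div tags.
    Dim s As String
    s = mat.Value
Next mat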
astander
That will work if there are no nested divs and the divs do not span multiple lines.
Bart Kiers
Yes, I agree. What the OP needs to specify is what they want in the case of nested divs; the multi-line problem can be removed by stripping tabs and line endings. Other than that, I would try an HTML parser: http://www.codeguru.com/vb/vb_internet/html/article.php/c4815 http://www.netomatix.com/products/Documentmanagement/HtmlParserNet.aspx
astander
Yes, html-parser++
Bart Kiers
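
For the nested case raised in the comments, one possible approach (a rough sketch, not taken from the answer above) is to use the regex only to find opening and closing div tags and to track nesting depth by hand, so that each outermost div is returned whole. The names DivExtractor and GetTopLevelDivs are hypothetical, and the sketch assumes reasonably well-formed markup: a self-closing <div/> or a div tag inside a comment or script would confuse it, which is where a real HTML parser is the safer choice.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module DivExtractor
    ' Hypothetical helper: returns each outermost <div>...</div> block,
    ' with any nested divs left inside it.
    Function GetTopLevelDivs(ByVal html As String) As List(Of String)
        Dim result As New List(Of String)
        ' Matches either an opening <div ...> tag or a closing </div> tag.
        Dim tagPattern As New Regex("<div\b[^>]*>|</div\s*>", RegexOptions.IgnoreCase)
        Dim depth As Integer = 0
        Dim startIndex As Integer = 0

        For Each m As Match In tagPattern.Matches(html)
            If m.Value.StartsWith("</") Then
                If depth = 0 Then Continue For ' stray closing tag, ignore it
                depth -= 1
                If depth = 0 Then
                    ' The outermost div just closed: take everything from its opening tag.
                    result.Add(html.Substring(startIndex, m.Index + m.Length - startIndex))
                End If
            Else
                If depth = 0 Then startIndex = m.Index ' remember where the outermost div starts
                depth += 1
            End If
        Next
        Return result
    End Function

    Sub Main()
        ' Shortened version of the string from the question, with a line break in it.
        Dim strHml As String = "<body><div class=""normal"">Dog</div>" & vbCrLf & _
            "<div class=""normal""><div><p>Random Div</p></div>Lion</div><div>Wolf</div></body>"
        For Each d As String In GetTopLevelDivs(strHml)
            Console.WriteLine(d)
        Next
    End Sub
End Module
```

Because the content between tags is never matched by the regex itself, line breaks inside a div do not matter here. Calling ToArray() on the returned list would give the array the question asks for.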