How can I get the content from HTML, removing the elements (tags) around it?
I am looking for an example using VB6.
You can use a regular expression: build a pattern and extract the data you want from the HTML. This page explains how to use regular expressions in VB6: http://www.regular-expressions.info/vb.html
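As a rough sketch of the regular-expression approach, something like the following could work for simple, well-formed markup. It assumes the VBScript.RegExp COM object is available (it ships with Windows Script), and StripTags is just a name made up for this example. It does not decode entities such as &amp;, it leaves the contents of script and style blocks in place, and it breaks if a ">" appears inside an attribute value.

```vb
' Hypothetical helper: removes anything that looks like an HTML tag.
Public Function StripTags(ByVal Html As String) As String
    Dim re As Object
    ' Late-bound, so no project reference is needed.
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True        ' replace every match, not just the first
    re.IgnoreCase = True
    re.Pattern = "<[^>]+>"  ' "<", one or more characters that are not ">", then ">"
    StripTags = re.Replace(Html, "")
End Function
```

For example, StripTags("<p>Hello <b>world</b></p>") returns "Hello world".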
The HTML may be malformed, which makes it very difficult to remove the tags reliably with regular expressions. An alternative is to load Internet Explorer as a COM object in VB, load the HTML document into it, and use it to walk through the interpreted element tree.
You can use Internet Explorer's HTML parser (MSHTML) as a COM object, without showing anything on screen. For example, to get a plain-text version of the HTML:
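Public Function Html2Text(ByVal Data As String) As String
    Dim obj As Object
    ' Errors are ignored, so an empty string is returned if MSHTML is unavailable.
    On Error Resume Next
    ' "htmlfile" creates an in-memory MSHTML document; no browser window is shown.
    Set obj = CreateObject("htmlfile")
    obj.Open
    obj.Write Data
    obj.Close
    ' InnerText is the text of the body with all tags removed.
    Html2Text = obj.Body.InnerText
    Set obj = Nothing
End Function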
You could also walk the element tree to do something more complicated.
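A minimal sketch of walking the tree, using the same "htmlfile" object, might look like this. GetTagTexts is a hypothetical name chosen for this example; it collects the text of every element with a given tag name and assumes MSHTML is installed.

```vb
' Hypothetical helper: returns the text of every element with the given tag name.
Public Function GetTagTexts(ByVal Html As String, ByVal TagName As String) As Collection
    Dim doc As Object
    Dim el As Object
    Dim result As Collection

    Set result = New Collection

    ' Parse the HTML into an in-memory document.
    Set doc = CreateObject("htmlfile")
    doc.Open
    doc.Write Html
    doc.Close

    ' Visit each matching element in document order and keep its text.
    For Each el In doc.getElementsByTagName(TagName)
        result.Add el.innerText
    Next el

    Set GetTagTexts = result
End Function
```

Calling GetTagTexts(html, "li") would give you one entry per list item. The same loop can read other properties of each element, such as tagName or innerHTML, if you need more than the text.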
Credit: Karl Peterson in Visual Studio Magazine.