How can I parse HTML tags using C++?

For example:

<html><body>example text </body></html>
+5  A: 

The easiest option would be to use an HTML parsing library. libxml2 is a solid open-source one, although it's technically a C library. You'd need to load your HTML and then walk through the DOM, pulling out all the text() nodes. I'm not sure I'd recommend this as a first C++ task, though.

easel
But there isn't even a simple sample program to try it out?
llal
They have a tutorial at http://www.xmlsoft.org/tutorial/ar01s05.html. You could also just scan for the < and > characters and extract everything that's not inside a tag. If this is a homework problem, that's probably the solution they are looking for. I'm not going to write it for you.
easel
@llal: http://www.xmlsoft.org/examples/index.html
Merlyn Morgan-Graham
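
For reference, the "extract everything that's not inside a tag" idea from the comments can be sketched with nothing but the standard library. This is only a rough illustration, not a real HTML parser: the function name strip_tags is made up for this example, and it assumes that < and > appear only as tag delimiters, so comments, script bodies, and attribute values containing > are not handled correctly.

```cpp
#include <string>

// Hypothetical helper (not from the thread): returns a copy of `html`
// with everything between '<' and '>' dropped.
// Assumption: '<' and '>' occur only as tag delimiters.
std::string strip_tags(const std::string& html) {
    std::string text;
    bool in_tag = false;  // true while we are between '<' and '>'
    for (char c : html) {
        if (c == '<') {
            in_tag = true;       // a tag starts; stop copying
        } else if (c == '>') {
            in_tag = false;      // the tag ends; resume copying
        } else if (!in_tag) {
            text += c;           // ordinary text outside any tag
        }
    }
    return text;
}
```

Calling strip_tags on the sample input from the question returns just the text between the body tags, including its trailing space. For anything beyond simple, well-behaved input, a real parser such as libxml2 is the safer choice.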