Assume the webpage is coded with correct, well-formed tags. I think most webpages can be viewed as a DOM tree, so how can I convert such a page to an XML file?

A: 

JTidy reads HTML and presents it as a DOM. Once you have your HTML as a DOM you should be able to process it and write it out as XML.

To output a DOM, see the example code here and the XMLSerializer in particular.
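As a rough sketch of the "write it out as XML" step, the code below serializes an `org.w3c.dom.Document` with the JDK's built-in `javax.xml.transform` API rather than `XMLSerializer`; that substitution is mine, chosen so the example has no external dependencies. The class name `DomToXml` and the helper `sampleDocument()` are hypothetical. The hand-built document only stands in for the DOM you would get from JTidy, for example from `new Tidy().parseDOM(inputStream, null)`.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DomToXml {

    // Serialize any W3C DOM Document to an XML string using an identity transform.
    static String toXml(Document doc) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.METHOD, "xml");
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Hypothetical stand-in for the DOM that JTidy would produce from a real page.
    static Document sampleDocument() throws ParserConfigurationException {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element html = doc.createElement("html");
        Element head = doc.createElement("head");
        Element title = doc.createElement("title");
        title.setTextContent("Hello");
        Element body = doc.createElement("body");
        Element p = doc.createElement("p");
        p.setTextContent("Hi");
        head.appendChild(title);
        body.appendChild(p);
        html.appendChild(head);
        html.appendChild(body);
        doc.appendChild(html);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toXml(sampleDocument()));
    }
}
```

To write straight to a file instead of a string, pass `new StreamResult(new java.io.File("page.xml"))` to `transform`; the file name here is only an example.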

Brian Agnew
Thanks for the link. How do I convert it to an XML file?
Senthil