Assume the web page is written with correct, well-formed tags. Most web pages can be viewed as a DOM tree, so how can I convert one to an XML file?
JTidy reads HTML and presents it as a DOM. Once you have your HTML as a DOM you should be able to process it and write it out as XML.
To output a DOM, see the example code here and the XMLSerializer in particular.
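As a rough sketch of the "write it out as XML" step, the following uses the JDK's built-in `javax.xml.transform` API rather than the `XMLSerializer` mentioned above, since it needs no extra libraries. The class name `DomToXml` and the helper `sampleDocument()` are hypothetical; the helper only stands in for the `org.w3c.dom.Document` that an HTML parser would hand back. With JTidy, that document would presumably come from its `Tidy` class (for example via `parseDOM`), so check the JTidy documentation for the exact call.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DomToXml {

    // Serializes any org.w3c.dom.Document to an XML string.
    static String toXml(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        // Force XML output; with an <html> root some transformers may
        // otherwise choose the HTML output method.
        transformer.setOutputProperty(OutputKeys.METHOD, "xml");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Hypothetical stand-in for the DOM an HTML parser would produce.
    static Document sampleDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element html = doc.createElement("html");
        Element head = doc.createElement("head");
        Element title = doc.createElement("title");
        title.setTextContent("Example");
        head.appendChild(title);
        html.appendChild(head);
        doc.appendChild(html);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toXml(sampleDocument()));
    }
}
```

To write a file instead of a string, pass `new StreamResult(new java.io.File("page.xml"))` to `transform`; the file name here is only an example.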
Brian Agnew
2009-12-14 10:19:52
Thanks for the link. How do I convert it to an XML file?
2009-12-14 10:23:16