I want the same as WebBrowser.Document.Body.InnerHtml, but as an XML representation.

A: 

TidyCOM will clean up HTML to XHTML.

Here's how to use it from C#.

Winston Smith
A: 

IE's document has an expando property named "XMLDocument". You can access it via its IDispatchEx interface.

You can get the document's COM interface via Document.DomDocument.

Sheng Jiang 蒋晟
+2  A: 

Are you using WebBrowser to browse an XML document and want to get to that XML in code, or are you trying to browse to an HTML page and represent HTML as XML?

If the former, you can likely just get the raw text from the WebBrowser (maybe InnerText instead of InnerHtml) and parse it as XML.
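
As a rough sketch of the "parse it as XML" step, assuming you already have the text in a string: `rawText` below is a hypothetical stand-in for whatever you read from the control, and the `order`/`item` document is made up for illustration. `XDocument.Parse` is the standard LINQ to XML call.

```csharp
using System;
using System.Xml.Linq;

public static class ParseBrowserText
{
    // Parses the given text as XML and returns the root element's name.
    // Throws System.Xml.XmlException if the text is not well-formed XML.
    public static string RootName(string text)
    {
        XDocument doc = XDocument.Parse(text);
        return doc.Root.Name.LocalName;
    }

    public static void Main()
    {
        // Hypothetical stand-in for the text pulled from the WebBrowser control.
        string rawText = "<?xml version=\"1.0\"?><order id=\"7\"><item>widget</item></order>";

        XDocument doc = XDocument.Parse(rawText);
        Console.WriteLine(RootName(rawText));               // order
        Console.WriteLine(doc.Root.Element("item").Value);  // widget
    }
}
```

Whether the text you get back from the control is the untouched XML source depends on how the browser rendered it, so check what the string actually contains before parsing.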

If the latter, the problem is that HTML isn't XML (unless it's XHTML).

You can convert it to XML with 'tidy' tools, but the accuracy of the representation depends on how well-formed the original HTML is.
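
To see why ordinary HTML can't simply be handed to an XML parser, here is a minimal sketch. `IsWellFormedXml` is a hypothetical helper name, and the two markup strings are invented examples: the XHTML-style one parses, while the one with an unclosed `<br>` and `<p>` does not.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

public static class WellFormedCheck
{
    // Returns true when the markup parses as XML, false when the parser rejects it.
    public static bool IsWellFormedXml(string markup)
    {
        try
        {
            XDocument.Parse(markup);
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    public static void Main()
    {
        // Every element is closed, so this is valid as XML.
        string xhtmlStyle = "<html><body><p>Hello<br/>world</p></body></html>";
        // Typical HTML: <br> and <p> are never closed.
        string tagSoup = "<html><body><p>Hello<br>world</body></html>";

        Console.WriteLine(IsWellFormedXml(xhtmlStyle)); // True
        Console.WriteLine(IsWellFormedXml(tagSoup));    // False
    }
}
```

A check like this only tells you whether a tidy pass is needed; it doesn't repair anything itself.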

tjmoore