I have a WebBrowser control that I navigate to a URL which serves this HTML:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
    <meta http-equiv="content-type" content="text/html;charset=utf-8" />
    <title></title>
</head>
<body marginheight="60" topmargin="60">
    <p align="center"><img src="nocontent.jpg" alt="" height="434" width="525" border="0" /></p>
</body>
</html>

But when I use this code to fetch the source:

HTMLDocument objHtmlDoc = (HTMLDocument)browser.Document.DomDocument;
string pageSource = objHtmlDoc.documentElement.innerHTML;
Console.WriteLine(pageSource);

This is the result:

<HEAD><TITLE></TITLE>
<META content=text/html;charset=utf-8 http-equiv=content-type></HEAD>
<BODY topMargin=60 marginheight="60">
<P align=center><IMG border=0 alt="" src="nocontent.jpg" width=525 height=434></P></BODY>

This is no good for further processing. How can I make sure I get the same source as when I right-click the page and select "View Source"?

+6  A: 

Use browser.DocumentText to obtain the source HTML.

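Going through the HTMLDocument class makes the browser generate HTML from its parsed model of the document (the DOM) rather than return the original source, which is why the tag case, attribute order and quoting in your output differ from the file.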
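
A minimal sketch of how that might look, assuming a Windows Forms application and that the page has finished loading before the text is read. The URL and the class name ViewSourceSketch are placeholders for illustration, not taken from the question:

```csharp
using System;
using System.Windows.Forms;

static class ViewSourceSketch
{
    [STAThread] // the WebBrowser control requires a single-threaded apartment
    static void Main()
    {
        // Hypothetical URL; substitute the page you actually navigate to.
        const string url = "http://example.com/nocontent.html";

        var form = new Form();
        var browser = new WebBrowser { Dock = DockStyle.Fill };
        form.Controls.Add(browser);

        // Read the source only after the document has loaded.
        // Note: on pages with frames this event can fire more than once.
        browser.DocumentCompleted += (sender, e) =>
        {
            // DocumentText holds the HTML as received, not a DOM serialization.
            string pageSource = browser.DocumentText;
            Console.WriteLine(pageSource);
        };

        browser.Navigate(url);
        Application.Run(form);
    }
}
```

Reading DocumentText inside the DocumentCompleted handler avoids picking up an empty or partial document while navigation is still in progress.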

Adam Robinson