I have a WinForms application, and I want to scrape part of an HTML web page and save it to a local HTML file.

I have one local file, "empty.htm" (containing just "I'm empty" in the body), one remote web page, and two WebBrowser controls. WebBrowser1 navigates to the remote page, WebBrowser2 to the local file. Both display their content appropriately.

Now I try:

    string rootIDToCopy = "InterestingDivID";

    HtmlDocument htmlDocument = webBrowser1.Document;
    HtmlElement rootElementToCopy = htmlDocument.GetElementById(rootIDToCopy);

    if (rootElementToCopy != null)
    {
        HtmlDocument dest = webBrowser2.Document;
        if (dest != null)
        {
            HtmlElement destBody = dest.Body;  // Point 1

            destBody.AppendChild(rootElementToCopy); // Point 2
        }
    }

At Point 1, I can see that destBody exists, has no children, and has an InnerHtml of "I'm empty". rootElementToCopy appears valid (it has three children and a sensible InnerHtml). However, at Point 2 I get "Value does not fall within the expected range" (probably from Windows.Forms.UnsafeNativeMethods.IHTMLElement2.InsertAdjacentElement).

Help will be appreciated!

+1  A: 

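You may not be allowed to do that: in the DOM specification, every node belongs to the document that created it (its ownerDocument), and inserting it into a different document raises WRONG_DOCUMENT_ERR.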

Instead I think you might have to serialize the subtree to a flat string format before you try to insert it into a different document.

ChrisW
Thank you! This did the trick: destBody.InnerHtml = rootElementToCopy.OuterHtml; I don't like changing modalities like this, though: now it's a tree of nodes, now it's a string :-(
Avi
I say "may" because I don't know whether "HtmlDocument" obeys this aspect of the DOM specification; maybe it does, except that it's throwing the wrong exception type.
ChrisW
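
For reference, the serialize-then-reparse approach from the answer and Avi's comment can be sketched as a small helper. This is only a sketch: the class name HtmlFragmentCopier and the methods CopyById and SaveFragment are hypothetical names introduced here, and the code assumes both WebBrowser controls have finished loading their documents. Note that assigning InnerHtml replaces the existing body content ("I'm empty") rather than appending to it.

```csharp
using System.IO;
using System.Windows.Forms;

// Hypothetical helper class, not part of WinForms.
static class HtmlFragmentCopier
{
    // Copies the element with the given id from one WebBrowser document
    // into the body of another by going through its markup string.
    public static bool CopyById(HtmlDocument source, HtmlDocument dest, string id)
    {
        if (source == null || dest == null || dest.Body == null)
            return false;

        HtmlElement element = source.GetElementById(id);
        if (element == null)
            return false;

        // OuterHtml serializes the element and its subtree to a string.
        // Assigning InnerHtml makes the destination document parse that
        // string into nodes of its own, so no node crosses documents.
        dest.Body.InnerHtml = element.OuterHtml;
        return true;
    }

    // Writes the fragment to a local file, wrapped in a minimal page.
    // The wrapper markup is an assumption; adjust it as needed.
    public static void SaveFragment(HtmlElement element, string path)
    {
        File.WriteAllText(path,
            "<html><body>" + element.OuterHtml + "</body></html>");
    }
}
```

With the controls from the question, a call would look like HtmlFragmentCopier.CopyById(webBrowser1.Document, webBrowser2.Document, "InterestingDivID"), made after both DocumentCompleted events have fired. If the only goal is the local file, SaveFragment avoids the second WebBrowser entirely.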
