Before writing this question, I was trying to solve the following:

// 1. navigate to page
// 2. wait until page is downloaded
// 3. read and write some data from/to iframe 
// 4. submit (post) form

The problem was that if an iframe exists on a web page, the DocumentCompleted event fires more than once (once after each document completes). It was highly likely that the program would try to read data from a DOM that was not yet complete and, naturally, fail.

But while writing this question, a "what if" idea struck me, and I fixed the problem I was trying to solve. As I had failed to find this by Googling, I thought it would be nice to post it here.

    private int iframe_counter = 1; // needs to be 1, to pass DCF test
    public bool isLazyMan = default(bool);

    /// <summary>
    /// LOCK to stop inspecting DOM before DCF
    /// </summary>
    public void waitPolice() {
        while (isLazyMan) Application.DoEvents();
    }

    private void webBrowser1_Navigating(object sender, WebBrowserNavigatingEventArgs e) {
        if(!e.TargetFrameName.Equals(""))
            iframe_counter --;
        isLazyMan = true;
    }

    private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e) {
        if (!((WebBrowser)sender).Document.Url.Equals(e.Url))
            iframe_counter++;
        if (((WebBrowser)sender).Document.Window.Frames.Count <= iframe_counter) {//DCF test
            DocumentCompletedFully((WebBrowser)sender,e);
            isLazyMan = false; 
        }
    }

    private void DocumentCompletedFully(WebBrowser sender, WebBrowserDocumentCompletedEventArgs e){
        //code here
    }

For now at least, my five-minute hack seems to be working fine.

Maybe I am just failing at searching Google or MSDN, but I cannot find an answer to: "How to use the WebBrowser control's DocumentCompleted event in C#?"

Remark: After learning a lot about the WebBrowser control, I found that it does funky stuff.

Even if you detect that the document has completed, in most cases it won't stay that way forever. A page can be updated in several ways: a frame refresh, an AJAX-like request, or a server-side push (which requires some control that supports asynchronous communication and has HTML or JavaScript interop). Also, some iframes will never load, so it's not a good idea to wait for them forever.
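Since some frames may never finish, the wait loop probably needs an upper bound. Below is a minimal sketch of one way to do that, not part of the original code: `BoundedWait` and `WaitUntil` are hypothetical names, and the `pump` delegate stands in for whatever keeps the message loop alive (in a WinForms app, presumably `Application.DoEvents`).

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper (not from the original post): waits for a condition,
// but gives up once the timeout has elapsed instead of looping forever.
static class BoundedWait
{
    // isReady: returns true once the DOM is considered safe to inspect.
    // timeout: maximum time to keep waiting.
    // pump:    called between polls so pending events can be processed.
    // Returns true if isReady became true, false if the timeout expired first.
    public static bool WaitUntil(Func<bool> isReady, TimeSpan timeout, Action pump)
    {
        var clock = Stopwatch.StartNew();
        while (!isReady())
        {
            if (clock.Elapsed >= timeout)
                return false;
            pump();
        }
        return true;
    }
}
```

Assuming the `isLazyMan` flag from the code above, a call could look like `BoundedWait.WaitUntil(() => !isLazyMan, TimeSpan.FromSeconds(30), Application.DoEvents)`; the 30 seconds is an arbitrary choice, and the caller decides what to do when the result is false.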

I ended up using:

if (e.Url != wb.Url)
A: 

I had to do something similar. What I do is use SHDocVw directly (adding references to all the necessary interop assemblies to my project). Then, instead of adding the WebBrowser control to my form, I add the AxSHDocVw.AxWebBrowser control.

To navigate and wait, I use the following method:

private void GotoUrlAndWait(AxWebBrowser wb, string url)
{
    object dummy = null;
    wb.Navigate(url, ref dummy, ref dummy, ref dummy, ref dummy);

    // Wait for the control to be initialized and ready.
    while (wb.ReadyState != SHDocVw.tagREADYSTATE.READYSTATE_COMPLETE)
        Application.DoEvents();
}
Thorsten Dittmar
Just a note on this: it will fail if the page is using Ajax, as the page will never be "Complete".
Kyle Rozendo
+2  A: 

I have yet to find a working solution to this problem online. Hopefully this will make it to the top and save everyone the months of tweaking I spent trying to solve it, along with the edge cases associated with it. I have fought with this issue over the years as Microsoft has changed the implementation and reliability of IsBusy and document.readyState. With IE8, I had to resort to the following solution. It's similar to the question/answer from Margus, with a few exceptions. My code handles nested frames, JavaScript/AJAX requests and meta-redirects. I have simplified the code for clarity's sake, but I also use a timeout function (not included) to reset the webpage if domAccess still equals false after 5 minutes.

private void m_WebBrowser_BeforeNavigate(object pDisp, ref object URL, ref object Flags, ref object TargetFrameName, ref object PostData, ref object Headers, ref bool Cancel)
{
    //Javascript Events Trigger a Before Navigate Twice, but the first event 
    //will contain javascript: in the URL so we can ignore it.
    if (!URL.ToString().ToUpper().StartsWith("JAVASCRIPT:"))
    {
        //indicate the dom is not available
        this.domAccess = false;
        this.activeRequests.Add(URL);
    }
}

private void m_WebBrowser_DocumentComplete(object pDisp, ref object URL) 
{

    this.activeRequests.RemoveAt(0);

    //if pDisp Matches the main activex instance then we are done.
    if (pDisp.Equals((SHDocVw.WebBrowser)m_WebBrowser.ActiveXInstance)) 
    {
        //Top Window has finished rendering 
        //Since it will always render last, clear the active requests.
        //This solves Meta Redirects causing out of sync request counts
        this.activeRequests.Clear();
    }
    else if (m_WebBrowser.Document != null)
    {
        //Some iframe completed dom render
    }

    //Record the final complete URL for reference
    if (this.activeRequests.Count == 0)
    {
        //Finished downloading page - dom access ready
        this.domAccess = true;
    }
}
Marcus Pope
could you perhaps elaborate on the differences to previous IE versions?
peterchen
Marcus Pope
And if I recall correctly, AJAX requests did not trigger these events properly until IE6, and since IE7 or IE8 they now trigger duplicate BeforeNavigate events. And don't even bother with the NavigateComplete or DownloadComplete events, as they won't help you determine the completion status of a navigation lifecycle.
Marcus Pope
+2  A: 

You might want to know about the AJAX calls as well.

Consider using this:

private void webBrowser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    string url = e.Url.ToString();
    if (!(url.StartsWith("http://") || url.StartsWith("https://")))
    {
        // in AJAX
    }

    if (e.Url.AbsolutePath != this.webBrowser.Url.AbsolutePath)
    {
        // REAL DOCUMENT COMPLETE
    }
    else
    {
        // IFRAME 
    }
}
Yuki
+1, but the else branch should contain the REAL DOCUMENT COMPLETE case, whereas the if branch is the IFRAME case.
pug