How to know when a web page is loaded when using QtWebKit?

+3 Q:

Both QWebFrame and QWebPage have a void loadFinished(bool ok) signal which can be used to detect when a web page is completely loaded. The problem arises when a web page has some content loaded asynchronously (ajax). How can I know when the page is completely loaded in this case?

How are you defining completely loaded?

Is a page completely loaded when no ajax code is currently running? (Even if ajax code might run in the future?)

Is a page completely loaded when no ajax code will run in the future?

What would you do differently having this information? (Why does it matter?)

Bill 2009-08-20 19:13:06

see my answer for answers to your questions

Piotr Dobrogost 2009-08-21 10:18:08

@Bill

Is a page completely loaded when no ajax code is currently running? (Even if ajax code might run in the future?)

Yes. In my case, if there is an ajax call, it's only one call per page, triggered when the user submits a form. No timers, no further ajax calls.

Is a page completely loaded when no ajax code will run in the future?

See my answer to the first question.

What would you do differently having this information? (Why does it matter?)

See my answer to the first question.
I have to extract data when it's loaded so I have to know when it's finished loading.

Piotr Dobrogost 2009-08-21 10:17:09

+1 A:

When your initial html/images/etc finishes loading, that's it. It is completely loaded. This fact doesn't change if you then decide to use some javascript to get some extra data, page views or whatever after the fact.

That said, what I suspect you want to do here is expose a QtScript object/interface to your view that you can invoke from your page's script, effectively providing a "callback" into your C++ once you've decided (from the page script) that you have "completely loaded".

Hope this helps give you a direction to try...

Shaun 2009-08-25 13:22:20

That's not my page and I can't change the script it contains, so I can't call my script/code from it. The ajax call is triggered by clicking on a data row (I'm simulating these clicks programmatically) and retrieves additional data. I need to read this data after it's loaded, so I need to know when loading is completed.

Piotr Dobrogost 2009-08-26 08:09:08

So you're trying to use Qt to perform some kind of cross-site scripting? I don't think that's going to work. The only idea that comes to mind is that you might watch for the HTTP statuses themselves, which you'd start "watching" after the loaded event.

Shaun 2009-08-26 12:57:32

Why do you think it's not going to work? With QtWebKit you have full access to the DOM and JavaScript of the page, and you can even call your own JavaScript. You have full access to the network layer as well. What more do you need? What do you need HTTP statuses for? It's very low-level stuff. All I want and need is to be able to simulate a user's actions in the same environment the user has during normal browsing. Which of the features I need for this does QtWebKit lack, in your view?

Piotr Dobrogost 2009-08-28 18:16:24

You said it was someone else's page. I may have misunderstood, but I take it to mean offsite/off-domain? QtWebKit is still sandboxed IIRC, and what you're describing sounds a bit like cross-site scripting. You don't want to know when the _page_ has loaded, you want to know when _someone else's_ script has finished executing. The only thing I can think of would be to watch the HTTP statuses for a hint of when data is going through the request/response cycle.

Shaun 2009-08-28 18:34:04

Maybe you are right and I just can't see it :) I'm not sure what you mean by *cross-site scripting* though. There is only one site here and my app is browsing it the same way a real user would. So-called *web scraping*, if you will. When you write that I want to know *when someone else's script has finished executing*, you are right. I have to know this because it ends with data being downloaded, and I need this data to work with. To be more precise, the js function only **initiates** the data download (that's the whole idea of ajax) and I have to know when this data has been downloaded.

Piotr Dobrogost 2009-08-28 20:12:53

+1 A:

I haven't actually done this, but I think you may be able to achieve your solution using QNetworkAccessManager.

You can get the QNetworkAccessManager from your QWebPage using the networkAccessManager() function. QNetworkAccessManager has a signal, finished(QNetworkReply *reply), which is emitted whenever a network request made on behalf of the QWebPage instance finishes.

The finished signal gives you a QNetworkReply instance, from which you can get a copy of the original request made, in order to identify the request.

So, create a slot to attach to the finished signal, use the passed-in QNetworkReply's methods to figure out which file has just finished downloading and if it's your Ajax request, do whatever processing you need to do.
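The matching step can be sketched independently of Qt. This is only an illustration under stated assumptions: `AjaxReplyWatcher` and the `"/details"` URL fragment are hypothetical names, not part of QtWebKit. In a real application, a slot connected to QNetworkAccessManager::finished(QNetworkReply*) would forward the reply's URL and error state to it.

```cpp
#include <functional>
#include <string>
#include <utility>

// Toolkit-independent sketch (hypothetical class): decides whether a finished
// reply is the Ajax call we are waiting for. A slot connected to
// QNetworkAccessManager::finished(QNetworkReply*) could forward
// reply->url().toString() and (reply->error() == QNetworkReply::NoError).
class AjaxReplyWatcher {
public:
    using Callback = std::function<void(const std::string& url)>;

    // urlFragment is an assumed marker (e.g. "/details") identifying the
    // Ajax endpoint; onMatch runs when a matching reply succeeds.
    AjaxReplyWatcher(std::string urlFragment, Callback onMatch)
        : fragment_(std::move(urlFragment)), onMatch_(std::move(onMatch)) {}

    // Returns true if this reply was the awaited Ajax response.
    bool replyFinished(const std::string& url, bool succeeded) {
        if (!succeeded || url.find(fragment_) == std::string::npos)
            return false;  // unrelated resource (image, CSS, ...) or a failure
        if (onMatch_)
            onMatch_(url);
        return true;
    }

private:
    std::string fragment_;
    Callback onMatch_;
};
```

Matching on a URL fragment is just one possible way to identify the request; the request's method or headers could serve the same purpose.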

My only caveat is that I've never done this before, so I'm not 100% sure that it would work.

Another alternative might be to use QWebFrame's methods to insert objects into the page's object model and also insert some JavaScript which then notifies your object when the Ajax request is complete. This is a slightly hackier way of doing it, but should definitely work.

EDIT:

The second option seems better to me. The workflow is as follows:

Attach a slot to the QWebFrame::javaScriptWindowObjectCleared() signal. At this point, call QWebFrame::evaluateJavaScript() to add code similar to the following: window.onload = function() { // page has fully loaded }

Put whatever code you need in that function. You might want to add a QObject to the page via QWebFrame::addToJavaScriptWindowObject() and then call a function on that object. This code will only execute when the page is fully loaded.

Hopefully this answers the question!

Rob Knight 2009-08-26 11:16:03

Your line of thinking is good. It's what I'm doing now and it has been working till now. Now, I have a problem because I call some js function after receiving finished() signal and it has no effect although it should. When I call the same js function manually using a button on a form with a view of my page it works as it should - it sends a post asking about additional data. I guess waiting only on finished() signal of QNAM is not enough as after receiving data QWebFrame has to modify DOM and maybe do other things before it's ready to handle js calls. Please update your answer to reflect this.

Piotr Dobrogost 2009-08-28 11:17:55

Can you give me some more information about exactly what you're trying to achieve? As far as I understand it, it's this:

1) You load a page (via QWebView->load() or some other method)
2) When the page content (the HTML) has been received, QNAM fires the finished() signal
3) When the whole page - including JS files, CSS and images - has finished loading, the QWebPage object fires the loadFinished() signal
4) At some later point, extra data is loaded via Ajax

And you want to know when #4 has happened? Please explain further and I may be able to answer your question fully.

Rob Knight 2009-08-28 13:49:51

ad 1. Not exactly. I'm using QWebFrame::load as I don't need the rendering phase at all. Currently, however, I **am** using QWebView::setPage to see what the page looks like, but this is only for debugging purposes.

ad 2. I'm not interested in the html alone so I don't use this signal here.

ad 3. Yes.

ad 4. Yes. The moment of this ajax call is strictly defined; it happens the moment the user clicks on part of a data row. Here is my problem. I'm calling the same js function with evaluateJavaScript and nothing happens; no network request is being sent (I'm monitoring all requests QNAM is sending). TBC

Piotr Dobrogost 2009-08-28 18:00:43

CONTINUED Essentially you already answered my original question. However I suspect that the problem I have now is somehow strongly connected with the answer to the original question. I think that without solving my current problem we can't say with 100% certainty that waiting for finished signal of QNAM is enough to be sure we have loaded and **working** page. By *working* I mean responding to js calls with the same effects as user can observe during normal browsing session.

Piotr Dobrogost 2009-08-28 18:08:48

Oh. I see what you're trying to do now (I think). I'll edit my answer to reflect this.

Rob Knight 2009-08-28 20:27:06

Doesn't attaching to QWebPage::loadFinished signal achieve the same effect as creating window.onload handler?

Piotr Dobrogost 2009-08-28 22:48:34

I've been investigating QNAM a bit more. It seems that in order to monitor all requests, you need to create a new class that inherits from QNetworkAccessManager and override the createRequest() function. In the overridden function, you can add a slot to the finished() signal for each request made. This enables tracking of all requests, not just the main page request. However, there is no guarantee that the page will have finished processing the result immediately after the request is complete. Perhaps you could set a timer to check the result 5 seconds after the request is complete?

Rob Knight 2009-08-29 07:51:35
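The per-request bookkeeping described in the comment above can be sketched without Qt. This is a minimal sketch under stated assumptions: `PendingRequestTracker` and its integer request ids are hypothetical. An overridden QNetworkAccessManager::createRequest() could call `requestStarted()`, and a slot connected to each reply's finished() signal could call `requestFinished()`. As noted above, "no requests in flight" does not guarantee that the page has finished processing the response.

```cpp
#include <cstddef>
#include <functional>
#include <set>
#include <utility>

// Hypothetical helper: tracks in-flight requests and reports when the
// network goes idle. It knows nothing about Qt; the caller supplies ids.
class PendingRequestTracker {
public:
    using IdleCallback = std::function<void()>;

    explicit PendingRequestTracker(IdleCallback onIdle = {})
        : onIdle_(std::move(onIdle)) {}

    // Call when a request is created (e.g. from createRequest()).
    void requestStarted(int id) { pending_.insert(id); }

    // Call when a reply finishes; fires onIdle once nothing is pending.
    void requestFinished(int id) {
        if (pending_.erase(id) == 0)
            return;  // unknown or already-finished id: ignore
        if (pending_.empty() && onIdle_)
            onIdle_();
    }

    bool isIdle() const { return pending_.empty(); }
    std::size_t pendingCount() const { return pending_.size(); }

private:
    std::set<int> pending_;
    IdleCallback onIdle_;
};
```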

*However, there is no guarantee (...)* That's why I wrote in my first comment above *I guess waiting only on finished() signal of QNAM is not enough as after receiving data QWebFrame has to modify DOM and maybe do other things before it's ready to handle js calls.* However, I can't afford 5 seconds timer (not even 1 second) as my app is making many requests and this would be too much waiting.

Piotr Dobrogost 2009-08-29 08:14:32

I tried using a timer just to check whether this would solve the problem at all. It works with a timer, and most importantly it's enough to set the timer to only 10 ms. This leads me to believe the interval is so short that it's only long enough to leave the function I was calling the js from. This in turn leads me to suspect I have some timing issues in my code which are not directly related to the problem we are talking about. That's possible, as I'm using QStateMachine and my own command queue (http://stackoverflow.com/questions/1265354). So after our discussion I'm back to debugging...

Piotr Dobrogost 2009-08-29 08:24:23
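If the suspicion above is right, what matters is not the delay itself but that the js call runs only after the current handler has returned. A minimal, Qt-free model of that idea follows; `DeferredCalls` is a hypothetical stand-in for an event loop, and in Qt a zero-interval QTimer::singleShot() or a queued QMetaObject::invokeMethod() call would play this role. It is a sketch of the assumed cause, not a confirmed diagnosis.

```cpp
#include <functional>
#include <queue>
#include <utility>

// Hypothetical stand-in for an event loop: work posted from inside a
// handler runs only after that handler has returned and drain() is called.
class DeferredCalls {
public:
    void post(std::function<void()> fn) { queue_.push(std::move(fn)); }

    // Runs everything queued so far, including calls posted while draining.
    void drain() {
        while (!queue_.empty()) {
            std::function<void()> fn = std::move(queue_.front());
            queue_.pop();
            fn();
        }
    }

private:
    std::queue<std::function<void()>> queue_;
};
```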
