views:

216

answers:

1

If a page has fully loaded on a QWebView, how can I get the data for a certain image (probably through the dom?)

A: 

I'll try taking a stab at this:

If you want to get the url of an image using jQuery you could use an approach like this:

import sys
from PyQt4.QtCore import *
from PyQt4.QtGui import *
from PyQt4.QtWebKit import *
app = QApplication(sys.argv)
web = QWebView()
web.load(QUrl("http://google.com"))
frame = web.page().mainFrame()

web.show()

def loadFinished(ok):
    print 'loaded'
    frame.evaluateJavaScript("""
    //this is a hack to load an external javascript script 
    //credit to Vincent Robert from http://stackoverflow.com/questions/756382/bookmarklet-wait-until-javascript-is-loaded
    function loadScript(url, callback)
{
        var head = document.getElementsByTagName("head")[0];
        var script = document.createElement("script");
        script.src = url;
        // Attach handlers
        var done = false;
        script.onload = script.onreadystatechange = function()
        {
                if( !done && ( !this.readyState 
                                        || this.readyState == "loaded" 
                                        || this.readyState == "complete") )
                {
                        done = true;
                        // Continue your code
                        callback();
                }
        };

        head.appendChild(script);
}

// This code loads jQuery and executes some code when jQuery is loaded, using above trick
loadScript("http://code.jquery.com/jquery-latest.js", function(){
    //we can inject an image into the page like this:
    $(document.body).append('<img src="http://catsplanet.files.wordpress.com/2009/08/kitten_01.jpg" id="kitten"/>');
    //you can get the url before the image loads like so:
        //detectedKittenImageUrl = $('#kitten').attr('src');
        //alert('detectedKittenImageUrl = ' + detectedKittenImageUrl);
    //but this is how to get the url after it is loaded, by using jquery to bind to it's load function:
    $('#kitten').bind('load',function(){
        //the injected image has loaded
        detectedKittenImageUrl = $('#kitten').attr('src');
        alert('detectedKittenImageUrl = ' + detectedKittenImageUrl);
        //Google's logo image url is provided by css as opposed to using an IMG tag:
        //it has probabled loaded befor the kitten image which was injected after load
        //we can get the url of Google's logo like so:
        detectedGoogleLogoImageUrl = $('#logo').css('background-image');
        alert('detectedGoogleLogoImageUrl = ' + detectedGoogleLogoImageUrl);
    });

});

    """) 

app.connect(web, SIGNAL("loadFinished(bool)"), loadFinished)

sys.exit(app.exec_())

If you didn't want to load jquery from the web each time you could download jquery then inject like this:

jQuerySource = open('jquery.min.js').read()
frame.evaluateJavaScript(jQuerySource)

you could also not use jQuery at all, but it often makes manipulation easier, depending on what else you want to do.

If you want to get the image content as a bitmap not a url, it MAY be possible using a html canvas object, I'm not sure if you will run into cross-domain security problems. Another approach would be to use pyQT to get the image as it appears. If you have a PNG with alpha-transparency, this would be more complex though, but for an opaque JPEG, for example it would be easier. You could Google around for some webpage screenshot code for how to do that or you could download from the found url in Python. Once you had the url variable in Javascript, you would probably have to use the cross-the-boarder technique featured on this great slideshow to get the variable into Python for downloading.

http://www.sivachandran.in/index.php/blogs/web-automation-using-pyqt4-and-jquery may be useful example code too.

Luke Stanley