Hi, I am trying to scrape other people's web pages (for the forces of good, not evil). I am currently trying to do this with JavaScript/jQuery from within a browser. I am finding that no data is returned to the jQuery.get() success callback function.

My code:

$.get('http://www.google.co.uk/',
    function (data, textStatus, XMLHttpRequest) {
        alert('status: ' + textStatus);
        alert('data: ' + data);
        // stash everything on window so it can be inspected from the console
        window.data = data;
        window.textStatus = textStatus;
        window.httpReq = XMLHttpRequest;
    });

In my mind this should simply do a GET on Google, store the data in window.data, and we'd be all good. What actually happens is that textStatus == "success" and data == "". The readyState on the XMLHttpRequest is 4 (complete).

I have looked at the network traffic using a transparent proxy (Charles) and everything looks fine there: HTTP status 200, plenty of data being returned.

I am running this just from the Firebug console in Firefox.

Any ideas?

+4  A: 

This falls under the cross-domain (same-origin) restriction (unless you work for Google :)), so you won't be able to do it client-side. The browser still sends the request, which is why Charles sees a 200 with a full body, but it withholds the response from the page's JavaScript. You could write a server-side proxy instead. In another post someone mentioned JSONP as a possibility, but I haven't used it so I can't recommend it.

Pharabus
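
For what it's worth, here is a minimal sketch of such a server-side proxy, assuming Node.js (the thread doesn't name a server-side stack, and the /proxy path and ?url= parameter are my own invention). The page requests /proxy?url=... from its own origin, and the server fetches the remote page on its behalf:

var http = require('http');
var https = require('https');
var url = require('url');

http.createServer(function (req, res) {
    // expects requests like /proxy?url=http%3A%2F%2Fwww.google.co.uk%2F
    var target = url.parse(req.url, true).query.url;
    if (!target) {
        res.writeHead(400);
        return res.end('Missing ?url= parameter');
    }
    var client = target.indexOf('https') === 0 ? https : http;
    client.get(target, function (upstream) {
        // relay status and body; a real proxy would also sanitise headers
        res.writeHead(upstream.statusCode, { 'Content-Type': 'text/html' });
        upstream.pipe(res);
    }).on('error', function (err) {
        res.writeHead(502);
        res.end('Upstream error: ' + err.message);
    });
}).listen(8080);

The browser code then becomes a plain same-origin call:

$.get('/proxy?url=' + encodeURIComponent('http://www.google.co.uk/'),
    function (data) { window.data = data; });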
Indeed, server-side is the way to do this stuff. You can use JSONP, but IMO it's best suited for API work (say, using the Twitter API) rather than scraping entire pages... really, it seems to me, it will be easier to do the scraping the OP needs on the server side anyway. +1
thenduks
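
For completeness, JSONP usage with jQuery looks roughly like the sketch below. The endpoint is hypothetical; JSONP only works where the remote service explicitly supports it, and it delivers script/JSON rather than HTML, so it can't fetch arbitrary pages:

$.ajax({
    url: 'http://api.example.com/items',  // hypothetical JSONP-enabled API
    dataType: 'jsonp',  // jQuery appends a callback=? parameter and loads the
                        // response via a <script> tag, sidestepping the
                        // same-origin restriction on XMLHttpRequest
    success: function (data) {
        alert('got ' + data.length + ' items');
    }
});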