Is there any way (in JavaScript) to download a remote website (e.g. like with curl), read it into a string variable, and process it further?
Browsers enforce the Same Origin Policy, so a script can generally only read content from its own domain. Within that domain, you can download a page using the XMLHttpRequest object:
var xhReq = new XMLHttpRequest();
xhReq.open("GET", "page.html", true); // true = asynchronous request
xhReq.onreadystatechange = onResponse;
xhReq.send(null);
...
function onResponse() {
  // readyState 4 means the request has completed
  if (xhReq.readyState != 4) { return; }
  var serverResponse = xhReq.responseText;
  ...
}
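In browsers that provide the newer fetch API, the same same-origin download can be sketched with promises instead of a readyState callback. This is only a minimal sketch: `downloadText` is a hypothetical helper name, and the Same Origin Policy applies to it just as it does to XMLHttpRequest.

```javascript
// Download a resource and resolve with its body as a string.
// `downloadText` is a hypothetical helper name, not part of any library.
async function downloadText(url) {
  const response = await fetch(url);
  if (!response.ok) {
    // fetch only rejects on network failure, so HTTP errors are checked here.
    throw new Error("Request failed with status " + response.status);
  }
  return response.text();
}

// Usage (in a browser, for a page on the same origin):
// downloadText("page.html").then(function (html) {
//   console.log(html.length);
// });
```

The resolved value plays the same role as `serverResponse` in the snippet above, so any further string processing can go in the `then` callback.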
There are ways to circumvent the policy, some of which are listed on the same Wikipedia page, but doing so is a hack at best and illegal at worst.
You can use the Yahoo Query Language to query any page on the web.
For example, if you want the full source of the Google homepage, you could use:
select * from html where url="http://google.com" and xpath='/html' limit 1
You'd have to use their JSON callback and reserialize the returned object, but you'd be able to get a full view of the page.
Mostly you won't be allowed: the browser blocks cross-domain requests for security reasons. However, you can request JSON data from other domains using jQuery, which falls back to JSONP when the URL contains a callback placeholder. Here is an example from the jQuery docs that gets some cat pictures from Flickr:
$.getJSON("http://api.flickr.com/services/feeds/photos_public.gne?tags=cat&tagmode=any&format=json&jsoncallback=?",
  function(data){
    $.each(data.items, function(i,item){
      $("<img/>").attr("src", item.media.m).appendTo("#images");
      if ( i == 4 ) return false; // stop after the first five images
    });
  });
As you can see, this makes a request, gets the data back, and appends image tags with the cat pictures to the DOM.
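To see why the `jsoncallback=?` trick gets around the cross-domain restriction, here is a rough sketch of what a JSONP request does under the hood. It is not jQuery's actual implementation: `fillCallback` and `jsonp` are hypothetical names, and the `Date.now()` based callback name is a simplification that could collide if two requests start in the same millisecond.

```javascript
// Replace the "=?" placeholder with a concrete callback name, roughly what
// jQuery does for JSONP URLs. Hypothetical helper, for illustration only.
function fillCallback(url, callbackName) {
  return url.replace(/=\?(?=&|$)/, "=" + callbackName);
}

// Browser-only: load the URL through a <script> tag. Script tags are not
// subject to the same-origin restriction, and the server's response is a
// script that calls window[callbackName](data).
function jsonp(url, onData) {
  var callbackName = "jsonp_cb_" + Date.now();
  var script = document.createElement("script");
  window[callbackName] = function (data) {
    delete window[callbackName];            // remove the temporary global
    script.parentNode.removeChild(script);  // remove the used <script> tag
    onData(data);
  };
  script.src = fillCallback(url, callbackName);
  document.head.appendChild(script);
}
```

Note that this only works when the remote server cooperates by wrapping its JSON in the named callback, which is why it can't be used to fetch arbitrary HTML pages.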