So I need to pull some JavaScript out of a remote page that has (worthless) HTML combined with (useful) JavaScript. The page, call it http://remote.com/data.html, looks something like this (crazy, I know):

<html>
<body>
<img src="/images/a.gif" />
<div>blah blah blah</div><br/><br/>
var data = { date: "2009-03-15", data: "Some Data Here" };

</body>
</html>

So, I need to load this data variable in my local page and use it.

I'd prefer to do so with completely client-side code. I figured that if I could get the HTML of this page into a local JavaScript variable, I could parse out the JavaScript code, run eval on it, and be good to use the data. So I thought I'd load the remote page in an iframe, but I can't seem to find the iframe in the DOM. Why not?

<script>
alert(window.parent.frames.length);
alert(document.getElementById('my_frame'));
</script>

<iframe name="my_frame" id='my_frame' style='height:1px; width:1px;' frameBorder=0 src='http://remote.com/data.html'></iframe>

The first alert shows 0, the second null, which makes no sense. How can I get around this problem?

+2  A: 

Have you tried switching the order - i.e. iframe first, script next? The script runs before the iframe is inserted into the DOM.

Also, this worked for me in a similar situation: give the iframe an onload handler:

<iframe src="http://example.com/blah" onload="do_some_stuff_with_the_iframe()"></iframe>

Last but not least, pay attention to cross-site scripting restrictions (the browser's same-origin policy): the iframe may be loaded, but your JS may not be allowed to access its contents.
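To get a feel for when that restriction kicks in, here is a minimal sketch that compares the origin (scheme, host, and port) of two URLs. The helper name `sameOrigin` and the `http://local.example` URL are made up for illustration, and real browsers apply further rules on top of this simple comparison, so treat it as a rough guide only.

```javascript
// Hypothetical helper: true when both URLs share scheme, host, and port.
// Uses the standard URL class (available in browsers and in Node.js).
function sameOrigin(urlA, urlB) {
  return new URL(urlA).origin === new URL(urlB).origin;
}

// The page from the question, compared against an assumed local page.
console.log(sameOrigin('http://remote.com/data.html', 'http://remote.com/other.html'));
console.log(sameOrigin('http://local.example/index.html', 'http://remote.com/data.html'));
```

If the second comparison is the situation you are in, script access to the iframe's document will most likely be blocked even though the frame itself loads.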

Piskvor
Using onload worked, but as you suspected, I'm running into XSS issues now. I guess I'll have to use some kind of server-side solution.
Tristan Havelick
A: 

One option is to use XMLHttpRequest to retrieve the page, although support for cross-site requests is apparently only now being implemented in browsers.

I understand that you might want to make a tool that used the client's internet connection to retrieve the html page (for security or legal reasons), so it is a legitimate hope.

If you do end up needing to do it server-side, then perhaps use a simple PHP page that takes a URL as a query parameter and returns a JSON chunk containing the script in a string. That way, if you do find you need to filter out certain websites, you need only do this in one place.
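The extraction step of such a page could look roughly like the sketch below. It is written in JavaScript rather than PHP to match the rest of the thread, and it leaves out the actual fetching. The function names `extractDataScript` and `toJsonChunk` are hypothetical, and the regular expression assumes the page contains a single `var data = { ... };` statement with no nested `};` inside it, as in the sample from the question.

```javascript
// Hypothetical helper: pull the first `var data = {...};` statement out of an HTML string.
// Returns null when no such statement is found.
function extractDataScript(html) {
  // Lazy match up to the first "};" (assumes the object literal has no nested "};").
  const match = /var\s+data\s*=\s*\{[\s\S]*?\};/.exec(html);
  return match ? match[0] : null;
}

// Wrap the extracted statement as the JSON chunk a proxy page might return.
function toJsonChunk(html) {
  return JSON.stringify({ script: extractDataScript(html) });
}

// The sample page from the question.
const sampleHtml = [
  '<html>',
  '<body>',
  '<img src="/images/a.gif" />',
  '<div>blah blah blah</div><br/><br/>',
  'var data = { date: "2009-03-15", data: "Some Data Here" };',
  '',
  '</body>',
  '</html>'
].join('\n');

console.log(toJsonChunk(sampleHtml));
```

The client then receives only the script string, so the worthless HTML never reaches the local page.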

The inevitable problem is that some users will be hostile, and they then have license to abuse what is effectively a JavaScript proxy. As a result, the safest option may be to do all the processing on the server and not allow certain JavaScript function calls (eval, HTTP requests, etc.).
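One way to limit that abuse is to check the requested URL against a list of permitted hosts before fetching anything. The sketch below uses an allowlist rather than filtering out bad sites, which is a stricter variant of the filtering idea; `ALLOWED_HOSTS` and `isAllowedTarget` are hypothetical names, and the single entry is just the host from the question.

```javascript
// Hypothetical allowlist of hosts the proxy is willing to fetch from.
const ALLOWED_HOSTS = new Set(['remote.com']);

// Returns true only for http(s) URLs whose hostname is on the allowlist.
function isAllowedTarget(rawUrl) {
  let parsed;
  try {
    parsed = new URL(rawUrl);
  } catch (err) {
    return false; // not a parseable absolute URL
  }
  const isHttp = parsed.protocol === 'http:' || parsed.protocol === 'https:';
  return isHttp && ALLOWED_HOSTS.has(parsed.hostname);
}

console.log(isAllowedTarget('http://remote.com/data.html'));
console.log(isAllowedTarget('http://remote.com.evil.example/data.html'));
```

Comparing the parsed hostname exactly, instead of searching the raw string for "remote.com", is what keeps look-alike hosts such as the second one out.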

Phil H