views:

70

answers:

2

I want to make a site where the user can basically navigate the web from within an iframe. The catch is that I'd like to have more control over what is rendered within the iframe. Specifically, I'd like to be able to filter out images or text, disable forms, etc. I'd also like to gather feedback such as which links the users clicked on. Is this even possible using a standard back-end scripting language (like PHP), with HTML and JavaScript on the frontend? Would I first need to grab the source of the site before it is rendered, then do whatever manipulation is necessary, and finally re-render it somehow? Could somebody please explain the programming flow that would occur here (assuming it's possible)? Thanks.

+1  A: 

I think you would probably want to grab the source of the site (with server-side code) before rendering it. If you try to do this with JavaScript, the browser's same-origin policy will stop your script from reading or modifying a page loaded from another domain. Your iframe would load a page like render.php and pass the address of the page to render as a querystring parameter. render.php downloads the HTML from that address; then use regular expressions (or an HTML parser, which copes better with messy markup) to find the elements you care about. Rewrite the HTML as necessary and then write it all out to the iframe.
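Here is a minimal sketch of that flow, written in Python rather than PHP and using the standard library's HTML parser in place of regular expressions. The names `PageFilter`, `filter_html` and `fetch_and_filter` are made up for illustration, and "disabling forms" is assumed to mean adding a `disabled` attribute to form controls. Treat it as a starting point under those assumptions, not a complete proxy.

```python
from html import escape
from html.parser import HTMLParser
from urllib.request import urlopen

# Assumption: these are the elements that make a form usable.
FORM_CONTROLS = {"input", "button", "select", "textarea"}


class PageFilter(HTMLParser):
    """Re-emits the HTML it is fed, dropping <img> tags and disabling form controls."""

    def __init__(self):
        # Keep entities like &amp; untouched so they can be passed through as-is.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _emit_tag(self, tag, attrs, closing=""):
        if tag == "img":
            return  # filter out images
        attrs = list(attrs)
        if tag in FORM_CONTROLS and not any(name == "disabled" for name, _ in attrs):
            attrs.append(("disabled", None))
        parts = [tag]
        for name, value in attrs:
            # The parser unescapes attribute values, so escape them again on output.
            parts.append(name if value is None else f'{name}="{escape(value, quote=True)}"')
        self.out.append("<" + " ".join(parts) + closing + ">")

    def handle_starttag(self, tag, attrs):
        self._emit_tag(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        self._emit_tag(tag, attrs, " /")

    def handle_endtag(self, tag):
        if tag != "img":
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def filter_html(html):
    """Return the filtered version of an HTML string."""
    parser = PageFilter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


def fetch_and_filter(url):
    """Download a page server-side and filter it (assumes UTF-8 for simplicity)."""
    with urlopen(url) as response:
        return filter_html(response.read().decode("utf-8", errors="replace"))
```

The page that the iframe points at would call something like `fetch_and_filter` with the address from the querystring and write the result out as its response. Relative URLs for stylesheets and scripts still point at the original site, so they would also need rewriting or a `<base>` tag.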

If you want to track where people are going, rewrite links so that the user is first taken to a page you control and then redirected on to the target site. Example: a link in the page needs to go to google.com. You would send them to tracker.php?target=http://google.com. You control tracker.php, so you can log each load of this page and then redirect the user to the target site.
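A rough sketch of both halves of this, again in Python rather than PHP. `tracked_url`, `rewrite_links` and `resolve_target` are hypothetical names; the regular expression only handles quoted `href` attributes on `<a>` tags and does not decode HTML entities inside them. Leaving `javascript:`, `mailto:` and fragment links untouched, and only redirecting to http/https targets, are assumptions added here for safety.

```python
import re
from urllib.parse import parse_qs, quote, urljoin, urlsplit

TRACKER = "tracker.php"  # the tracking page named in the answer

# Matches <a ... href="..."> or <a ... href='...'> (quoted values only).
HREF_RE = re.compile(r'(<a\b[^>]*?\bhref=)(["\'])(.*?)\2', re.IGNORECASE | re.DOTALL)


def tracked_url(page_url, href):
    """Turn a link found on page_url into a link that goes through the tracker."""
    if href.lower().startswith(("javascript:", "mailto:", "#")):
        return href  # assumption: skip links that cannot be redirected sensibly
    absolute = urljoin(page_url, href)  # resolve relative links against the page
    return f"{TRACKER}?target={quote(absolute, safe='')}"


def rewrite_links(html, page_url):
    """Rewrite every quoted <a href> in the HTML to pass through the tracker."""
    def repl(match):
        prefix, quote_char, href = match.group(1), match.group(2), match.group(3)
        return f"{prefix}{quote_char}{tracked_url(page_url, href)}{quote_char}"
    return HREF_RE.sub(repl, html)


def resolve_target(query_string, log):
    """What the tracker page would do: record the target and return the URL
    to redirect to, or None if the target is not a plain web address."""
    target = parse_qs(query_string).get("target", [""])[0]
    if urlsplit(target).scheme not in ("http", "https"):
        return None
    log.append(target)
    return target
```

For example, `rewrite_links('<a href="/about">x</a>', "http://example.com/page")` produces a link to `tracker.php?target=http%3A%2F%2Fexample.com%2Fabout`. In a real tracker page the `log` list would be a database or log file, and the returned URL would go into a `Location` header.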

Update:

Another possible solution is to use Apache or another web server to proxy the target website. There are modules like mod_proxy for this. There may also be modules that let you parse and rewrite the HTML as it passes through, or you could roll your own.

I should point out that even the best solutions offered to your question will be somewhat brittle if you do not have full control over the target site. You will want to have lots of error handling or alerting.

BrianLy
Very clever answer, thanks. Do you foresee any issues with rewriting links on the sites (which would cause the site to break)? The only thing I can think of is if the links are JavaScript links. That would make it hard to control the rewriting.
es11
JavaScript links, as you say, may be difficult to handle in some cases. Maybe just identify and skip them? Dealing with POSTs could also be difficult. You might also want to cache your processed version of the content to avoid unnecessary requests and parsing.
BrianLy
A: 

You can have a look at this. It uses iframes really well, and maybe you could even use the library it has.

aredkid
This provides a cool effect for the iframe (which I might use!) but doesn't really answer the technical aspects of what I asked (unless I'm missing the point). Thanks for your input though.
es11