Hello,

I want to crawl a website anonymously without relying on an anonymous proxy server. My idea is to have visitors to my own website help: I would insert an invisible IFrame into my template, set its src to the URL of the page I need, and then upload the loaded content to my server with AJAX. (I can't use AJAX for the download itself because of the same-origin policy.)

Is there a flaw in this? Can a web server tell whether its pages are being accessed directly or through an IFrame on another site? Or is there a better approach?

A: 

If it's a specific website, I recommend talking to the website operators rather than trying to crawl anonymously.

Martin v. Löwis
They're a billion-dollar company and don't bother talking to little people like me.
Plumo
Ok: why do you need the search to be anonymous if they won't bother with you?
Martin v. Löwis
they won't bother talking to me, but they've blocked my IP before. Seems they have an automatic system.
Plumo
A: 

You could use Tor to mask your requests, but if you have to go to such lengths to crawl a website, perhaps you shouldn't be doing it?

Also, with your approach, the iframe request will include your page's URL as the referrer (the HTTP Referer header), which makes identifying these requests at the server end pretty straightforward.
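
To make the referrer point concrete, here is a minimal sketch in Python of how a target server could flag such requests. It assumes the request headers are available as a plain dict; the function names and the host `crawler-site.example` are hypothetical and only there for illustration. A real server would read the header from whatever its framework provides.

```python
from urllib.parse import urlsplit


def referrer_host(headers):
    """Return the host named in the Referer header, or None if it is absent."""
    referer = headers.get("Referer", "")
    # urlsplit("").hostname is None, so a missing header falls through cleanly.
    return urlsplit(referer).hostname


def is_embedded_from(headers, suspect_host):
    """True when the request's Referer points at suspect_host."""
    return referrer_host(headers) == suspect_host.lower()


# Hypothetical request triggered by an iframe on the embedding site.
iframe_request = {"Referer": "http://crawler-site.example/some-page.html"}
# A request with no Referer header, such as a URL typed directly into the browser.
direct_request = {}

print(is_embedded_from(iframe_request, "crawler-site.example"))
print(is_embedded_from(direct_request, "crawler-site.example"))
```

If many requests from unrelated visitor IPs all carry the same referring host, the operator can simply count requests per referrer host and block that one host, no matter how many visitors are involved.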

Paul Dixon
that's interesting - I didn't think about referrer
Plumo