I'm quite sure this has been asked before but I can't for the life of me find anything.

A client of mine has a number of pages that we closed to the public today. Because the image URLs associated with those pages are still valid (the pages must remain visible internally for maintenance), the pages are still fully visible in the Google cache, which understandably annoys my client.

I would like to fix this with a mod_rewrite directive that returns a 403 or 404 for any request to that image directory whose Referer does not start with the site's domain (i.e. requests hotlinked from the pages in the cache).

Update: This works for me!

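# Only act on requests under the image directory
RewriteCond %{REQUEST_URI} ^/imagedir
# ...whose Referer does not start with the site's own domain (case-insensitive)
RewriteCond %{HTTP_REFERER} !^http://(www\.)?domain\.com [NC]
# Answer those requests with 403 Forbidden
RewriteRule .* - [F,L]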
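
To sanity-check which requests the rule above would reject, here is a minimal Python sketch that mirrors the two conditions. It is only an approximation of mod_rewrite's behaviour, assuming a missing Referer header reaches the condition as an empty string; the function name `is_forbidden` and the sample URLs are made up for illustration.

```python
import re

# Mirrors: RewriteCond %{REQUEST_URI} ^/imagedir
IMAGE_DIR = re.compile(r"^/imagedir")
# Mirrors: RewriteCond %{HTTP_REFERER} !^http://(www\.)?domain\.com [NC]
OWN_SITE = re.compile(r"^http://(www\.)?domain\.com", re.IGNORECASE)


def is_forbidden(request_uri, referer=""):
    """Return True if the rule would answer this request with 403.

    A missing Referer is modelled as an empty string (an assumption).
    """
    if not IMAGE_DIR.search(request_uri):
        return False  # first condition fails, so the rule does not apply
    # Second condition is negated: forbidden when the Referer does NOT match
    return OWN_SITE.search(referer) is None


if __name__ == "__main__":
    checks = [
        ("/imagedir/a.png", "http://www.domain.com/page.html"),
        ("/imagedir/a.png", ""),
        ("/imagedir/a.png", "https://www.domain.com/page.html"),
        ("/imagedir/a.png", "http://domain.com.example.net/"),
        ("/index.html", ""),
    ]
    for uri, ref in checks:
        print(uri, repr(ref), "forbidden" if is_forbidden(uri, ref) else "allowed")
```

Two things this makes visible: a Referer sent over `https://` would be rejected by the pattern as written, and because the pattern is only a prefix match, a host such as `domain.com.example.net` would pass. Whether either matters depends on the site.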
A: 

Hi,

The HTTP Referer header is often stripped by security software and proxies because it is seen as a privacy issue. Before proceeding, examine your logs to see what percentage of requests have the Referer set, and do an impact analysis.
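
As a rough sketch of that log check, the Python below counts how many image requests carry a Referer at all. It assumes the access log uses Apache's "combined" format (where an absent Referer is logged as `-`) and that the images live under `/imagedir`; the function name `referer_share` is hypothetical, and lines with escaped quotes inside the request are not handled.

```python
import re

# Apache "combined" format:
#   host ident user [time] "request" status bytes "referer" "user-agent"
# Only the request and referer fields are captured; the rest of the line is ignored.
COMBINED = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" \d{3} \S+ "(?P<referer>[^"]*)"'
)


def referer_share(lines, path_prefix="/imagedir"):
    """Return (total, with_referer, percentage) for requests under path_prefix."""
    total = 0
    with_referer = 0
    for line in lines:
        match = COMBINED.match(line.strip())
        if not match:
            continue  # skip lines that are not in the expected format
        parts = match.group("request").split()
        # A request line looks like: METHOD PATH PROTOCOL
        if len(parts) < 2 or not parts[1].startswith(path_prefix):
            continue
        total += 1
        if match.group("referer") not in ("", "-"):
            with_referer += 1
    percentage = 100.0 * with_referer / total if total else 0.0
    return total, with_referer, percentage


if __name__ == "__main__":
    import sys

    # Usage: python referer_share.py /path/to/access.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        total, with_ref, pct = referer_share(log)
    print(f"{with_ref} of {total} image requests had a Referer ({pct:.1f}%)")
```

If a large share of legitimate requests arrive without a Referer, the rewrite rule would block those visitors too.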

You can change Google's behaviour using robots.txt and also make use of Cache-Control. It will probably take some time before the pages drop out of the index. Use this next time you have a 'campaign'; that way the issue you're facing won't happen again :-)

http://www.robotstxt.org/
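
For the robots.txt part, a small sketch using Python's standard `urllib.robotparser` can confirm that a rule set actually blocks the paths you intend before you deploy it. The robots.txt content, the `/imagedir/` path, and the helper name `is_crawlable` are assumptions for illustration; this only checks how the rules parse, not how quickly any search engine reacts to them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps all crawlers out of the image directory
ROBOTS_TXT = """\
User-agent: *
Disallow: /imagedir/
"""


def is_crawlable(robots_txt, url, agent="Googlebot"):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.domain.com/imagedir/a.png",
        "http://www.domain.com/index.html",
    ):
        print(url, "crawlable" if is_crawlable(ROBOTS_TXT, url) else "blocked")
```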


The answer is on Stack Overflow's sister site:

http://serverfault.com/questions/71020/modrewrite-how-do-i-check-the-httpreferers-querystring

tovare
Cheers @tovare, yes, both robots.txt and cache-control are already in place for next time :) In this very special case, it's OK to go ahead and filter by HTTP_REFERER, because the site is closed and the limited number of people allowed to access the pages do not use security software.
Pekka
Ok ... then go ahead :) Cheers
tovare
Thanks for the SF link! Would +1 but I'm out of votes for today.
Pekka
Best answer is the better kudos :) Thanks.
tovare