views: 279
answers: 5

It has happened in the past that one of our IT specialists accidentally moved the robots.txt from staging to production, blocking Google and others from indexing our customers' sites in production. Is there a good way of managing this situation?

Thanks in advance.

+1  A: 

Create a deployment script to move the various artifacts (web pages, images, supporting files, etc) and have the IT guy do the move by running your script. Be sure not to include robots.txt in that script.
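
For example, a deployment script along these lines could copy everything except robots.txt. This is only a rough sketch in Python; the paths are made up, and a real deploy would likely handle more than a straight file copy:

    import os
    import shutil

    # Hypothetical paths - adjust to your environment.
    STAGING_ROOT = "/var/www/staging"
    PRODUCTION_ROOT = "/var/www/production"

    def deploy():
        """Copy the site from staging to production, skipping robots.txt."""
        for root, dirs, files in os.walk(STAGING_ROOT):
            rel = os.path.relpath(root, STAGING_ROOT)
            dest_dir = os.path.join(PRODUCTION_ROOT, rel)
            os.makedirs(dest_dir, exist_ok=True)
            for name in files:
                if name.lower() == "robots.txt":
                    continue  # never push the staging robots.txt to production
                shutil.copy2(os.path.join(root, name), os.path.join(dest_dir, name))

    if __name__ == "__main__":
        deploy()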

ahockley
+1  A: 

I'd set up code on the production server that holds the production robots.txt in another location and monitors the one that's in use.

If they're different, I'd immediately overwrite the in-use one with the production version. Then it wouldn't matter if it gets overwritten, since the bad version won't exist for long. In a UNIX environment, I'd do this periodically with cron.
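
A rough sketch of that watchdog in Python (the file locations are assumptions; adapt them to where your site actually lives):

    import filecmp
    import shutil

    # Assumed locations: the known-good copy lives outside the web root.
    GOOD_COPY = "/etc/robots/robots.txt.production"
    LIVE_COPY = "/var/www/production/robots.txt"

    try:
        # Byte-for-byte comparison of the live file against the known-good copy.
        same = filecmp.cmp(GOOD_COPY, LIVE_COPY, shallow=False)
    except FileNotFoundError:
        # A missing live copy counts as "different".
        same = False

    if not same:
        # Put the production version back in place.
        shutil.copy2(GOOD_COPY, LIVE_COPY)

A crontab line such as */5 * * * * python3 /usr/local/bin/check_robots.py (the script path is illustrative) would run the check every five minutes.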

paxdiablo
+1  A: 

As an SEO, I feel your pain.

Forgive me if I'm wrong, but I'm assuming the problem is that you keep a robots.txt on your staging server to block search engines from finding and crawling the whole staging environment.

If this is the case, I would suggest placing your staging environment somewhere internal where this isn't an issue (an intranet-type or private network configuration for staging). That can save you a lot of search engine trouble with that content getting crawled: say, for instance, someone deleted that robots.txt file from staging by accident and you ended up with a duplicate site crawled and indexed.

If that isn't an option, I'd recommend placing staging in a folder on the server, like domain.com/staging/, and using just one robots.txt file in the root folder to block out that /staging/ folder entirely. That way you don't need two files, and you can sleep at night knowing another robots.txt won't be replacing yours.
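
For example, the single robots.txt in the root could contain just this (using the /staging/ folder name from above):

    User-agent: *
    Disallow: /staging/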

If THAT isn't an option, maybe ask them to add it to their checklist NOT to move that file? You will just have to check it yourself - a little less sleep, but a little more precaution.

A: 

Why is your staging environment publicly exposed rather than behind a firewall?

The problem is not robots.txt... the problem is your network infrastructure.

FlySwat
Just a guess, but maybe it's the easiest way to expose their work to external clients?
Giovanni Galbo
+1  A: 

Ask your IT guys to change the file permissions on robots.txt to "read-only" for all users, so that replacing it takes these extra steps (a quick sketch follows the list):

  1. becoming Administrator/root
  2. changing the permissions to allow writes
  3. overwriting robots.txt with the new file
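
For instance, on a UNIX-style server this is roughly the following (a Python sketch; a plain chmod 444 does the same, and the path is illustrative):

    import os

    # Read-only for owner, group and everyone else; overwriting it then
    # requires changing the permissions back first (or root privileges).
    os.chmod("/var/www/production/robots.txt", 0o444)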
lennyk