views: 171
answers: 3
Hi there, I recently edited the robots.txt file on my site using a WordPress plugin. However, since I did this, Google seems to have removed my site from its search results. I'd appreciate an expert opinion on why this happened, and a possible solution. I originally made the change hoping to improve my search ranking by limiting the pages Google could access.

This is my robots.txt file in WordPress:

User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: /trackback
Disallow: /feed
Disallow: /comments
Disallow: /category/*/*
Disallow: */trackback
Disallow: */feed
Disallow: */comments
Disallow: /*?*
Disallow: /*?
Allow: /wp-content/uploads

Sitemap: http://www.instant-wine-cellar.co.uk/wp-content/themes/Wineconcepts/Sitemap.xml
+1  A: 

I suggest you use the Google Webmaster Tools robots.txt checker: put in the URLs that are disappearing and confirm that Google would still crawl them.

That way you can verify whether the problem is your robots.txt or something else.
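If you want a rough local version of that check, Python's standard urllib.robotparser can fetch the live robots.txt and test individual URLs. This is only a sketch: the stdlib parser implements the original robots exclusion standard, so Googlebot wildcard extensions such as Disallow: /*?* are not matched the way Google matches them, and the third test URL below is a made-up post path.

from urllib import robotparser

# Point the parser at the live robots.txt and download it.
rp = robotparser.RobotFileParser()
rp.set_url("http://www.instant-wine-cellar.co.uk/robots.txt")
rp.read()

# Test some URLs the way the Webmaster Tools checker would.
for url in (
    "http://www.instant-wine-cellar.co.uk/",
    "http://www.instant-wine-cellar.co.uk/wp-admin",
    "http://www.instant-wine-cellar.co.uk/some-post/",  # hypothetical post URL
):
    verdict = "crawlable" if rp.can_fetch("Googlebot", url) else "blocked"
    print(url, "->", verdict)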

Chris
+5  A: 

This is a good set of robots.txt directives for WordPress. Add Allow: /wp-content/uploads if you want uploads indexed directly, but that generally doesn't make sense, as all your images, PDFs, etc. are included in your posts and pages and are indexed there.

User-agent: *
Allow: /
Disallow: /*?s=
Disallow: /wp-admin/*
Disallow: /wp-content/*
Disallow: /wp-includes/*
Disallow: /wp-content/cache
Disallow: /wp-content/themes/*
Disallow: /trackback
Disallow: /comments
Disallow: /category/
Disallow: */trackback
Disallow: */comments
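As a quick offline sanity check of rules like these, here is a minimal sketch using Python's urllib.robotparser; the test paths are made up. Two caveats about this stdlib parser (assumptions about the tool, not about Google): it resolves rules first-match-wins rather than by Google's most-specific-match, so the leading Allow: / is omitted below to keep the results meaningful, and it treats * inside paths literally, so only the plain-prefix rules are exercised.

from urllib import robotparser

# A trimmed, wildcard-free version of the suggested rules.
rules = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /trackback
Disallow: /comments
Disallow: /category/
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("*", "/my-post/"))        # True:  ordinary content stays crawlable
print(rp.can_fetch("*", "/wp-admin/"))       # False: admin area blocked
print(rp.can_fetch("*", "/category/wine/"))  # False: category archives blocked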

But the most critical bit of info is in your page source:

<meta name='robots' content='noindex,nofollow' />

That means you have the privacy option enabled in Dashboard/Settings/Privacy, and that blocks all search bots before they even get to robots.txt.
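If you'd rather confirm that tag programmatically than eyeball the page source, a small stdlib-only Python sketch can fetch the homepage and report any robots meta tag it finds (the URL is the asker's site):

from html.parser import HTMLParser
from urllib.request import urlopen

class RobotsMetaFinder(HTMLParser):
    # Print the content attribute of any <meta name="robots" ...> tag.
    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and a.get("name", "").lower() == "robots":
            print("robots meta tag:", a.get("content"))

html = urlopen("http://www.instant-wine-cellar.co.uk/").read().decode("utf-8", "replace")
RobotsMetaFinder().feed(html)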

Once you have a good robots.txt file and have changed the WordPress privacy setting, go to Google Webmaster Tools and turn up your crawl rate so Google hits the site sooner.

songdogtech
+2  A: 

Note: "You blocked all bots because you're missing the critical Allow: / after User-agent: *" is incorrect. By default, robots.txt allows all crawling; you generally do not need to specify any Allow directives.
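That default-allow behaviour is easy to demonstrate with Python's urllib.robotparser: an empty robots.txt, containing no rules at all, still permits every URL, so an explicit Allow: / adds nothing.

from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.parse([])  # an empty robots.txt: no directives at all

# With no matching Disallow rule, access defaults to allowed.
print(rp.can_fetch("Googlebot", "http://example.com/any/page"))  # True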

However, the "noindex" robots meta tag would be a reason for search engines not to index the site's content.

Additionally, the robots.txt the site currently serves blocks all crawling, so search engines can't tell that the site may be indexed again. If you wish to have the site indexed again, you need to remove the "Disallow: /" from that robots.txt file. You can verify this in Google's Webmaster Tools, either by looking up the latest robots.txt file or by using the "Fetch as Googlebot" feature to test crawling of a page on the site.
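As a complement to the Webmaster Tools check, here is a small sketch that downloads the robots.txt the site actually serves and flags a blanket Disallow: / in the User-agent: * group. The group handling is deliberately simplified (one directive per line, no stacked user-agent lines), which matches the files shown above.

from urllib.request import urlopen

raw = urlopen("http://www.instant-wine-cellar.co.uk/robots.txt").read()

in_star_group = False
for line in raw.decode("utf-8", "replace").splitlines():
    field, _, value = line.partition(":")
    field, value = field.strip().lower(), value.strip()
    if field == "user-agent":
        in_star_group = (value == "*")
    elif field == "disallow" and in_star_group and value == "/":
        print("Blanket 'Disallow: /' found: all crawling is blocked.")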

John Mueller