views:

1006

answers:

5

When does Google re-crawl a site? And why does Google have two versions of the same page in its cache?

The thread is http://forum.portal.edu.ro/index.php?showtopic=112733 and the cached pages are forum.portal.edu.ro/index.php?showtopic=112733&st=25 and forum.portal.edu.ro/index.php?showtopic=112733&st=50

+3  A: 

There's a lot of discussion about Google's crawling policy. The best you can do is check your logs and work out what Google's schedule is for your site.

As for the multiple entries in the cache: Google has no way of knowing whether they're the same page, since they have different URLs and possibly different content. If you want a specific page to be used, try <link rel="canonical" href="(standard URL)">.
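
For example, in the <head> of each duplicate page you would add something like this (the href is just a placeholder for whichever URL you want Google to treat as the main one):

    <link rel="canonical" href="http://example.com/preferred-version" />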

lacqui
A: 

You can increase the rate at which Google crawls your site by adjusting:

Site Configuration > Settings > Set Custom Crawl Rate

Chris Missal
+1  A: 

How often a page is re-crawled depends on how high its ranking is and on the update interval you have suggested in your sitemap. Other factors may also be taken into account, such as the content of the page and which types of sites link to it.
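
For example, a sitemap entry can suggest an update interval with <changefreq> (the URL and values below are only illustrative, and Google treats them as hints rather than commands):

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>http://example.com/forum/thread-123</loc>
        <changefreq>daily</changefreq>
        <priority>0.8</priority>
      </url>
    </urlset>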

The two pages in the cache aren't the same page at all: one is page two of the thread and the other is page three. Since they have different URLs and different content, they are separate pages.

If you really want the pages to be counted as the same by search engines, you can use a link tag with rel="canonical" to point back to the first page of the thread.
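
In this case that would mean putting something like this in the <head> of the &st=25 and &st=50 pages (just a sketch, and only if you really do want the later pages collapsed into page one):

    <link rel="canonical" href="http://forum.portal.edu.ro/index.php?showtopic=112733" />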

Guffa
A: 

It depends on the type of content on the website and may also depend on its PageRank. Static pages with rarely-updated information may get a visit every other month or so, while a popular blog with many posts a day could get crawled several times a day (although in the case of a blog, the blogging software usually pings the search engines, so new posts are crawled on demand).
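
If I remember correctly, that ping is just an HTTP GET with the sitemap URL as a parameter, along these lines (example.com is a placeholder, and the exact endpoint may differ):

    http://www.google.com/ping?sitemap=http://example.com/sitemap.xml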

It appears that those are forum posts on a moderate-traffic site, so it should get crawled a few times a week. Even my own website, which currently has an Alexa ranking under 8,000,000, gets crawled every week or two, with a near-daily robots.txt request.

Pages with similar content should automatically get grouped together, but if they aren't, try the rel="canonical" tip given in the other answers.

@Chris: No, that setting does not change how often your site is crawled, only how fast Google requests the pages during the crawl. It's a misleading setting, and a lot of people make that mistake, even though the help pages clearly indicate this.

thezachperson31
A: 

My site gets crawled, but not recursively, for some reason: http://openbible.no-ip.org/openbible/bible-feed

nepoez