I am running into some weird issues with the MOSS 2007 crawler not picking up some site collections. I have a couple of site collections that use a custom site definition and were created under a managed path, for instance http://server/clients/crawlthis

The content source in the search settings has its start address set to http://server/ with no crawl rules defined. The search account was assigned full read access to all of the sites, and I can log in to these sites with the search account.

When I kick off the crawl, it never picks up any content from any of these site collections, and I get the following messages in the crawl log:

http://server/clients/crawlthis
The object was not found. (The item was deleted because it was either not found or the crawler was denied access to it.)
http://server/clients/crawlthis
crawled

Keep in mind that the site does exist and I can log in with the search account. After digging into the SharePoint log files, I spotted a few lines that might have something to do with this issue:

CHttpAccessorHelper::InitRequestInternal - unexpected status (404) on request for 'http://server/clients/crawlthis/Pages/Home.aspx'. - File:d:\office\source\search\search\gather\protocols\http\httpacchelper.cxx Line:467 
CSTS3Accessor::Init fails, Url sts3://server/siteurl=clients/crawlthis/siteid={97e2c292-a53e-4364-90b8-eb6d3e08f73b}/weburl=/webid={49eba753-a186-4465-94fa-c1bb30ec5bc6}, hr=80041201 - File:d:\office\source\search\search\gather\protocols\sts3\sts3handler.cxx Line:238   
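To compare the IDs in the second log line against the actual site collection, the sts3 URL can be split into its parts. Below is a minimal Python sketch, assuming the layout seen in this one log line. The function name `parse_sts3_url` is hypothetical, and the format is inferred from the log above, not taken from any documentation.

```python
import re

# Key names as they appear in the log line above (assumed to be the full set).
KEYS = "siteurl|siteid|weburl|webid"

def parse_sts3_url(url):
    """Split an sts3:// crawler URL into its host and key=value parts."""
    match = re.match(r"sts3://([^/]+)/(.*)", url)
    if match is None:
        raise ValueError("not an sts3 URL: " + url)
    host, rest = match.groups()
    parts = {"host": host}
    # A value runs until the next "/key=" or the end of the string,
    # so values such as "clients/crawlthis" may contain slashes.
    # Assumption: an empty weburl (as in "weburl=/webid=") means the root web.
    pattern = r"(%s)=(.*?)(?=/(?:%s)=|$)" % (KEYS, KEYS)
    for key, value in re.findall(pattern, rest):
        parts[key] = value
    return parts

if __name__ == "__main__":
    logged = ("sts3://server/siteurl=clients/crawlthis"
              "/siteid={97e2c292-a53e-4364-90b8-eb6d3e08f73b}"
              "/weburl=/webid={49eba753-a186-4465-94fa-c1bb30ec5bc6}")
    for key, value in parse_sts3_url(logged).items():
        print(key, "=", repr(value))
```

The extracted `siteid` and `webid` can then be checked by hand against the IDs of the site collection that actually lives at /clients/crawlthis.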

Just to make things more interesting, I created a test site collection with the same custom site definition outside the /clients/ managed path, at http://server/sites/crawltest. This site collection and all of its content get crawled, so I am assuming the problem is not related to the custom site definition.

I also have a staging environment with the same setup, i.e. sites under /clients/, and they all get crawled fine.

I hope I have described my problem clearly. Does anybody have any pointers on this one?