I'm specifying an internal, non-SharePoint web site as a content source,

e.g. http://internal.example.com

However, a full crawl only ever crawls 22 pages in the root of the web site (there are 100) unless I select 'Custom - specify page depth and server hops:' and leave both 'Page Depth' and 'Limit Server Hops' set to Unlimited, which then makes the crawl go mental!

Update: I'm using Microsoft Search Server 2008 Express.

A: 

You might want to specify exactly what tool/technology you are using to do this crawling. Also, have you tried something other than unlimited, and what are the results?

larson4
A: 

Wget is pretty smart. Here is a command line I use to recursively snapshot sites.

wget -r -k -K --no-parent http://internal.example.com/

http://gnuwin32.sourceforge.net/packages/wget.htm
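
If unlimited recursion is what makes a crawl run away, wget's -l option caps the recursion depth (when -r is given without -l, wget defaults to a depth of 5). Below is a minimal sketch, assuming a depth of 2 and a one-second pause between requests suit the site; both values and the helper name build_snapshot_cmd are made up for illustration and are not part of the command above. It only prints the command so it can be reviewed before anything is fetched.

```shell
#!/bin/sh
# Hypothetical helper: prints a depth-limited variant of the snapshot command.
# Nothing is downloaded here; the command is only echoed for review.
build_snapshot_cmd() {
    url=$1
    depth=${2:-2}   # assumed default depth of 2; adjust to the site's structure
    # -r           recurse into links
    # -l DEPTH     stop after DEPTH levels of links
    # -k -K        convert links for local viewing, keeping .orig backups
    # --no-parent  never ascend above the starting directory
    # --wait=1     pause one second between requests (assumed to be polite enough)
    printf 'wget -r -l %s -k -K --no-parent --wait=1 %s\n' "$depth" "$url"
}

build_snapshot_cmd http://internal.example.com/ 2
```

Once the printed command looks right, it can be pasted into the shell and run as-is. Raising the depth one level at a time shows roughly where the page count starts to explode.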

Joe Koberg