views: 95
answers: 3

The text below is from sitemaps.org. What are the benefits of providing a sitemap versus just letting the crawler do its job?

Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site.
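For reference, a minimal sitemap in the form they describe looks like this (the URL and metadata values are just placeholders):

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>http://www.example.com/</loc>
        <lastmod>2009-06-01</lastmod>
        <changefreq>weekly</changefreq>
        <priority>0.8</priority>
      </url>
    </urlset>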

Edit 1: I am hoping to find enough benefits to justify developing that feature. At the moment our system does not provide sitemaps dynamically, so we have to create one with a crawler, which is not a very good process.
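To make "dynamically" concrete, what I am hoping to justify is roughly the sketch below: building the sitemap straight from URL data our system already has, instead of crawling our own site (the record values here are made-up examples):

    # Rough sketch: build a sitemap from URL records the application
    # already knows about, rather than discovering them with a crawler.
    from xml.sax.saxutils import escape

    def build_sitemap(records):
        # records: iterable of (url, lastmod, changefreq, priority)
        parts = ['<?xml version="1.0" encoding="UTF-8"?>',
                 '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">']
        for url, lastmod, changefreq, priority in records:
            parts.append('  <url>')
            parts.append('    <loc>%s</loc>' % escape(url))
            parts.append('    <lastmod>%s</lastmod>' % lastmod)
            parts.append('    <changefreq>%s</changefreq>' % changefreq)
            parts.append('    <priority>%.1f</priority>' % priority)
            parts.append('  </url>')
        parts.append('</urlset>')
        return '\n'.join(parts)

    if __name__ == '__main__':
        # Example records; the real ones would come from our database.
        records = [('http://www.example.com/', '2009-06-01', 'daily', 1.0)]
        print(build_sitemap(records))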

A: 

Well, as that paragraph says, sitemaps also provide metadata about a given URL that a crawler may not be able to infer purely by crawling. The sitemap acts as a table of contents for the crawler so that it can prioritize content and index what matters.

Nathan Taylor
+1  A: 

Crawlers are "lazy" too, so if you give them a sitemap with all your site URLs in it, they are more likely to index more pages on your site.

They also give you the ability to prioritize your pages, so the crawlers know how frequently each one changes and which ones are more important to keep updated. That way they don't waste their time crawling pages that haven't changed, or indexing pages you don't care much about while missing the ones you do.
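The knobs for that are the changefreq and priority fields. For example (values are illustrative), a news page that changes daily and matters a lot versus a mostly static page might be marked up like this:

    <url>
      <loc>http://www.example.com/news</loc>
      <changefreq>daily</changefreq>
      <priority>1.0</priority>
    </url>
    <url>
      <loc>http://www.example.com/about</loc>
      <changefreq>yearly</changefreq>
      <priority>0.3</priority>
    </url>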

There are also lots of automated tools online that you can use to crawl your entire site and generate a sitemap. If your site isn't too big (less than a few thousand URLs), those will work great.

Eric Petroelje
A: 

The sitemap helps tell the crawler which pages are more important, and also how often they can be expected to be updated. This is information that really can't be found out just by scanning the pages themselves.

Crawlers have a limit on how many pages of your site they scan, and how many levels deep they follow links. If you have a lot of less relevant pages, a lot of different URLs for the same page, or pages that take many steps to reach, the crawler will stop before it gets to the most interesting pages. The sitemap offers an alternative way to easily find the most interesting pages, without having to follow links and sort out duplicates.

Guffa