We want to implement a sitemap.xml feature in our CMS. Some of our developers argue that this feature will hurt performance, because every time the content changes, a full list of the site's links has to be generated and written to sitemap.xml.

The idea is that each time a public-facing page is edited or added, it is immediately added to sitemap.xml, keeping the file in sync with the site.

While you are answering, and if you have time: what other CMS systems, open source or otherwise, have built-in sitemap generation?



Updating the sitemap every time you update your CMS will definitely create performance issues, because sitemaps tend to be large and costly to generate (from a CPU and disk I/O perspective).

What I would do is:
1. map out the structure of your site
2. determine which areas you need to link to in the sitemap
3. add the name of the sitemap index to your robots.txt file
4. write a script that will read from the database and generate static xml sitemap files
5. create a cron job that will re-run this script on a regular basis
6. submit your sitemap url to the search engines
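Step 4 above could be sketched roughly like this. The `pages` table, its columns, and the base URL are hypothetical stand-ins for whatever schema your CMS actually uses; the demo uses an in-memory SQLite database in place of your real content store.

```python
# Sketch of step 4: generate a static sitemap.xml from the CMS database.
# The "pages" table and column names are hypothetical examples.
import sqlite3
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def generate_sitemap(conn, base_url, out_path="sitemap.xml"):
    """Read public page paths from the database and write a static sitemap file."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for path, modified in conn.execute(
        "SELECT path, last_modified FROM pages WHERE is_public = 1"
    ):
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = base_url + path
        ET.SubElement(url, "lastmod").text = modified  # W3C date, e.g. 2009-04-01
    ET.ElementTree(urlset).write(out_path, encoding="utf-8", xml_declaration=True)

# Demo with an in-memory database standing in for the real CMS store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (path TEXT, last_modified TEXT, is_public INTEGER)")
conn.executemany(
    "INSERT INTO pages VALUES (?, ?, ?)",
    [("/about", "2009-04-01", 1), ("/contact", "2009-04-02", 1),
     ("/admin", "2009-04-02", 0)],  # non-public page, excluded from the sitemap
)
generate_sitemap(conn, "http://www.example.com")
```

For step 5, a script like this can be scheduled with a crontab entry such as `0 2 * * * /usr/bin/python /path/to/generate_sitemap.py`, so the static file is rebuilt nightly instead of on every edit.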


Bearing in mind Google doesn't read your sitemap that often, it's safe to regenerate it via a daily cron job. If you schedule a rebuild each evening in the quiet hours, Google will pick up the changes the next time it polls.


For the CMS-powered sites I work on, with 70,000 to 350,000 pages/folders each, we typically regenerate the sitemap XML once every 24 hours. We've never had any problems with that. Unless your site is as popular as Stack Overflow - and Google recognizes that it gets updated as much as SO - it won't re-crawl your site often enough to justify keeping a fully up-to-date sitemap file.

Rex M