views: 552 · answers: 3
I am working on an ASP.NET 3.5 Web Application project in C#. I have manually added a Google-friendly sitemap which includes entries for every page in the project - this is not a CMS.

  <url>
    <loc>http://www.mysite.com/events.aspx</loc>
    <lastmod>2009-11-17T20:45:46Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.8</priority>
  </url>

The client updates events using an admin back-end. Other than that, the site is relatively static. I'm trying to decide on the best way to update the <lastmod> values for a handful of pages that are regularly updated.

In particular, I am using the QueryStringField property of the DataPager (paging a ListView control) to enhance SEO as described here:

http://www.4guysfromrolla.com/articles/010610-1.aspx

http://gsej.wordpress.com/2009/05/31/using-a-datapager-with-both-a-querystringfield-and-renderdisabledbuttonsaslabels/

When the QueryStringField property is set, the DataPager renders the paging interface as a series of hyperlinks that the crawler can follow and index. However, suppose Google crawled my list of events two days ago and the admin has since added another dozen events; with the page size set to 6, the content shifts across pages, so the Google SERP links would now point to the wrong pages. This is why I need the sitemap to reflect changes to the events page as soon as they happen.

I have already looked though other SO questions for info and didn't find what I needed. Can anyone offer some guidance or an alternative approach?

UPDATE:

Since this is a shared hosting environment, a directory watcher/service won't work:

http://stackoverflow.com/questions/781927/how-to-create-file-watcher-in-shared-webhosting-environment

UPDATE:

Starting to realize that I may need to signify to Google that the containing page has been updated; perhaps by updating the Last-Modified HTTP header?
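In ASP.NET the Last-Modified header can be set through the response cache policy. A minimal sketch for the code-behind of events.aspx; `GetLatestEventTimestamp` is a hypothetical helper that would read the newest event date from the database:

```csharp
using System;
using System.Web.UI;

public partial class Events : Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        // Emit a Last-Modified header reflecting the newest event,
        // so crawlers can see when the page content last changed.
        DateTime lastUpdated = GetLatestEventTimestamp();
        Response.Cache.SetLastModified(lastUpdated);
    }

    private DateTime GetLatestEventTimestamp()
    {
        // Placeholder: in a real implementation, query something like
        // MAX(LastModified) from the events table here.
        return DateTime.UtcNow;
    }
}
```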

+1  A: 

Hi

If you let your users add events to the website, you are probably using a database. This means you can generate the XML sitemap at runtime like this:

  • create a page where your sitemap will be available (this doesn't need to be sitemap.xml but can also be sitemap.aspx or even sitemap.ashx).
  • open a database connection
  • loop through all records and create an Xml Element for each record

This blog post should help you further: Build a Search Engine SiteMap in C#. It does not use the new XElement API from .NET 3.5, but it will work fine.
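The steps above can also be sketched with the .NET 3.5 XElement API the linked post predates. A minimal, untested outline (the URL and values are copied from the question's sitemap entry; the database loop is left as a placeholder):

```csharp
using System;
using System.Web;
using System.Xml.Linq;

public class SitemapHandler : IHttpHandler
{
    public void ProcessRequest(HttpContext context)
    {
        XNamespace ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

        // One <url> element shown; in a real implementation, loop through
        // the event records here and add one element per page.
        XElement urlset = new XElement(ns + "urlset",
            new XElement(ns + "url",
                new XElement(ns + "loc", "http://www.mysite.com/events.aspx"),
                new XElement(ns + "lastmod",
                    DateTime.UtcNow.ToString("yyyy-MM-ddTHH:mm:ssZ")),
                new XElement(ns + "changefreq", "daily"),
                new XElement(ns + "priority", "0.8")));

        context.Response.ContentType = "text/xml";
        context.Response.Write("<?xml version=\"1.0\" encoding=\"utf-8\"?>");
        context.Response.Write(urlset.ToString());
    }

    public bool IsReusable { get { return true; } }
}
```

Because the XML is built on every request, the `<lastmod>` values always reflect the current database state, which addresses the original problem.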

You can put this in an aspx page, but adding an HttpHandler is probably better, as described on the same blog in a different post: Creating an HttpHandler for a Sitemap.

dampee
+1 as you were also "on the money". Could you clarify why it doesn't need to be exactly "sitemap.xml"? I thought that was what Google expects.
IrishChieftain
No, this is not what Google expects by default. There are several ways to expose a sitemap to search engines. You can find more information on the sitemap-protocol website: http://www.sitemaps.org/protocol.php#informing
dampee
+1  A: 

Rather than using a hand-coded sitemap, create a sitemap handler that will generate the sitemap on the fly. You can create a method in the handler that will grab pages from an existing navigation sitemap, from the database, or even from a hard-coded list of pages. You can create an XmlDocument from the list, and write the InnerXml of the document out to the handler response stream.

Then, create a class with a method that will automatically ping search engines with the above handler's URL (like http://www.google.com/webmasters/tools/ping?sitemap=http://www.mysite.com/sitemap.ashx).
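That ping method might look like the following sketch, using WebClient; the endpoint is the one quoted above, and the sitemap URL would be the handler's address:

```csharp
using System;
using System.Net;
using System.Web;

public static class SitemapPinger
{
    // Notify Google that the sitemap has changed by requesting the ping URL.
    public static void PingGoogle(string sitemapUrl)
    {
        string pingUrl = "http://www.google.com/webmasters/tools/ping?sitemap="
            + HttpUtility.UrlEncode(sitemapUrl);

        using (WebClient client = new WebClient())
        {
            // A successful (200) response means the ping was received;
            // it does not mean the sitemap has been crawled yet.
            client.DownloadString(pingUrl);
        }
    }
}
```

Usage would be something like `SitemapPinger.PingGoogle("http://www.mysite.com/sitemap.ashx");` from the event-saving code.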

Whenever someone adds a new event, call the above method. This will ping Google with your latest sitemap (which the handler regenerates on each request).

You want to make sure that the ping only works if the sitemap has actually been updated. You could use File.SetLastWriteTime on events.aspx in the AddNewEvent handler to signify that the containing page has been updated.

Also, be careful to make sure there have been no pings in the last hour (Google's guidelines discourage pinging more than once per hour).
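The once-per-hour guard can be sketched with a static timestamp. Note the assumption: a static field resets on app-pool recycle, so persisting the last ping time (e.g. in a database table) would be more robust:

```csharp
using System;

public static class PingThrottle
{
    private static DateTime _lastPing = DateTime.MinValue;
    private static readonly object _lock = new object();

    // Returns true (and records the attempt) only if at least an hour
    // has passed since the last successful call; otherwise returns false.
    public static bool TryPing()
    {
        lock (_lock)
        {
            if (DateTime.UtcNow - _lastPing < TimeSpan.FromHours(1))
                return false;

            _lastPing = DateTime.UtcNow;
            return true;
        }
    }
}
```

The event-saving code would then only ping when `PingThrottle.TryPing()` returns true.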

I actually plan to implement this in the following OSS project: http://cyclemania.codeplex.com. I will let you know once it's done and you can have a look.

MissingLinq
Excellent response. I have a moderation system whereby the admin must set a check box to put the event live, so the method will be called there. +1 for the heads-up on the hourly thing; can feel an extra table coming on:-O
IrishChieftain
As promised, I've finished integrating this into Cyclemania. For the time being the ping method is being called manually via the admin tool (it could be called from other methods, of course). The implementation also features logging to a database table (as you alluded to) and a ping delay of 60 minutes (configurable). The data layering is kinda quick and dirty and uses ADO for now. ;) Note the robots.txt file, as the sitemap is also referenced there. Google is now able to pick up sitemap locations from robots.txt.
MissingLinq
A: 

I just hit an aspx page that generates the sitemap.xml file and also submits it to Google, Yahoo!, Bing, and Ask. I chose an aspx page over a handler for two reasons:

  1. Not all versions of IIS allow you to override the .xml file handler because it is not served through the ASP.NET engine.

  2. For incredibly large sites, generating a sitemap can take a while and requires a lot of system resources. So I want to generate the sitemap when I choose to, not on demand.

Here is a more extensive example of how I do my sitemaps. Hope this helps

How-to Generate and Submit Your Sitemap to Google, Yahoo!, Bing, and Ask

damstr8