I need to write a script that incrementally updates an XML sitemap (for use by search engines) every time a new ad is created on my site (a classifieds site using PHP and MySQL).

I am stuck on how to update the XML sitemap incrementally. Each sitemap file can contain a maximum of 50,000 URLs.

Besides, whenever a user deletes their ad (for example after selling the item), I need its URL to be removed from the sitemap as well.

I already have a script that generates XML sitemaps from my database, but it overwrites the existing sitemaps and rebuilds everything every time a user posts an ad.

Is it even possible to edit an XML file with PHP at this level? For example, if I could count the entries in an XML file, I would know when the limit (50,000) is reached and a new file has to be created. Also, if I could read XML files and search for entries, I could delete ads from them.

But is that possible?

Code snippets or pointers to which methods to use are appreciated!

Thanks

+1  A: 

You could simply use SimpleXML to open the sitemap and then do the following:

  1. Iterate the elements
  2. If you find the element, update it (url, last changed, etc.)
  3. If you don't find it, append it.

This would of course have to be modified a bit for the multiple-sitemap situation. Furthermore, you could use XPath to search your files. Note, however, that doing this kind of XML work can be quite slow.
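A minimal sketch of the three steps above, plus removing an entry when an ad is deleted. The function names (`findUrl`, `upsertUrl`, `removeUrl`) and the example URLs are hypothetical; the code assumes a standard sitemaps.org `<urlset>` document and the SimpleXML and DOM extensions.

```php
<?php
// Standard sitemaps.org namespace; the helper names below are made up for this sketch.
const SITEMAP_NS = 'http://www.sitemaps.org/schemas/sitemap/0.9';

// Step 1: iterate the <url> elements and return the one whose <loc> matches.
function findUrl(SimpleXMLElement $sitemap, string $loc): ?SimpleXMLElement
{
    foreach ($sitemap->url as $url) {
        if ((string) $url->loc === $loc) {
            return $url;
        }
    }
    return null;
}

// Steps 2 and 3: update <lastmod> if the URL exists, otherwise append a new <url>.
function upsertUrl(SimpleXMLElement $sitemap, string $loc, string $lastmod): void
{
    $url = findUrl($sitemap, $loc);
    if ($url === null) {
        $url = $sitemap->addChild('url', null, SITEMAP_NS);
        // Property assignment escapes "&"; the value argument of addChild() does not.
        $url->loc = $loc;
    }
    $url->lastmod = $lastmod;
}

// Deleting an ad: detach the matching <url> node through the DOM extension.
function removeUrl(SimpleXMLElement $sitemap, string $loc): bool
{
    $url = findUrl($sitemap, $loc);
    if ($url === null) {
        return false;
    }
    $node = dom_import_simplexml($url);
    $node->parentNode->removeChild($node);
    return true;
}

// Example run on an in-memory sitemap (use simplexml_load_file() for a real file).
$sitemap = new SimpleXMLElement('<urlset xmlns="' . SITEMAP_NS . '"/>');
upsertUrl($sitemap, 'https://example.com/ad/1', '2024-01-01');
upsertUrl($sitemap, 'https://example.com/ad/2', '2024-01-02');
removeUrl($sitemap, 'https://example.com/ad/1');
echo count($sitemap->url), "\n"; // number of <url> entries left
```

To persist the changes you would call `$sitemap->asXML($path)` afterwards. For the multiple-sitemap case, `count($sitemap->url)` tells you whether the current file has reached 50,000 entries. Every lookup here scans the whole file, which is the slowness mentioned above.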

I therefore think you should consider the possibility of regenerating your entire sitemap at regular intervals (say every 12 or 24 hours), because the search engines will be fetching your sitemap very rarely.

phidah
Interesting... Tell me, do I need to submit my sitemap to Google every time it is updated? Or does submitting once suffice, with Google picking up later changes automatically?
Camran
Submitting the sitemap once is fine. Google will then fetch the sitemap at regular intervals to keep up.
phidah
Google will actually try to detect a sitemap even if it's not submitted, much like it will spider pages whether or not they are in a sitemap. Also, some sitemap generators add a link to the sitemap in the robots.txt file.
adam
Sure, but Google will find it faster if you submit it. Furthermore, you'll be able to see some nice statistics and do some debugging via Google Webmaster Tools. But that is not relevant to the question; what's relevant is that no matter how often you regenerate your sitemap, Google will only fetch it at its own pace.
phidah
A: 

Considering the overhead of adding to or deleting from this file each time an ad is added/deleted, I'd stick with your existing script (which rebuilds the sitemap from scratch) and set it to run once every night, at say midnight. You won't be losing out, as the search engines won't fetch your sitemap more than once a day at most.
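A rough sketch of what such a nightly rebuild could look like, splitting the output into files of at most 50,000 URLs and writing a sitemap index that lists them. The function names, file names, and example URLs are hypothetical, and `$urls` stands in for whatever your MySQL query returns; the code assumes the XMLWriter extension.

```php
<?php
// Standard sitemaps.org namespace; everything else here is made up for the sketch.
const SITEMAP_NS = 'http://www.sitemaps.org/schemas/sitemap/0.9';

// Open a new XML file and write its root element (<urlset> or <sitemapindex>).
function openXml(string $path, string $root): XMLWriter
{
    $writer = new XMLWriter();
    $writer->openUri($path);
    $writer->startDocument('1.0', 'UTF-8');
    $writer->startElement($root);
    $writer->writeAttribute('xmlns', SITEMAP_NS);
    return $writer;
}

// Close the root element and flush the buffer to disk.
function closeXml(XMLWriter $writer): void
{
    $writer->endElement();
    $writer->endDocument();
    $writer->flush();
}

// Write all URLs into numbered sitemap files of at most $limit entries each,
// then write an index file pointing at them. Returns the sitemap file names.
function writeSitemaps(iterable $urls, string $dir, string $baseUrl, int $limit = 50000): array
{
    $files = [];
    $writer = null;
    $count = 0;
    foreach ($urls as $loc) {
        if ($writer === null || $count === $limit) {
            if ($writer !== null) {
                closeXml($writer);
            }
            $files[] = sprintf('sitemap-%d.xml', count($files) + 1);
            $writer = openXml($dir . '/' . end($files), 'urlset');
            $count = 0;
        }
        $writer->startElement('url');
        $writer->writeElement('loc', $loc);
        $writer->endElement();
        $count++;
    }
    if ($writer !== null) {
        closeXml($writer);
    }

    $index = openXml($dir . '/sitemap-index.xml', 'sitemapindex');
    foreach ($files as $file) {
        $index->startElement('sitemap');
        $index->writeElement('loc', $baseUrl . '/' . $file);
        $index->endElement();
    }
    closeXml($index);
    return $files;
}

// Example run with a tiny limit so the splitting is visible.
$dir = sys_get_temp_dir();
$urls = [];
for ($i = 1; $i <= 5; $i++) {
    $urls[] = 'https://example.com/ad/' . $i;
}
$files = writeSitemaps($urls, $dir, 'https://example.com', 2);
print_r($files);
```

Run from cron once a night, this only needs the index URL to be submitted once, since the index lists whichever sitemap files currently exist.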

adam
Same follow-up question as above: do I need to submit my sitemap to Google every time it is updated? Or does submitting once suffice, with Google picking up later changes automatically?
Camran