Short story:
My site pre-generates pages from user-submitted data. Sometimes this cache has to be cleared, and when that happens the regeneration would kill a supercomputer unless I limited how many stats pages are generated at once.
The problem:
Now come the search engine bots, which hit the site constantly (because of the sheer number of pages, some search engine bot is crawling pretty much all the time). The problem is that they use up all my "generate" slots, and real users are left with a page saying "bla bla, please wait".
Possible solution:
Can I basically return a 503 to the bots without them giving me a negative ranking for having an unstable site?
Or has someone come up with another solution?
views: 71
answers: 1
+1
A:
How critical is it that the cache is cleared immediately? If your cache supports it, you could instead mark all the cached pages as 'dirty' and regenerate them when a real user next visits; if a bot visits in the meantime, serve them the stale page.
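A minimal sketch of that idea in Python, assuming an in-memory cache and a crude user-agent check. `PageCache`, `is_bot`, and `BOT_MARKERS` are hypothetical names made up for illustration, and the marker list is deliberately incomplete:

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical, incomplete list of user-agent substrings treated as bots.
BOT_MARKERS = ("googlebot", "bingbot", "slurp")


def is_bot(user_agent: str) -> bool:
    """Crude bot check: case-insensitive substring match on the user agent."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


@dataclass
class Entry:
    html: str
    dirty: bool = False


class PageCache:
    def __init__(self, generate: Callable[[str], str]) -> None:
        self._generate = generate          # the expensive page generator
        self._entries: Dict[str, Entry] = {}

    def mark_all_dirty(self) -> None:
        """Flag every page as stale instead of deleting it."""
        for entry in self._entries.values():
            entry.dirty = True

    def get(self, key: str, user_agent: str) -> str:
        entry = self._entries.get(key)
        # Fresh pages go to everyone; stale pages go to bots only.
        if entry is not None and (not entry.dirty or is_bot(user_agent)):
            return entry.html
        # A real user hit a stale page, or the page was never generated.
        html = self._generate(key)
        self._entries[key] = Entry(html)
        return html
```

With this shape, clearing the cache becomes `mark_all_dirty()`, and bots only cost a "generate" slot for pages that have never been built at all. The existing limit on concurrent generation would still wrap the `_generate` call.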
Andrew Aylett
2009-12-18 13:48:47
Basically, the cache becomes outdated/dirty when I do big site upgrades that require more information in the cache. I have considered having two cache tables and generating the new cache before I publish the new site version.
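The two-table idea could look roughly like this. It is only a sketch, using in-memory dicts in place of real cache tables; `VersionedCache`, `build_staging`, and `publish` are hypothetical names:

```python
from typing import Callable, Dict, Iterable, Optional


class VersionedCache:
    """Serve from a live table while a second table is rebuilt offline."""

    def __init__(self) -> None:
        self._live: Dict[str, str] = {}
        self._staging: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        # Visitors (and bots) only ever read the live table.
        return self._live.get(key)

    def build_staging(self, keys: Iterable[str],
                      generate: Callable[[str], str]) -> None:
        # Regenerate everything into the staging table at whatever pace
        # the server can handle; the live table is untouched meanwhile.
        self._staging = {key: generate(key) for key in keys}

    def publish(self) -> None:
        # Swap the tables when the new site version goes live.
        self._live, self._staging = self._staging, {}
```

The point of the swap is that nobody waits on generation at release time: the old pages keep being served until `publish()` is called alongside the new site version.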
EKS
2009-12-18 13:56:25