A friend accidentally deleted his forum database. That wouldn't normally be a huge issue, except that he neglected to make backups, so two years of content is just plain gone. Obviously, he's learned his lesson.

The good news is that Google keeps backups, even if individual site owners are idiots. The bad news is that traditional crawling robots would choke on the Google Cache version of the website.

Is there an existing tool that would help trawl the Google Cache, or how would I go about rolling my own?

+1  A: 

This article might be of some help to your friend: http://www.smartmoneydaily.com/business/how-the-google-cache-can-save-your-a.aspx

Andy May
d8uv
+1  A: 

You may want to consider crawling the archive.org cache (the Wayback Machine) as well. If the site is in there, it's generally better structured.
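
If you end up rolling your own for archive.org, here is a rough Python sketch of how it could start. It assumes the Wayback Machine's CDX endpoint and the `id_` snapshot modifier (which should return the page as originally captured, without the archive toolbar) behave as I remember. The function names and `forum.example.com` are made up for illustration, so treat this as a starting point and not a finished crawler.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the Wayback Machine CDX index.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(domain):
    """Build a CDX query URL listing archived pages for a domain."""
    params = {
        "url": domain,
        "matchType": "domain",        # include the host and its subdomains
        "output": "json",
        "fl": "timestamp,original",   # only the fields we need
        "filter": "statuscode:200",   # skip redirects and errors
        "collapse": "urlkey",         # one capture per distinct URL
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def parse_cdx_rows(payload):
    """Turn the CDX JSON payload (header row + data rows) into dicts."""
    rows = json.loads(payload)
    if not rows:
        return []
    header, *records = rows
    return [dict(zip(header, record)) for record in records]


def snapshot_url(timestamp, original):
    """URL of the raw archived copy of one page (assumes the id_ modifier)."""
    return f"https://web.archive.org/web/{timestamp}id_/{original}"


def list_snapshots(domain):
    """Fetch and parse the capture list. This one hits the network."""
    with urlopen(build_cdx_query(domain)) as response:
        return parse_cdx_rows(response.read().decode("utf-8"))
```

From there you would loop over `list_snapshots("forum.example.com")`, fetch `snapshot_url(row["timestamp"], row["original"])` for each row, and save the HTML. Put a delay between requests so you don't hammer the archive.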

singpolyma