There is a site with thousands of pages that I want to recover from Google Cache. Is there any way to get it back quickly, using Google Cache or some other web crawler/archiver?
A:
You can see what Google (still) knows about a website by using a site: restriction:
http://www.google.com/search?q=site:[domain]
You might also check out the Internet Archive.
(In either case, you’d probably want to do some heavy-duty automating to fetch thousands of pages.)
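For the Internet Archive side, the automation could look roughly like the sketch below. It assumes the Wayback Machine's CDX query endpoint (which lists captured URLs for a domain) and its `url`, `output`, `fl`, `collapse` and `limit` parameters; the function names are made up for illustration, and `example.com` stands in for the real domain. It only builds the list of snapshot URLs; downloading each one, politely and with delays, is left out.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the Wayback Machine CDX endpoint, not something named in this thread.
CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"


def cdx_query_url(domain, limit=None):
    """Build a CDX query that lists captured URLs under a domain."""
    params = {
        "url": domain + "/*",        # every path under the domain
        "output": "json",            # rows as JSON arrays, header row first
        "fl": "timestamp,original",  # only the fields needed below
        "collapse": "urlkey",        # one row per distinct URL
    }
    if limit is not None:
        params["limit"] = str(limit)
    return CDX_ENDPOINT + "?" + urllib.parse.urlencode(params)


def snapshot_urls(cdx_rows):
    """Turn CDX rows (first row is the header) into Wayback snapshot URLs."""
    header, *rows = cdx_rows
    ts = header.index("timestamp")
    orig = header.index("original")
    return ["http://web.archive.org/web/%s/%s" % (r[ts], r[orig]) for r in rows]


def list_snapshots(domain, limit=None):
    """Fetch the CDX listing over the network and return snapshot URLs."""
    with urllib.request.urlopen(cdx_query_url(domain, limit)) as resp:
        rows = json.load(resp)
    # An empty listing has no header row to unpack.
    return snapshot_urls(rows) if rows else []
```

Something like `list_snapshots("example.com", limit=50)` would be a reasonable first try before asking for thousands of rows.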
I was going to use Warrick: http://warrick.cs.odu.edu/ But alas, its servers are too busy. And the Internet Archive only saves pages after 6 months.
stockoverflow
2010-08-08 17:20:37