Apologies if this is too ignorant a question or has been asked before. A cursory search did not turn up anything matching it exactly. The question is: how can I download all the Word documents that Google has indexed? It would be a daunting task indeed to do it by hand... Thanks for any pointers.
A:
I'm afraid there is no legal way to do it. Google used to offer a SOAP API for its web search, but it is deprecated and due to be shut down this summer. It was also limited to 1000 queries a day.
Google currently provides an Ajax Search API, but it does not solve your problem, because the largest result set it returns contains 8 results.
Finally, there is the standard web form at google.com, which you are not allowed to query programmatically. (There is also the limitation that Google returns only the first thousand results for a query; you cannot see more.)
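To put the thousand-result cap in perspective, here is a rough back-of-the-envelope sketch in Python. It only does arithmetic and does not contact Google. The function name and the target of one million documents are made up for illustration; the only figure taken from above is the cap of 1000 visible results per query.

```python
import math

# From the answer above: Google shows at most the first 1000 results per query.
RESULT_CAP_PER_QUERY = 1000


def min_distinct_queries(target_documents: int, cap: int = RESULT_CAP_PER_QUERY) -> int:
    """Lower bound on the number of distinct queries needed to see
    target_documents results, assuming (optimistically) that no two
    queries ever return the same document."""
    return math.ceil(target_documents / cap)


if __name__ == "__main__":
    # Hypothetical target: one million indexed Word documents.
    # The real number is unknown; this value is an assumption.
    hypothetical_target = 1_000_000
    print(min_distinct_queries(hypothetical_target))
```

Even under the unrealistic assumption that the queries never overlap, the number of queries grows linearly with the number of documents you want, and in practice the overlap between result sets would push it much higher.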
If you want to build a service on this, you can contact Google and arrange a partnership with them.
Török Gábor
2009-05-24 14:32:03