Apologies if this question is naive or has been asked before; a cursory search did not turn up an exact match. How can I download all the Word documents that Google has indexed? Doing it by hand would be a daunting task. Thanks for any pointers.

+2  A: 

I'm afraid there is no legal way to do it. Google formerly offered a SOAP API for its web search, but it is deprecated and due to be shut down this summer. It was limited to 1000 queries a day.

Google currently provides an Ajax Search API, but it does not solve your problem, because the largest result set it returns contains 8 results.

Finally, there is the standard web form at google.com, but querying it programmatically is prohibited. (It also has the limitation that Google returns only the first thousand results; you cannot see any beyond that.)
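To give a feel for what the thousand-result cap means in practice, here is a rough back-of-the-envelope sketch. It only does arithmetic and does not contact Google. The function name `min_queries_needed` and the document count in the example are made up for illustration; the only figure taken from above is the cap of 1000 results per query.

```python
# Back-of-the-envelope arithmetic only; nothing here talks to Google.
RESULT_CAP_PER_QUERY = 1000  # the "first thousand results" limit mentioned above


def min_queries_needed(total_documents: int, cap: int = RESULT_CAP_PER_QUERY) -> int:
    """Lower bound on the number of distinct queries needed to see every
    document at least once, if each query exposes at most `cap` results.

    This is a best case: it assumes no two queries ever return the same
    document, which real searches would not achieve.
    """
    if total_documents < 0:
        raise ValueError("total_documents must be non-negative")
    if cap <= 0:
        raise ValueError("cap must be positive")
    # Integer ceiling division, avoiding floating point.
    return (total_documents + cap - 1) // cap


if __name__ == "__main__":
    # Hypothetical figure, chosen only to show the scale of the problem.
    hypothetical_total = 2_500_000
    print(min_queries_needed(hypothetical_total))
```

Even in this idealized case you would have to invent thousands of non-overlapping queries, so the cap alone makes a complete download impractical.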

If you want to build a service on this, you can contact Google and arrange a partnership with them.

Török Gábor