ansaurus

Question

Returning only No. of Google Search Results via Python

Answer 1

A:

You can use urllib to download the results page and HTMLParser to extract the contents of the <div id="resultStats">....</div> element. Here is an example:

http://stackoverflow.com/questions/3276040/how-can-i-use-the-python-htmlparser-library-to-extract-data-from-a-specific-div-t
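
As a rough sketch of the approach described above (not the code from the linked answer), the parser below collects the text inside the resultStats div and pulls the first number out of it. It is written for Python 3, where HTMLParser lives in html.parser. The class name ResultStatsParser, the helper extract_result_count, and the sample HTML string are all made up for illustration; the markup Google actually returns may differ.

```python
from html.parser import HTMLParser
import re


class ResultStatsParser(HTMLParser):
    """Collect the text inside <div id="resultStats">...</div>."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # how many <div> levels deep we are inside the target div
        self.chunks = []  # text fragments found inside the target div

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth:
            self.depth += 1  # nested div inside the target
        elif dict(attrs).get("id") == "resultStats":
            self.depth = 1   # entering the target div

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def extract_result_count(html):
    """Return the first number found in the resultStats div, or None."""
    parser = ResultStatsParser()
    parser.feed(html)
    parser.close()
    match = re.search(r"\d[\d,.]*", "".join(parser.chunks))
    if match is None:
        return None
    # Drop thousands separators before converting.
    return int(re.sub(r"\D", "", match.group()))


# Hypothetical sample page, only to exercise the parser offline.
sample = ('<html><body><div id="resultStats">About 168,000,000 results'
          '<nobr> (0.25 seconds)</nobr></div></body></html>')
print(extract_result_count(sample))
```

Feeding it the downloaded page instead of sample should work the same way, as long as the div id is still resultStats.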

zoli2k 2010-07-26 11:47:23

It's worth mentioning that you'll have to spoof the browser's user-agent string when using urllib, and that Google frowns upon automated queries...

Wayne Werner 2010-07-26 12:27:37
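
For the user-agent point in the comment above, a minimal Python 3 sketch of setting the header with urllib.request might look like this. The helper name build_search_request and the "Mozilla/5.0" string are placeholders chosen for illustration, and the request is only built here, not sent.

```python
import urllib.parse
import urllib.request


def build_search_request(query, user_agent="Mozilla/5.0"):
    """Build (but do not send) a search request with a custom User-Agent."""
    url = "http://www.google.com/search?" + urllib.parse.urlencode({"q": query})
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


request = build_search_request("cars")
print(request.full_url)
# Request stores header names capitalized as "User-agent".
print(request.get_header("User-agent"))

# Sending it would look like this (left commented out on purpose):
# html = urllib.request.urlopen(request).read().decode("utf-8", errors="replace")
```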

Thanks, this answer also helps me solve something else I was stuck on.

subiet 2010-07-28 04:10:24

Answer 2

A:

Take a look at Alex Martelli's example.

If you search for something vague like "cars", data will look something like the following. Notice that it isn't very long; you only get the top few hits, and a link to "moreResultsUrl". Therefore, it should be reasonably fast to make this query and look in data['cursor']['estimatedResultCount'] for the estimated number of hits.

{'cursor': {'currentPageIndex': 0,
            'estimatedResultCount': '168000000',
            'moreResultsUrl': 'http://www.google.com/search?oe=utf8&ie=utf8&source=uds&start=0&hl=en&q=cars',
            'pages': [{'label': 1, 'start': '0'},
                      {'label': 2, 'start': '4'},
                      {'label': 3, 'start': '8'},
                      {'label': 4, 'start': '12'},
                      {'label': 5, 'start': '16'},
                      {'label': 6, 'start': '20'},
                      {'label': 7, 'start': '24'},
                      {'label': 8, 'start': '28'}]},
 'results': [ <<list of 4 dicts>> ]}
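
Reading the count out of a response shaped like the one above could be sketched as follows. Note that estimatedResultCount arrives as a string, so it needs an int() conversion. The function name estimated_hits is hypothetical, the sample dict is a trimmed copy of the output shown here, and returning None when the key is missing is an assumption about how you would want to handle that case.

```python
def estimated_hits(data):
    """Return the estimated hit count as an int, or None if it is absent."""
    count = data.get("cursor", {}).get("estimatedResultCount")
    return int(count) if count is not None else None


# Trimmed copy of the response shown above, with the results list left empty.
sample = {
    "cursor": {
        "currentPageIndex": 0,
        "estimatedResultCount": "168000000",
        "pages": [{"label": 1, "start": "0"}, {"label": 2, "start": "4"}],
    },
    "results": [],
}

print(estimated_hits(sample))
```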

unutbu 2010-07-26 11:52:22
