views: 1327 · answers: 4

Hey guys,

Consider the following Python code:

    import urllib.request

    url = "http://www.google.com/search?hl=en&safe=off&q=Monkey"
    url_object = urllib.request.urlopen(url)
    print(url_object.read())

When this is run, an Exception is thrown:

File "/usr/local/lib/python3.0/urllib/request.py", line 485, in http_error_default raise HTTPError(req.get_full_url(), code, msg, hdrs, fp) urllib.error.HTTPError: HTTP Error 403: Forbidden

However, when this is put into a browser, the search returns as expected. What's going on here? How can I overcome this so I can search Google programmatically?

Any thoughts? --Shafik

A: 

You're doing it too often. Google has limits in place to prevent getting swamped by search bots. You can also try setting the user-agent to something that more closely resembles a normal browser.

Joel Coehoorn
I have only tried twice today.
AgentLiquid
Wrong answer. It blocks on the first attempt.
nosklo
That's right, the user-agent makes all the difference.
Evgeny
+14  A: 

If you want to do Google searches "properly" through a programming interface, take a look at Google APIs. Not only are these the official way of searching Google, they are also not likely to change if Google changes their result page layout.
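
For example, a minimal sketch of querying the AJAX Search API over plain HTTP (assuming the ajax.googleapis.com web search endpoint and its v=1.0/q parameters; the exact shape of the JSON response may differ):

    import json
    import urllib.parse
    import urllib.request

    # Build the query string for the (assumed) web search endpoint
    query = urllib.parse.urlencode({'v': '1.0', 'q': 'Monkey'})
    url = 'http://ajax.googleapis.com/ajax/services/search/web?' + query

    # The API returns JSON rather than an HTML result page
    response = urllib.request.urlopen(url)
    results = json.loads(response.read().decode('utf-8'))
    print(results)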

lacqui
Do you have any idea what's going on under the hood, though? I'm curious ... why doesn't url.read() look like a standard browser read?
AgentLiquid
what sort of moron would vote this post "offensive"?
Paul Tomblin
Instead of going through the web interface, these APIs directly access the search XML. They connect to a different page at Google, which gives you data in a different format. Basically, you were getting 403 because you weren't allowed to access the data the way you were, and Google knew it (...)
lacqui
(...) because your app either (a) didn't send a User-Agent string or (b) sent a default one that Google recognized as a robot (see http://google.com/robots.txt)
lacqui
Awesome explanation, thank you.
AgentLiquid
The problem with their APIs is that they don't return the same results as google.com. See http://code.google.com/p/google-ajax-apis/issues/detail?id=43
Anders Rune Jensen
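
To see the robots.txt point from the comments above in practice, the standard library can check whether a given user-agent is allowed to fetch /search (a quick sketch using urllib.robotparser):

    import urllib.robotparser

    rp = urllib.robotparser.RobotFileParser()
    rp.set_url('http://www.google.com/robots.txt')
    rp.read()

    # Google's robots.txt disallows /search for generic crawlers,
    # so this will typically print False
    print(rp.can_fetch('Python-urllib/3.0', 'http://www.google.com/search?q=Monkey'))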
+4  A: 

This should do the trick:

    import urllib2

    user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7'
    url = "http://www.google.com/search?hl=en&safe=off&q=Monkey"
    headers = {'User-Agent': user_agent}

    request = urllib2.Request(url, None, headers)  # the assembled request
    response = urllib2.urlopen(request)
    data = response.read()  # the data you need
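
In Python 3, where urllib2 has become urllib.request (as in the question's code), the same idea looks roughly like this (a sketch; only the module name changes):

    import urllib.request

    user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7'
    url = "http://www.google.com/search?hl=en&safe=off&q=Monkey"
    headers = {'User-Agent': user_agent}

    # Build the request with a browser-like User-Agent so Google serves the page
    request = urllib.request.Request(url, None, headers)
    response = urllib.request.urlopen(request)
    data = response.read()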
Could you please format your code? (Just select it and press ctrl-k.)
Stephan202