I have the following:

string html_string = "http://www.google.com/search?sourceid=chrome&ie=UTF-8&q=pharma";
string html;
html = new WebClient().DownloadString(html_string);

When I check the length of html, it contains only the first 28435 characters.

Is it possible that Google is not allowing WebClient access?

+2  A: 

I've tried this snippet and it returned exactly the same HTML as a browser does. The only correction I would make is to dispose of the WebClient, since it is disposable:

// requires: using System.Net;
string html_string = "http://www.google.com/search?sourceid=chrome&ie=UTF-8&q=pharma";
string html; // declared outside the using block so it stays usable afterwards
using (var client = new WebClient())
{
    html = client.DownloadString(html_string);
}
Darin Dimitrov
Thank you very much, works great!
I__
+3  A: 

No, see the TOS:

5.3 You agree not to access (or attempt to access) any of the Services by any means other than through the interface that is provided by Google, unless you have been specifically allowed to do so in a separate agreement with Google. You specifically agree not to access (or attempt to access) any of the Services through any automated means (including use of scripts or web crawlers) and shall ensure that you comply with the instructions set out in any robots.txt file present on the Services.

toscho
Why the downvote without any explanation? My answer addresses exactly the question, doesn’t it?
toscho
+1  A: 

From experience with search results: they can and will shut you down if they detect a robot.

kenny
+2  A: 

If you're writing a bot, it won't work; they'll eventually block you.

You might want to look at their list of APIs, especially Custom Search, and see if that helps.
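As a rough illustration of the Custom Search route, here is a minimal sketch that queries the Custom Search JSON API instead of scraping the results page. It assumes you have already obtained an API key and a search engine ID from Google; the class name CustomSearchSketch and the helper names BuildUrl and Search are made up for this example, and the response comes back as JSON text that you would still need to parse.

```csharp
using System;
using System.Net;

static class CustomSearchSketch
{
    // Builds the request URL for the Custom Search JSON API.
    // apiKey and engineId are placeholders (assumptions): you must
    // supply your own values obtained from Google.
    public static string BuildUrl(string apiKey, string engineId, string query)
    {
        return "https://www.googleapis.com/customsearch/v1"
            + "?key=" + Uri.EscapeDataString(apiKey)
            + "&cx=" + Uri.EscapeDataString(engineId)
            + "&q=" + Uri.EscapeDataString(query);
    }

    // Downloads the raw JSON response for the given query.
    public static string Search(string apiKey, string engineId, string query)
    {
        using (var client = new WebClient())
        {
            return client.DownloadString(BuildUrl(apiKey, engineId, query));
        }
    }
}
```

Calling CustomSearchSketch.Search("YOUR_KEY", "YOUR_ENGINE_ID", "pharma") would then return the JSON for that query, subject to whatever quota applies to your key.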

Dean J
A: 

It will of course be different. The page a browser receives may contain lots of extra characters that deal with the logged-in user, bot-related code, and many other scripts.

When you fetch the data through code, the search is performed as a non-Google user (or a signed-out user, if you like). This is the easiest explanation.
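To see for yourself what the server sends to a plain WebClient, a small diagnostic like the following may help. It is only a sketch: the class name FetchDiagnostics and the helpers Summarize and PrintSummary are invented for this example, and the numbers it prints will vary from request to request.

```csharp
using System;
using System.Net;

static class FetchDiagnostics
{
    // Pure helper: summarizes a response so that different runs
    // (or a browser's "view source") can be compared.
    public static string Summarize(string html, string contentType)
    {
        return string.Format("length={0}, content-type={1}",
            html.Length, contentType ?? "(none)");
    }

    // Downloads the page anonymously and prints the summary.
    public static void PrintSummary(string url)
    {
        using (var client = new WebClient())
        {
            string html = client.DownloadString(url);
            // ResponseHeaders is populated once the request has completed.
            Console.WriteLine(Summarize(html, client.ResponseHeaders["Content-Type"]));
        }
    }
}
```

Comparing that length against the size of the page source saved from a browser shows how much the two responses differ.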

I am afraid Darin's answer will not work, at least not all the time. It's not fail-proof.

Yes, of course, your activity will be detected as a bot rather than a human, so beware of the consequences.

Nayan