I have a question regarding the performance of the .NET HttpWebRequest client (WebClient gives similar results).

If I use HttpWebRequest to request an HTML page (in this case news.bbc.co.uk) and measure, using HttpAnalyzer, how quickly the application reads the response, it is significantly slower than a browser (Firefox, Chrome, IE) requesting the same resource (all caches cleared, etc.). The .NET application takes approximately 1.7 seconds versus 0.2 to 0.3 seconds for a browser.

Is this purely down to the speed and efficiency of the code / application or are there any other factors to consider?

Code as follows:

HttpWebRequest request = null;

Uri uriTest = new Uri("http://news.bbc.co.uk");

request = (HttpWebRequest)WebRequest.Create(uriTest);

request.Method = "GET";
request.KeepAlive = true;
request.Headers["Accept-Encoding"] = "gzip, deflate";

HttpWebResponse response = (HttpWebResponse)request.GetResponse();

response.Close();
+1  A: 

Have you watched the network while using the browser? Perhaps the browser is using cached resources?

John Saunders
I've confirmed that the browser is not using cached resources.
Chris
I saw that part of your question. I meant the inverse - have you confirmed what _is_ actually coming across the wire? Compare _that_ between the two scenarios.
John Saunders
Also, you're only counting until the response. You should get `response.GetResponseStream()` and read the whole thing.
John Saunders
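A rough sketch of what timing the full read might look like, so that the measurement covers the body and not just the headers. The class name FullReadTiming and the helper Drain are made up for illustration, and the 8 KB buffer size is an arbitrary assumption:

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Net;

static class FullReadTiming
{
    // Hypothetical helper: reads the stream to its end and returns the byte count.
    public static long Drain(Stream stream)
    {
        var buffer = new byte[8192]; // buffer size is an arbitrary choice
        long total = 0;
        int read;
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            total += read;
        }
        return total;
    }

    static void Main()
    {
        var request = (HttpWebRequest)WebRequest.Create("http://news.bbc.co.uk");
        Stopwatch watch = Stopwatch.StartNew();
        using (var response = (HttpWebResponse)request.GetResponse())
        using (Stream body = response.GetResponseStream())
        {
            // GetResponse returns once the headers have arrived.
            long headersMs = watch.ElapsedMilliseconds;
            long bytes = Drain(body);
            watch.Stop();
            Console.WriteLine("headers after {0} ms, {1} bytes read after {2} ms",
                headersMs, bytes, watch.ElapsedMilliseconds);
        }
    }
}
```

Comparing the two numbers should show whether the time is going into waiting for the first response or into transferring the body.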
A: 

I'm seeing the exact same thing as you. It's a few seconds to load the page in the browser (nothing cached, never visited the page before), yet I'm seeing ~17 s to get a response using your code (I used Stopwatch to measure the time it took to get a response). If you debug your app, you'll see something like this:

'ConsoleApplication1.vshost.exe' (Managed): Loaded 'cmJPpSOq'
'ConsoleApplication1.vshost.exe' (Managed): Loaded 'cmJPpSOq.dll' // this is where it waits nearly 15 s every time
'ConsoleApplication1.vshost.exe' (Managed): Loaded 'JScript Thunk Assembly'
'ConsoleApplication1.vshost.exe' (Managed): Loaded 'JScript Thunk Module'

This seems to correspond to the line HttpWebResponse response = (HttpWebResponse)request.GetResponse(). I've used something similar in my own code and experienced the same slowdown. I've yet to find a good explanation for it.

alex
+1  A: 

I'd jack Fiddler in the middle, run the browser request and the .NET request one after the other and make sure you're really getting what you think. It's possible there's redirection or something else hinky going on (maybe the browser is appending the '/' up front while .NET waits for the redirect, etc.) that isn't immediately visible. I've built huge apps on the .NET HTTP client with nothing like what you describe, so something else must be going on.

What happens if you stick '/' on the end of the URL?

nitzmahone
Adding '/' makes no difference. Will use Fiddler to confirm timings..
Chris
+1, Fiddler will show you what's transferring, if it's gzip'd, etc.
orip
A: 

It may be that bbc.co.uk checks the User-Agent header that is passed to it and handles the response based on that. If it sees an automated client it responds slowly, whereas if it believes there is a real person at the end of the line it speeds up. If you really want to try it out, just tell the HttpWebRequest to pass a different header.

Andrew Cox
Added the same User-Agent as Firefox with the same results..
Chris
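For anyone wanting to repeat the User-Agent test, a minimal sketch follows. On the .NET Framework, User-Agent is a restricted header, so it is set through the UserAgent property rather than through request.Headers. The class name, the CreateRequest helper and the agent string are placeholders, not a real browser's value:

```csharp
using System;
using System.Net;

static class UserAgentExample
{
    // Hypothetical helper: builds a GET request that sends the given User-Agent.
    public static HttpWebRequest CreateRequest(string url, string userAgent)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";
        // Use the dedicated property; the Headers collection rejects
        // restricted headers such as User-Agent on the .NET Framework.
        request.UserAgent = userAgent;
        return request;
    }

    static void Main()
    {
        // Placeholder agent string; copy the real one from your browser.
        var request = CreateRequest("http://news.bbc.co.uk", "Mozilla/5.0 (placeholder)");
        using (var response = (HttpWebResponse)request.GetResponse())
        {
            Console.WriteLine((int)response.StatusCode);
        }
    }
}
```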
+1  A: 
Maxwell Troy Milton King
+1  A: 

If you make two requests does the second one happen more quickly?

I have also noticed speed disparities between browsers and WebClient or WebRequest. Even the raw speed of the response can be drastically different, but not all the time!

There are a few things this could be caused by:

  • It could be all the .Net bootstrapping that happens. .Net assemblies aren't loaded and JITted until they are used, therefore you can see significant speed degradation on the initial call to a piece of code even if the application itself has been running for ages. Okay - so the .Net framework itself is nGen'd - but there's still the bridge between your code and the .Net framework to build on the fly.

  • Just checking that you're running without the debugger attached and that you definitely don't have symbol server switched on. Symbol server and VS interrupt programs while the symbols are downloaded, slowing them down bucket-loads. Sorry if this is an insult ;)

  • Browsers are coded to make efficient use of only a few underlying sockets; and they will be opened and primed as soon as the browser is there. 'Our' code that uses .Net WebClient/WebRequest is totally inefficient in comparison, as everything is initialised anew each time.

  • There are a lot of platform resources associated with networking, and whilst .NET makes it much easier to code with networks, it's still bound to the same platform resource issues. Ergo, the closer you are to the platform, the faster some code will be. IE and Firefox et al are native and can therefore throw around system resources natively; .NET isn't, and therefore some marshalling (= slow) is required to set things up. Obviously, once a port is opened and being used, .NET is still no slouch; but it would almost never be as fast as well-written non-marshalled native code.

Andras Zoltan
+1  A: 

Run the application with Ctrl+F5 instead of F5 (Debug mode). You will see a difference:

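using System;
using System.Diagnostics;
using System.Net;

class Program
{
    static void Main()
    {
        using (var client = new WebClient())
        {
            Stopwatch watch = Stopwatch.StartNew();
            // DownloadData reads the whole response body, not just the headers.
            var data = client.DownloadData("http://news.bbc.co.uk");
            watch.Stop(); // stop timing before printing the result
            Console.WriteLine("{0} ms", watch.ElapsedMilliseconds);
        }
    }
}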

Prints 880 ms on my PC.

Darin Dimitrov
A: 

Whenever you measure anything, you have to account for the startup costs. If your .NET code runs in a single process and you only measure a single request, then your measurement will be tainted by the first-time costs of initializing assemblies, types, etc.

As Darin and others have suggested, you should make sure that:

1) You are not running the process under the debugger.
2) You account for startup costs.

One way you can do #2 is to make two requests and only measure the second one. Or you can make N requests, discard the first one, and take the average of the last N-1 requests. Also make sure that you read the entity stream.

feroze
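The "make N requests, discard the first, average the rest" idea could be sketched roughly like this. The names WarmRequestTiming, AverageExcludingFirst and TimeOneDownload are invented for the example, and the sample count of 5 is an arbitrary assumption:

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Net;

static class WarmRequestTiming
{
    // Hypothetical helper: mean of all samples except the first,
    // which is the one carrying the startup costs.
    public static double AverageExcludingFirst(long[] samplesMs)
    {
        if (samplesMs.Length < 2)
            throw new ArgumentException("Need at least two samples.", "samplesMs");
        return samplesMs.Skip(1).Average();
    }

    // Times one complete download, including reading the entity body.
    static long TimeOneDownload(string url)
    {
        using (var client = new WebClient())
        {
            Stopwatch watch = Stopwatch.StartNew();
            client.DownloadData(url);
            watch.Stop();
            return watch.ElapsedMilliseconds;
        }
    }

    static void Main()
    {
        const int n = 5; // arbitrary sample count
        var samples = new long[n];
        for (int i = 0; i < n; i++)
        {
            samples[i] = TimeOneDownload("http://news.bbc.co.uk");
        }
        Console.WriteLine("first request: {0} ms", samples[0]);
        Console.WriteLine("average of the rest: {0:F0} ms", AverageExcludingFirst(samples));
    }
}
```

Run it outside the debugger (Ctrl+F5) so that point 1 is covered as well.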
A: 

The first time you request a page, .NET tries to detect the proxy settings. The solution is to pass in an empty WebProxy object. That way it connects straight to the remote server instead of auto-detecting a proxy server.

HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uriTest);
request.Proxy = new WebProxy();
Markos