I'm making a program that connects to a website and downloads XML from it. It then displays the information to the user.

The problem I am having is when I first open the program and start downloading the XML information it takes a really long time. When I load another page from the site with the program still open, it takes about half a second to download. I was wondering if there was any way to avoid this.

I currently use an HttpWebRequest to download the stream and a StreamReader to read it. Then I go through and parse the XML using XLINQ.

A: 

You will probably need to measure which part of the request is slow on the first pass. My first guess is the DNS lookup that resolves the domain name to an IP address, because the result isn't cached the first time it runs. It could also be that the web server on the other end has to run some start-up scripts the first time you query it. You mention that the first request takes a long time, but not how long. Is the delay causing a real problem, or is it just an annoyance?
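To test the DNS guess in isolation, the lookup can be timed separately from the HTTP request. This is only a minimal sketch: the class and method names (DnsTiming, TimeLookup) are made up for illustration, and the host name you pass in is whatever your program connects to. Run it right after starting the program, since the operating system caches results and later lookups will look fast.

```csharp
using System.Diagnostics;
using System.Net;

public static class DnsTiming
{
    // Hypothetical helper: returns how many milliseconds it takes
    // to resolve the given host name to its IP addresses.
    public static long TimeLookup(string host)
    {
        Stopwatch timer = Stopwatch.StartNew();
        Dns.GetHostAddresses(host); // performs the DNS query (or hits the OS cache)
        timer.Stop();
        return timer.ElapsedMilliseconds;
    }
}
```

If this reports only a few milliseconds while the first request still takes many seconds, DNS is probably not the cause.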

Kibbee
See my comments above in reply to Guy Starbuck. As for the how long, it varies but usually anywhere from 15 to 45 seconds. It's not a big problem but it is an annoyance, especially when I port the program to Windows Mobile.
Christian
"15 to 45 seconds" there is virtually no way a DNS lookup is taking that long every time your app starts up.
Rex M
+7  A: 

Try explicitly setting the proxy. If you don't have a proxy defined, the HttpWebRequest class will spend time searching for one. Once it has (or hasn't) found one, it will use that information for the life of the application, which is why only the first request is slow.

// Internally sets "ProxySet" to true, so it won't search for a proxy
request.Proxy = null;

You can also disable the default proxy in the application's config file:

<system.net>
  <defaultProxy
    enabled="false"
    useDefaultCredentials="false" >
    <proxy/>
    <bypasslist/>
    <module/>
  </defaultProxy>
</system.net>
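Put together with the download code described in the question, the first approach might look roughly like the sketch below. The class and method names (DirectXmlDownload, CreateDirectRequest, Download) are invented for the example and the URL is a placeholder; only the request.Proxy = null line comes from the suggestion above.

```csharp
using System.IO;
using System.Net;
using System.Xml.Linq;

public static class DirectXmlDownload
{
    // Hypothetical helper: builds a request that skips automatic proxy detection.
    public static HttpWebRequest CreateDirectRequest(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Proxy = null; // no proxy search; connect directly
        return request;
    }

    // Hypothetical helper: downloads the response body and parses it as XML.
    public static XDocument Download(string url)
    {
        HttpWebRequest request = CreateDirectRequest(url);
        using (WebResponse response = request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            return XDocument.Load(reader);
        }
    }
}
```

Note that this bypasses any proxy entirely, so it only suits environments where a direct connection to the site is possible.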
Rex M
Based on what he's said so far this sounds like the best explanation. Curious to see what Christian finds.
JoshBerke
This worked! Thank you so much :)
Christian
+1  A: 

The first time delay can be due to a combination of the following:

  1. Time to resolve the server DNS entry
  2. Time to download the proxy autoconfig script, compile and execute it to determine the effective proxy
  3. Network latency from your app to the proxy server (if there is a proxy server in your environment)
  4. Network latency from the proxy server to the actual destination server
  5. The latency on the server to serve the XML document. If it has to traverse an in-memory object representation and generate the XML document, that might take some time. Also, if it uses a technique like XML serialization to generate the document, then depending on how the serializer is configured, the first call to serialize/deserialize can take a long time, because an intermediate assembly needs to be generated and compiled.
  6. Parsing the XML on the client side might take time, especially if the XML document structure is very complex.
  7. If XLinq (like the XmlSerializer) generates a temporary assembly for the XML parsing and querying, then the first request will take more time than the subsequent ones.

To figure out which part is taking the time, add some timing output to your code using System.Diagnostics.Stopwatch:

// Time to get the response from the server, including the time to resolve DNS, detect the proxy, etc.
System.Diagnostics.Stopwatch timer = new System.Diagnostics.Stopwatch();
timer.Start();
HttpWebResponse resp = (HttpWebResponse)request.GetResponse();
timer.Stop();
Console.WriteLine("XML download took: " + timer.ElapsedMilliseconds + " ms");

// Reset first: calling Start() again without Reset() would add to the previous measurement.
timer.Reset();
timer.Start();
// now, do your XLinq stuff here...
timer.Stop();
Console.WriteLine("XLinq took: " + timer.ElapsedMilliseconds + " ms");

You can put a loop around this and compare how long each component takes on the first request versus subsequent requests.
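One way to sketch that loop is a small helper that runs the same step several times and records each run separately. The names here (RequestTiming, TimeRuns) are invented for the example; the step you pass in would be your own download or XLinq code.

```csharp
using System;
using System.Diagnostics;

public static class RequestTiming
{
    // Hypothetical helper: runs the action `runs` times and returns
    // the elapsed milliseconds of each run, in order.
    public static long[] TimeRuns(Action action, int runs)
    {
        long[] elapsed = new long[runs];
        for (int i = 0; i < runs; i++)
        {
            Stopwatch timer = Stopwatch.StartNew(); // fresh stopwatch for every run
            action();
            timer.Stop();
            elapsed[i] = timer.ElapsedMilliseconds;
        }
        return elapsed;
    }
}
```

If the first entry in the returned array is far larger than the rest, the one-time cost is in whatever step you passed in. Timing the download and the XLinq query as two separate actions should show which of them carries the start-up delay.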

If you find that the difference is in the downloading, and not the querying, then you can investigate further by capturing the network traffic with Wireshark.

Hope this helps.

feroze