I'm trying to read information from a webpage, and this is the first time I've done it. The problem is that it is far too slow: the page loads very quickly in a browser but takes forever here. All I need is the HTML behind the page, so I have to ask: is my code somehow downloading the images as well? Any help speeding this up would be great.

        string url;

        HttpWebRequest pedido = (HttpWebRequest)WebRequest.Create(url);

        HttpWebResponse resposta = (HttpWebResponse)pedido.GetResponse();
        //On the line above it takes forever to load.

        StreamReader SR = new StreamReader(resposta.GetResponseStream());


        string html;
        string tituloTemp = "";

        do
        {
            html = SR.ReadLine();
            if (html.Contains("<title>"))
                tituloTemp = html;

        } while (!(html.Contains("<title>")));
        SR.Close();
A: 

Check the transaction with Fiddler.

It could be a DNS lookup that's timing out, or an authentication challenge. With Fiddler you'll be able to see the timing breakdown for both the browser and the application transactions. All will become clear.

Cheeso
A: 

Your problem is most likely the ReadLine(), which only returns when it hits a newline. If the page doesn't have a newline then you're probably seeing a timeout. You are also scanning for "<title>" twice, so you should rethink your parsing approach.

I recommend you read the entire response into memory and then parse that for your tags. These links speak to C# parsers that might give you a more robust solution:

http://stackoverflow.com/questions/100358/looking-for-c-html-parser

http://stackoverflow.com/questions/56107/what-is-the-best-way-to-parse-html-in-c
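
As a rough sketch of the "read everything, then parse" idea, something like the following could work. The class and method names (`TitleFetcher`, `DownloadHtml`, `ExtractTitle`) are made up for illustration, and the regular expression is only a stand-in for a real HTML parser such as those discussed in the links above; it assumes a simple, well-formed title element.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text.RegularExpressions;

static class TitleFetcher
{
    // Hypothetical helper: downloads the whole response body as one string.
    // The using blocks close the response and reader even if an exception is thrown.
    public static string DownloadHtml(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }

    // Hypothetical helper: returns the text of the first <title> element,
    // or null if none is found. Singleline lets the title span several lines.
    public static string ExtractTitle(string html)
    {
        Match m = Regex.Match(html, @"<title[^>]*>(.*?)</title>",
                              RegexOptions.IgnoreCase | RegexOptions.Singleline);
        return m.Success ? m.Groups[1].Value.Trim() : null;
    }
}
```

Calling `TitleFetcher.ExtractTitle(TitleFetcher.DownloadHtml(url))` would then replace the ReadLine() loop, and the page is searched for the title only once.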

ebpower
A: 

I actually found the answer here on Stack Overflow. Here is the link: http://stackoverflow.com/questions/901323/c-httpwebresponse-streamreader-very-slow/1950692#1950692

elvispt