I requested 100 pages that all return 404. This is the code I wrote:

    {
        var s = DateTime.Now;
        for (int i = 0; i < 100; i++)
            DL.CheckExist("http://google.com/lol" + i.ToString() + ".jpg");
        var e = DateTime.Now;
        var d = e - s;
        Console.WriteLine(d);
    }

static public bool CheckExist(string url)
{
    HttpWebRequest wreq = null;
    HttpWebResponse wresp = null;
    bool ret = false;

    try
    {
        wreq = (HttpWebRequest)WebRequest.Create(url);
        wreq.KeepAlive = true;
        wreq.Method = "HEAD";
        wresp = (HttpWebResponse)wreq.GetResponse();
        ret = true;
    }
    catch (System.Net.WebException)
    {
    }
    finally
    {
        if (wresp != null)
            wresp.Close();
    }
    return ret;
}

Two runs took 00:00:30.7968750 and 00:00:26.8750000. Then I tried Firefox with the following page:

<html>
<body>
<script type="text/javascript">
for(var i=0; i<100; i++)
    document.write("<img src=http://google.com/lol" + i + ".jpg><br>");
</script>

</body>
</html>

Using my computer's clock and counting, it took roughly 4 seconds. That is roughly 6.5 to 7.5 times faster than my app. I plan to scan thousands of files, so taking 3.75 hours instead of 30 minutes would be a big problem. How can I make this code faster? I know someone will say Firefox caches the images, but 1) it still needs to check the headers from the remote server to see whether the file has been updated (which is what I want my app to do), and 2) I am not receiving the body; my code should only be requesting the headers. So, how do I solve this?

+2  A: 

Probably Firefox issues multiple requests at once whereas your code does them one by one. Perhaps adding threads will speed up your program.
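A minimal sketch of the "multiple requests at once" idea, assuming .NET 4 or later for Parallel.ForEach. The ParallelCheck class, the CheckAll helper and the maxParallel parameter are hypothetical names introduced for illustration; the check delegate stands in for the question's DL.CheckExist.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

static class ParallelCheck
{
    // Hypothetical helper: runs `check` on every URL, with at most
    // `maxParallel` calls in flight at the same time.
    public static Dictionary<string, bool> CheckAll(
        IEnumerable<string> urls, Func<string, bool> check, int maxParallel)
    {
        // Thread-safe map, because several workers write results concurrently.
        var results = new ConcurrentDictionary<string, bool>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        Parallel.ForEach(urls, options, url =>
        {
            results[url] = check(url);
        });

        return results.ToDictionary(kv => kv.Key, kv => kv.Value);
    }
}
```

It could be called as ParallelCheck.CheckAll(urls, DL.CheckExist, 8). On .NET Framework, HttpWebRequest is also limited by ServicePointManager.DefaultConnectionLimit, which defaults to 2 connections per host in client applications, so that value probably needs raising before extra threads make a difference.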

Artelius
Good point. Do sites accept more than 3 threads? That would explain why a site may be 3-4 times faster, but not more than 6.5. Hmm. I'll keep this in mind and try again tonight.
acidzombie24
So I checked with another app, and one site I tested could handle 8 threads. That would explain it. I'll be a little embarrassed if that was the only reason.
acidzombie24
Maybe Firefox only makes 3 requests at once.
Artelius
+3  A: 

Change your code to call GetResponse asynchronously. The synchronous version is built on the Begin/End pair:

public override WebResponse GetResponse() {
    •••
    IAsyncResult asyncResult = BeginGetResponse(null, null);
    •••
    return EndGetResponse(asyncResult);
}

Async Get
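As a rough sketch of what asynchronous HEAD checks might look like, assuming .NET 4.5 or later (for async/await and GetResponseAsync). The AsyncCheck class and its two methods are hypothetical names, not part of the question's code. Any speed-up here comes from overlapping many requests, not from a single request finishing sooner.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Threading.Tasks;

static class AsyncCheck
{
    // Hypothetical async counterpart of the question's CheckExist.
    public static async Task<bool> CheckExistAsync(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "HEAD";
        try
        {
            // Disposing the response releases the underlying connection.
            using (WebResponse response = await request.GetResponseAsync())
                return true;
        }
        catch (WebException ex)
        {
            // A 404 surfaces as a WebException; close its response as well.
            if (ex.Response != null)
                ex.Response.Close();
            return false;
        }
    }

    // Starts every check without waiting for the previous one to finish,
    // then returns the results in the same order as the input URLs.
    public static Task<bool[]> CheckAllAsync(IEnumerable<string> urls)
    {
        return Task.WhenAll(urls.Select(url => CheckExistAsync(url)));
    }
}
```

A console program could block on the result with AsyncCheck.CheckAllAsync(urls).Result. The per-host connection limit still applies to the requests started this way.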

CodeToGlory
Yes, but you can do it in one line of code now too: WebClient.DownloadStringAsync http://msdn.microsoft.com/en-us/library/ms144202(VS.80).aspx
Chad Grant
A: 

Have you tried opening the same URL in IE on the machine that your code is deployed to? If it is a Windows Server machine, the slowness is sometimes because the URL you're requesting is not in IE's list of secure sites (HttpWebRequest works off IE's settings). You'll just need to add it.

Do you have more info you could post? I've been doing something similar and have run into tons of problems with HttpWebRequest before, all of them unique. So more info would help.

BTW, calling it using the async methods won't really help in this case. It doesn't shorten the download time; it just avoids blocking your calling thread.

fung
I tried with IE6 and it takes roughly 5 seconds. My code using wreq.Method = "HEAD"; takes 12.5 seconds. I'll assume it's because it's using 2 threads. That data looks close enough.
acidzombie24
I noticed you mentioned that you're requesting pages that 404. I haven't done that on purpose before, but WebRequest behaviour might be different in such cases. Something worth looking into. Does it take the normal 4 seconds if you request an existing page using a GET?
fung
+7  A: 

I noticed that an HttpWebRequest hangs on the first request. I did some research and what seems to be happening is that the request is configuring or auto-detecting proxies. If you set Proxy = null on the web request object, you might be able to avoid an initial delay.
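Applied to the question's CheckExist, the suggestion might look like the sketch below. The NoProxyCheck class and the CreateHeadRequest helper are hypothetical names used for illustration; the Proxy = null assignment is the only change the answer itself proposes, and it assumes no proxy is needed to reach the server.

```csharp
using System;
using System.Net;

static class NoProxyCheck
{
    // Hypothetical helper: builds a HEAD request with the proxy disabled.
    public static HttpWebRequest CreateHeadRequest(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "HEAD";
        request.KeepAlive = true;
        // Skip proxy configuration and auto-detection for this request.
        request.Proxy = null;
        return request;
    }

    public static bool CheckExist(string url)
    {
        HttpWebRequest request = CreateHeadRequest(url);
        try
        {
            using (var response = (HttpWebResponse)request.GetResponse())
                return true;
        }
        catch (WebException ex)
        {
            // A 404 arrives here; close the error response to free the connection.
            if (ex.Response != null)
                ex.Response.Close();
            return false;
        }
    }
}
```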

Max
When I put the line request.Proxy = null; in my code, I was able to get the result instantly! Thanks.
zidane
Great answer, it solved my problem, too.
Igor Brejc
Does anyone know why this works?
Fraga
A: 

The answer is changing HttpWebRequest/HttpWebResponse to WebRequest/WebResponse only. That fixed the problem.

Alterin
A: 

Close the response stream when you are done: in your CheckExist(), add wresp.Close() after wresp = (HttpWebResponse)wreq.GetResponse();

Sarvesh
That's already done in the finally block.
acidzombie24