views: 43

answers: 2

I have an application that creates many web requests to download the news pages of a web site (I've tested it with many web sites). After a while, the application slows down while fetching the HTML source, and then HttpWebResponse fails to get the response. I'm posting only the function that does this job.

    public PageFetchResult Fetch()
    {
        PageFetchResult fetchResult = new PageFetchResult();
        try
        {
            HttpWebRequest req = (HttpWebRequest)HttpWebRequest.Create(URLAddress);
            HttpWebResponse resp = (HttpWebResponse)req.GetResponse();
            Uri requestedURI = new Uri(URLAddress);
            Uri responseURI = resp.ResponseUri;
            if (Uri.Equals(requestedURI, responseURI))
            {
                string resultHTML = "";
                byte[] reqHTML = ResponseAsBytes(resp);
                if (!string.IsNullOrEmpty(FetchingEncoding))
                    resultHTML = Encoding.GetEncoding(FetchingEncoding).GetString(reqHTML);
                else if (!string.IsNullOrEmpty(resp.CharacterSet))
                    resultHTML = Encoding.GetEncoding(resp.CharacterSet).GetString(reqHTML);

                resp.Close();
                fetchResult.IsOK = true;
                fetchResult.ResultHTML = resultHTML;
            }
            else
            {
                URLAddress = responseURI.AbsoluteUri;
                relayPageCount++;
                if (relayPageCount > 5)
                {
                    fetchResult.IsOK = false;
                    fetchResult.ErrorMessage = "Maximum page redirection occured.";
                    return fetchResult;
                }
                return Fetch();
            }
        }
        catch (Exception ex)
        {
            fetchResult.IsOK = false;
            fetchResult.ErrorMessage = ex.Message;
        }
        return fetchResult;
    }

Any solution would be greatly appreciated.

A: 

The Fetch function is called recursively and always creates an HttpWebRequest, but the response is only released when the URL matches. You have to close the request and response in the else branch as well.
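
For example, a minimal sketch of that change (reusing URLAddress, relayPageCount and the other fields from the question, untested) wraps the response in a using block so it is released on every path, and only recurses after the previous response has been disposed:

    HttpWebRequest req = (HttpWebRequest)WebRequest.Create(URLAddress);
    bool redirected;
    using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
    {
        redirected = !Uri.Equals(new Uri(URLAddress), resp.ResponseUri);
        if (!redirected)
        {
            // ... decode the body into resultHTML exactly as before ...
        }
        else
        {
            URLAddress = resp.ResponseUri.AbsoluteUri;
        }
    }   // resp is disposed here on every path, including exceptions

    if (redirected)
    {
        relayPageCount++;
        // ... redirect-limit check as in the question ...
        return Fetch();   // recurse only after the old response is released
    }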

volody
A: 

I agree with @volody. Also, HttpWebRequest already has a property called MaximumAutomaticRedirections, which defaults to 50; you can set it to 5 to achieve automatically what this code does by hand. When the limit is exceeded it raises an exception, which will be handled by your catch block anyway.

Just set

    request.MaximumAutomaticRedirections = 5;
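
In the question's Fetch method this would go right before GetResponse; a rough sketch (property names are the framework's, the surrounding variables are assumed from the question):

    HttpWebRequest req = (HttpWebRequest)WebRequest.Create(URLAddress);
    req.AllowAutoRedirect = true;            // the default; let the framework follow redirects
    req.MaximumAutomaticRedirections = 5;    // a WebException is raised past 5 hops
    HttpWebResponse resp = (HttpWebResponse)req.GetResponse();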
Akash Kava
Thanks for your comments (volody, Akash Kava). I've fixed my code, but that isn't my problem (the redirection is not the point, since it never actually happens; it's just preventive code). The application fetches around 500 web pages and then suddenly hangs, and I don't know exactly where that comes from. Any other solution?
Ehsan
That could be an issue with threading and how you manage your threads. You should try .NET 4.0's parallel extensions; your code may have a deadlock, and the parallel extensions can express the same logic with better threading. It could also be your firewall blocking the requests, thinking they are some sort of attack. Also, by default only 2 simultaneous HTTP connections are allowed per domain, so I think you should group your requests by domain and queue them per domain; that might help.
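
A rough sketch of both ideas, raising the per-host connection limit and bounding the parallelism (the URL list and the limits here are made-up values for illustration, not part of the question's code):

    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Net;
    using System.Threading.Tasks;

    class CrawlSketch
    {
        static void Main()
        {
            // Raise the per-host connection limit (the default is 2).
            ServicePointManager.DefaultConnectionLimit = 10;

            var urls = new List<string>();   // fill with the pages to fetch

            // Bound the parallelism instead of spawning one thread per page.
            Parallel.ForEach(urls,
                new ParallelOptions { MaxDegreeOfParallelism = 4 },
                url =>
                {
                    var req = (HttpWebRequest)WebRequest.Create(url);
                    using (var resp = (HttpWebResponse)req.GetResponse())
                    using (var reader = new StreamReader(resp.GetResponseStream()))
                    {
                        string html = reader.ReadToEnd();
                        // ... hand html to the parser ...
                    }
                });
        }
    }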
Akash Kava