I am trying to run the simple web crawler code given on this page.
Everything works, and I have tried the program on several sites without problems. However, for one site, instead of returning the HTML content of its pages, the crawler gets a strange error:

DotNetNuke Error: - Version 04.05.01 Return to main page

and the HTML returned is:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en-US">
<head>
    <title id="Title">Error</title>
    <link id="StyleSheet" href="/Install/Install.css" type="text/css" rel="stylesheet"></link>
</head>
<body>
    <form name="Form" method="post" action="ErrorPage.aspx?tabid=186&amp;error=Object+reference+not+set+to+an+instance+of+an+object.&amp;content=0&amp;language=ar-SY" id="Form">
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/wEPDwUJNTkzNjY2ODU2D2QWBAIDDxYCHgRocmVmBRQvSW5zdGFsbC9JbnN0YWxsLmNzc2QCBQ9kFgICAg8PFgIeBFRleHQFRDxpbWcgc3JjPSIvaW1hZ2VzL2x0LmdpZiIgYm9yZGVyPSIwIiAvPiDYsdis2YjYuSDYp9mE2Ykg2KfZhNmF2YjZgti5ZGRk2aDp+vZbUIDHSd3beGBaLQrJ6yk=" />

        <table cellspacing="5" cellpadding="5" border="0" class="Error">
            <tr>
                <td><img id="Image1" src="logo.gif" alt="DotNetNuke" border="0" /></td>
            </tr>
            <tr style="height:100%;">
                <td valign="top" style="width:650px;">
                    <h2>DotNetNuke Error: - Version 04.05.01</h2>
                    <hr />
                    <p>
<table border="0" cellspacing="0" cellpadding="4">
    <tr>
        <td valign="top" align="left"><img id="ctl00_imgIcon" src="images/red-error.gif" border="0" /></td>
        <td valign="middle" align="left"><span id="ctl00_lblHeading" class="NormalRed">an error has been occurred<br/></span><span id="ctl00_lblMessage" class="Normal">return to the site.</span></td>
    </tr>
</table>
<hr noshade size="1"/></p>
                </td>
            </tr>
            <tr>
                <td align="right"><a id="hypReturn" class="WizardButton" href="Default.aspx"><img src="/images/lt.gif" border="0" /> return to the site</a></td>
            </tr>
            <tr><td height="10px"></td></tr>
        </table>
    </form>
</body>
</html>

So what is this DotNetNuke error, and what is the problem? By the way, the error occurred on an Arabic site; I tried other Arabic sites and there were no errors.

+1  A: 

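It seems that your crawler generated a request that caused DotNetNuke to crash. You are probably requesting a page that does not exist, or passing request parameters that DotNetNuke cannot handle. The error text in the form's action URL ("Object reference not set to an instance of an object") is the message of a .NET null reference exception, so the failure is happening on the server side.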

Just treat this result as a failed request.
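
One way to act on this is to check the response body for markers of the error page before parsing it. The following is only a minimal sketch: the class and method names are hypothetical, and the two marker strings are taken from the HTML quoted in the question, so other DotNetNuke versions may need different markers.

```csharp
using System;

static class CrawlResult
{
    // Hypothetical helper: returns true when the body looks like the
    // DotNetNuke error page quoted in the question.
    public static bool LooksLikeDnnErrorPage(string html)
    {
        if (string.IsNullOrEmpty(html))
        {
            return false;
        }

        // Both markers appear in the quoted error page: the heading text
        // and the form's action URL.
        return html.IndexOf("DotNetNuke Error", StringComparison.OrdinalIgnoreCase) >= 0
            || html.IndexOf("ErrorPage.aspx", StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

A crawler could call `CrawlResult.LooksLikeDnnErrorPage(html)` after each download and skip or retry the page when it returns true.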

Andreas Paulsson
Thanks a lot! I found a cached version of the site that contains the same result returned by the crawler, here: http://www.rankiva.com/cache/www.syriatel.sy/ Can I bypass this cached version and crawl the original site with HttpWebRequest? It seems like the request is going to the cached version!
fadi
Thank you very much, I've solved the problem.
fadi
+1  A: 

I solved the problem by setting the UserAgent property on the request:

hrqURL.UserAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)";
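
For context, HttpWebRequest does not send a User-Agent header unless one is set, and this site's DotNetNuke installation apparently fails on requests without one (the exact server-side cause is an assumption). A minimal sketch of the whole request follows; the class name, method names, and the `url` parameter are hypothetical, and only the `hrqURL.UserAgent` line comes from the actual fix.

```csharp
using System;
using System.IO;
using System.Net;

static class Fetcher
{
    // Builds the request with a browser-like User-Agent (the fix above).
    public static HttpWebRequest CreateRequest(string url)
    {
        var hrqURL = (HttpWebRequest)WebRequest.Create(url);
        hrqURL.UserAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)";
        return hrqURL;
    }

    // Sends the request and returns the response body as text.
    // GetResponse throws WebException on HTTP error statuses.
    public static string Fetch(string url)
    {
        HttpWebRequest request = CreateRequest(url);
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```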
fadi