Hi

I am trying to get data from a web page using C#.

So far this is my code:

    WebBrowser wb = new WebBrowser();

    wb.Url = new Uri("http://www.microsoft.com");
    HtmlDocument doc = wb.Document;

    MessageBox.Show(doc.ToString());

Unfortunately wb remains null and the Url property never gets set.

Can anyone help me please?

Thanks

A: 

I think you're forgetting to initiate the navigation to the page. See here

But also...

Navigation is an asynchronous process. You need to declare the WebBrowser object at class scope and then handle its Navigated event like this:

    private void webBrowser1_Navigated(object sender, WebBrowserNavigatedEventArgs e)
    {
        // Navigated fires after the control has started loading the new page,
        // so the Document property can be read here.
        HtmlDocument doc = webBrowser1.Document;
    }

You'll see that doc will be non null in the handler.

Paul Sasik
Nope :/ I added that before the wb.Url part and still no progress :/
Lily
I added some more info. I used it in my local test project. It'll solve your issue.
Paul Sasik
A: 

Try the DocumentText property of the WebBrowser control, rather than calling ToString() on the HtmlDocument. ToString() only returns the type name, not the page content.

Ray
+5  A: 

I would use the WebClient class instead of the web browser. The WebBrowser class is more for interaction with a UI, whilst WebClient is more geared towards programmatic interaction with a page. Here is some example code (it uses the lower-level HttpWebRequest class rather than WebClient itself):

private void sendMessage(JaxtrSmsMessage message)
{
    HttpWebRequest request;
    HttpWebResponse response;
    CookieContainer cookies;
    string url = "http://www.jaxtr.com/user/login.jsp";

    try
    {
        request = (HttpWebRequest)WebRequest.Create(url);
        request.AllowAutoRedirect = true;
        request.CookieContainer = new CookieContainer();
        response = (HttpWebResponse)request.GetResponse();
        if (response.StatusCode == HttpStatusCode.OK)
        {
            StringBuilder sb = new StringBuilder();
            StreamReader reader = new StreamReader(response.GetResponseStream());
            while (!reader.EndOfStream)
            {
                sb.AppendLine(reader.ReadLine());
            }

            //Get the hidden value out of the form.                
            String fp = Regex.Match(sb.ToString(), "\"__fp\"\\svalue=\"(([A-Za-z0-9+/=]){4}){1,19}\"", RegexOptions.None).Value;
            fp = fp.Substring(14);
            fp = fp.Replace("\"", String.Empty);


            cookies = request.CookieContainer;
            //response.Close();
            String requestString = "http://www.jaxtr.com/user/Login.action?tzOffset=6&navigateURL=&refPage=&jaxtrId=" + HttpUtility.UrlEncode(credentials.Username) + "&password=" + HttpUtility.UrlEncode(credentials.Password) + "&Login=Login&_sourcePage=%2Flogin.jsp&__fp="+HttpUtility.UrlEncode(fp);
            request = (HttpWebRequest)WebRequest.Create(requestString);
            request.CookieContainer = cookies; //added by myself

            response = (HttpWebResponse)request.GetResponse();
            Console.WriteLine("Response from login:" + response.StatusCode);

            // Truncate to the first MAX_MESSAGE_LENGTH characters when requested.
            String messageText = (message.TruncateMessage && message.MessageText.Length > JaxtrSmsMessage.MAX_MESSAGE_LENGTH ? message.MessageText.Substring(0, JaxtrSmsMessage.MAX_MESSAGE_LENGTH) : message.MessageText);

            String messageURL = "http://www.jaxtr.com/user/sendsms?CountryName=" + HttpUtility.UrlEncode(message.CountryName) + "&phone=" + HttpUtility.UrlEncode(message.DestinationPhoneNumber) + "&message=" + HttpUtility.UrlEncode(messageText) + "&bySMS=" + HttpUtility.UrlEncode(message.BySMS.ToString().ToLower());

            request = (HttpWebRequest)WebRequest.Create(messageURL);
            request.CookieContainer = cookies;
            response = (HttpWebResponse)request.GetResponse();

            Console.WriteLine("Response from send SMS command=" + response.StatusCode);

            StringBuilder output = new StringBuilder();

            using (Stream s = response.GetResponseStream())
            {
                StreamReader sr = new StreamReader(s);
                while (!sr.EndOfStream)
                {
                    output.AppendLine(sr.ReadLine());
                }
            }
            response.Close();
        }
        else
        {
            Console.WriteLine("Client was unable to connect!");
        }
    }
    catch (System.Exception e)
    {
        throw new SMSDeliveryException("Unable to deliver SMS message because "+e.Message, e);
    }
}
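
For a plain page fetch like the one in the question, WebClient itself may be enough. Below is a minimal sketch; the class name PageFetcher and the method name FetchHtml are hypothetical and not part of the code above. It only downloads the raw response body as a string, with no cookie or login handling.

```csharp
using System;
using System.Net;

public static class PageFetcher
{
    // Hypothetical helper: downloads the resource at the given address
    // and returns its content as a string.
    public static string FetchHtml(string address)
    {
        // WebClient implements IDisposable, so release it when done.
        using (var client = new WebClient())
        {
            // DownloadString blocks until the whole response has been read.
            return client.DownloadString(address);
        }
    }
}
```

A call such as `string html = PageFetcher.FetchHtml("http://www.microsoft.com");` would then give you the page source to parse. Note that this returns the HTML as sent by the server; anything the page builds later with script will not be in it.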
Wayne Hartman
+1  A: 

First, you have to call the Navigate method with the URL, and you have to handle the DocumentCompleted event of the WebBrowser control:

    webBrowser.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(webBrowser_DocumentCompleted);
    webBrowser.Navigate("http://www.microsoft.com");

Then use the DocumentCompleted handler to get the loaded web page document:

    void webBrowser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // DocumentText is already a string holding the page's HTML.
        MessageBox.Show(webBrowser.DocumentText);
    }

Hope this helps...

Ramanand Bhat