How do you login to a webpage and retrieve its content in C#?

+3  A: 

Look at System.Net.WebClient, or for more advanced requirements System.Net.HttpWebRequest/System.Net.HttpWebResponse.

As for actually applying these: you'll have to study the HTML source of each page you want to scrape in order to learn exactly what HTTP requests it expects.

Joel Coehoorn
+4  A: 

That depends on what's required to log in. You could use a WebClient to send the login credentials to the server's login page (via whatever method is required, GET or POST), but by default that wouldn't persist the session cookie. There is a way to get a WebClient to handle cookies, so you could POST the login info to the server, request the page you want with the same WebClient, and then do whatever you want with the page.
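One common way to get a WebClient to handle cookies is to subclass it and attach a CookieContainer to every request it creates. The following is a minimal sketch of that idea, not a definitive implementation: the class name `CookieAwareWebClient`, the URLs, and the form field names `username` and `password` are all hypothetical and would have to match the actual login form.

```csharp
using System;
using System.Collections.Specialized;
using System.Net;
using System.Text;

// WebClient subclass that reuses one CookieContainer for all requests,
// so a session cookie set by the login response is sent on later requests.
public class CookieAwareWebClient : WebClient
{
    private readonly CookieContainer cookies = new CookieContainer();

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        HttpWebRequest http = request as HttpWebRequest;
        if (http != null)
        {
            http.CookieContainer = cookies;
        }
        return request;
    }
}

public static class Program
{
    public static void Main()
    {
        using (CookieAwareWebClient client = new CookieAwareWebClient())
        {
            // Hypothetical field names; read the real ones from the login form's HTML.
            NameValueCollection form = new NameValueCollection();
            form["username"] = "me";
            form["password"] = "secret";

            // POST the login form (hypothetical URL), then fetch the protected page.
            client.UploadValues("http://example.com/login", "POST", form);
            string html = client.DownloadString("http://example.com/members");
            Console.WriteLine(html.Length);
        }
    }
}
```

Whether this is enough depends on the site: some login forms also expect hidden fields or tokens taken from the login page itself.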

Alex Fort
+1  A: 

Use the WebClient class.

Dim Html As String

Using Client As New System.Net.WebClient()
    Html = Client.DownloadString("http://www.google.com")
End Using
Josh Stodola
I did not know about DownloadString - awesome - thanks!
Slee
Why was this downvoted?
Joel Coehoorn
He asked for C# code, probably (it wasn't me that downvoted it)
JohnFx
The C# is almost identical: just add parentheses, braces, and a semicolon, and change the case of a few keywords.
Joel Coehoorn
Come on, people. If you can read/write C#, you can read/write VB. Open your mind!
Josh Stodola
I downvoted because he mentioned that he needs to log in before he downloads the page. The stock WebClient wouldn't support cookies for a login session, so the solution didn't exactly fit the problem.
Alex Fort
It could work with cookieless sessions, I guess.
Josh Stodola
The C# answer (mine) got downvoted too. I suppose it was the authentication thing.
JohnFx
+2  A: 

What do you mean by "login"?

If the subfolder is protected at the OS level, and the browser pops up a login dialog when you go there, you will need to set the Credentials property on the HttpWebRequest.

If the website has its own cookie-based membership/login system, you will have to use HttpWebRequest to first respond to the login form.
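The two cases above can be sketched roughly as follows. This is an illustration under stated assumptions rather than a finished solution: the class and method names are hypothetical, and `formData` stands for whatever URL-encoded fields the site's login form actually posts.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

public static class LoginSketch
{
    // Case 1: the server issues an HTTP authentication challenge
    // (the browser would show a login dialog).
    public static string GetWithCredentials(string url, string user, string password)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Credentials = new NetworkCredential(user, password);
        return ReadBody(request);
    }

    // Case 2: the site has its own login form and tracks the session with a cookie.
    public static string GetAfterFormLogin(string loginUrl, string formData, string pageUrl)
    {
        // Shared by both requests so the session cookie carries over.
        CookieContainer cookies = new CookieContainer();

        HttpWebRequest login = (HttpWebRequest)WebRequest.Create(loginUrl);
        login.Method = "POST";
        login.ContentType = "application/x-www-form-urlencoded";
        login.CookieContainer = cookies;

        byte[] body = Encoding.UTF8.GetBytes(formData);
        login.ContentLength = body.Length;
        using (Stream stream = login.GetRequestStream())
        {
            stream.Write(body, 0, body.Length);
        }
        ReadBody(login); // complete the login request so its cookies are stored

        HttpWebRequest page = (HttpWebRequest)WebRequest.Create(pageUrl);
        page.CookieContainer = cookies;
        return ReadBody(page);
    }

    private static string ReadBody(HttpWebRequest request)
    {
        using (WebResponse response = request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

A call might look like `LoginSketch.GetAfterFormLogin(loginUrl, "user=me&pass=secret", pageUrl)`, with the field names replaced by the ones the real form uses.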

James Curran
A: 

Try this:

public string GetContent(string url)
{
    using (System.Net.WebClient client = new System.Net.WebClient())
    {
        return client.DownloadString(url);
    }
}
JohnFx
A: 

You can use the built-in WebClient object instead of creating the request yourself.

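// Requires: using System; using System.IO; using System.Net;
public string GetContent(string url)   // e.g. "http://foo.com"
{
    using (WebClient wc = new WebClient())
    {
        // Credentials are used for HTTP-level authentication, not for HTML login forms.
        wc.Credentials = new NetworkCredential("username", "password");
        try
        {
            using (Stream stream = wc.OpenRead(new Uri(url)))
            using (StreamReader reader = new StreamReader(stream))
            {
                return reader.ReadToEnd();
            }
        }
        catch (WebException)
        {
            // Error handling goes here; rethrown so the failure is not silently lost.
            throw;
        }
    }
}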
Matthew M. Osborn
+1  A: 
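// Requires: using System.IO; using System.Net; using System.Text;
// URL is assumed to be defined elsewhere as the address the login form posts to.
string postData = "userid=ducon";
postData += "&username=camarche";
byte[] data = Encoding.ASCII.GetBytes(postData);

WebRequest req = WebRequest.Create(URL);
req.Method = "POST";
req.ContentType = "application/x-www-form-urlencoded";
req.ContentLength = data.Length;

// Write the form fields into the request body.
using (Stream newStream = req.GetRequestStream())
{
    newStream.Write(data, 0, data.Length);
}

// Read the response, decoding it as ISO-8859-1.
string coco;
using (WebResponse response = req.GetResponse())
using (StreamReader reader = new StreamReader(response.GetResponseStream(), Encoding.GetEncoding("iso-8859-1")))
{
    coco = reader.ReadToEnd();
}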