Hi, I am working on screen scraping and have done it successfully for three websites, but I have an issue with the last one.

Here is my URL. When I submit it with my parameters, the site posts to another page and shows the result correctly on that page.

Here is my test

However, when I request it from my application, I have no option to post, so it only fetches the HTML of the requested page. That is the test link mentioned above, which carries the parameters in the URL, not the page with the result.

How can I handle this situation? Please give me a hint.

Thanks

Here is my C# code; I am using HtmlAgilityPack:

String url;
HtmlWeb hw = new HtmlWeb();
HtmlDocument doc;
url = "http://mysampleURL";
doc = hw.Load(url);
+2  A: 

If the resource requires a POST, then you MUST submit a POST.

This is a fairly simple task. Here is an example from Rick Strahl's blog. The code is a bit rustic, but it works and will get you heading in the right direction:

string lcUrl = "http://www.west-wind.com/testpage.wwd";
HttpWebRequest loHttp =
   (HttpWebRequest) WebRequest.Create(lcUrl);

// *** Send any POST data
string lcPostData =
   "Name=" + HttpUtility.UrlEncode("Rick Strahl") +
   "&Company=" + HttpUtility.UrlEncode("West Wind ");

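loHttp.Method = "POST";
// Form posts need this content type, or many servers will not parse the fields
loHttp.ContentType = "application/x-www-form-urlencoded";
byte[] lbPostBuffer =
   System.Text.Encoding.GetEncoding(1252).GetBytes(lcPostData);
loHttp.ContentLength = lbPostBuffer.Length;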

Stream loPostData = loHttp.GetRequestStream();
loPostData.Write(lbPostBuffer,0,lbPostBuffer.Length);
loPostData.Close();

HttpWebResponse loWebResponse = (HttpWebResponse) loHttp.GetResponse();

Encoding enc = System.Text.Encoding.GetEncoding(1252);

StreamReader loResponseStream =
   new StreamReader(loWebResponse.GetResponseStream(),enc);

string lcHtml = loResponseStream.ReadToEnd();

loWebResponse.Close();
loResponseStream.Close();
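
As a side note, concatenating the POST fields by hand gets error-prone as the form grows. Below is a minimal sketch of a helper that builds the same kind of body from key/value pairs, assuming the target form expects standard application/x-www-form-urlencoded data. The `FormBody.Build` name is hypothetical and not part of the original example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;

public static class FormBody
{
    // Hypothetical helper: URL-encodes each name and value and joins
    // the pairs with '&', the way a browser encodes a form post.
    public static string Build(IEnumerable<KeyValuePair<string, string>> fields)
    {
        return string.Join("&", fields.Select(f =>
            WebUtility.UrlEncode(f.Key) + "=" + WebUtility.UrlEncode(f.Value)));
    }
}
```

The returned string could then be passed to `GetBytes` in place of `lcPostData`.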
Sky Sanders
I tried Ombergen's solution and it works fine. Your solution seems good as well, though I have not tried it. Anyway, I am upvoting. Thanks for your time.
Muhammad Akhtar
+3  A: 

Use the WebClient class to post the form from the first page with the expected input values. The input values can be found in the source of the first page, but it is also possible to capture them using Fiddler, which is IMHO a great tool for these scenarios.

Example:

NameValueCollection values = new NameValueCollection();
values.Add("action","hotelPackageWizard@searchHotelOnly");
values.Add("packageType","HOTEL_ONLY");
// etc..
WebClient webclient = new WebClient();
webclient.Headers.Add("Content-Type","application/x-www-form-urlencoded");
byte[] responseArray = webclient.UploadValues("http://www.expedia.com/Hotels?rfrr=-905&","POST", values);
string response = System.Text.Encoding.ASCII.GetString(responseArray);
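
Since the question uses HtmlAgilityPack, the returned HTML string can be handed to it directly instead of calling `HtmlWeb.Load` on the URL. This is a minimal sketch, assuming the HtmlAgilityPack package is referenced. The `ResultParser` class and the `//a[@href]` XPath are illustrative placeholders, not taken from the real result page.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

public static class ResultParser
{
    // Parses HTML that was already downloaded (for example via
    // WebClient.UploadValues) and returns the href of every link.
    // The XPath is a placeholder; replace it with one that matches
    // the actual result markup.
    public static List<string> ExtractLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parse from a string, no second request is made

        var links = new List<string>();
        // SelectNodes returns null when nothing matches
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//a[@href]");
        if (nodes != null)
        {
            foreach (HtmlNode node in nodes)
            {
                links.Add(node.GetAttributeValue("href", ""));
            }
        }
        return links;
    }
}
```

Calling `ResultParser.ExtractLinks(response)` with the `response` string from the POST would then give the links found on the result page.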
Ombergen
That's great, a very smart solution. Thanks a lot, you solved my problem. Thanks again.
Muhammad Akhtar