Using ASP.NET, what methods can I use to do the following:

  1. Open a connection to a given URL and read its HTML content
  2. Parse that HTML for hyperlinks, and place them in an array
  3. Loop through each hyperlink (only 1 level down), opening each one, saving its HTML contents in a table, and moving on to the next hyperlink until done.

If ASP.NET is not up to the task, other languages or free scripts/toolkits would be acceptable.

Thanks.

+1  A: 

I left out the obvious things, such as "loop through the DataTable", etc. A more in-depth answer is probably not something that will be coming from this site. The question is a bit too big to answer completely here.
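
As a rough illustration of the three steps in the question, here is a minimal sketch in Python (the question allows languages other than ASP.NET), using only the standard library. The names `LinkCollector`, `fetch_html`, `extract_links` and `crawl_one_level` are made up for this example, and the list of `(url, html)` rows is only a stand-in for whatever table you end up saving to.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def fetch_html(url):
    """Step 1: open a connection to the URL and return its body as text."""
    with urlopen(url, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_links(base_url, html):
    """Step 2: return the absolute http(s) links in html, in order, without duplicates."""
    collector = LinkCollector()
    collector.feed(html)
    links = []
    for href in collector.hrefs:
        # Resolve relative links against the page URL and drop any #fragment.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if absolute.startswith(("http://", "https://")) and absolute not in links:
            links.append(absolute)
    return links


def crawl_one_level(start_url, fetch=fetch_html):
    """Step 3: fetch every link found on start_url and return (url, html) rows."""
    rows = []
    for link in extract_links(start_url, fetch(start_url)):
        try:
            rows.append((link, fetch(link)))
        except OSError as error:  # urllib's URLError is a subclass of OSError
            rows.append((link, f"ERROR: {error}"))
    return rows
```

Calling `crawl_one_level("http://example.com/")` would return one row per link on that page. The `fetch` parameter is there so the download step can be swapped out, for example to add throttling or to try the loop without touching the network.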

David Stratton
Sounds good - just a couple of follow-up questions: 1. Does System.Net.WebClient work like a file stream? Any links for a good tutorial, perhaps? 2. Am I right in assuming that the simulated "click" on a hyperlink in the loop basically means pointing a WebClient variable at the URL for that array item? I'm new at this, so bear with me - thanks.
Yaaqov
1. Yes, I believe the method is DownloadFile(), but off the top of my head I'm not sure. Check the documentation in the "Methods" section. 2. Yes.
David Stratton