I need to download a bunch of HTML pages programmatically, but they are behind a login. So what I need, I think, is to do the following:
1. Use an HTTP POST to submit the login form data, including the username/password.
2. Capture the session somehow. Cookies?
3. Send a series of HTTP GETs to download the pages I need.
#3 is easy; I do it all the time. I don't have a clue how to do #1 and #2.
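To make the question concrete, here is my rough guess at the shape of those three steps in Python, using only the standard library. It assumes the site uses an ordinary form login with a session cookie. The function names are my own, and the URLs and form field names (`username`, `password`) in the usage note are placeholders; the real ones would have to come from the site's login form. I have not run this against the actual site.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session():
    """Return (opener, jar). The jar stores whatever cookies the server sets."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def encode_form(fields):
    """URL-encode a dict of form fields into a POST body."""
    return urllib.parse.urlencode(fields).encode("utf-8")


def login(opener, login_url, fields):
    """Step 1: POST the form. Passing data= makes urllib send a POST."""
    with opener.open(login_url, data=encode_form(fields)) as resp:
        # Step 2 happens implicitly: the cookie processor saves Set-Cookie headers.
        return resp.status


def download_pages(login_url, fields, page_urls):
    """Step 3: GET each page through the same opener so the cookies are sent back."""
    opener, _jar = make_session()
    login(opener, login_url, fields)
    pages = {}
    for url in page_urls:
        with opener.open(url) as resp:
            pages[url] = resp.read()
    return pages
```

Usage would presumably be something like `download_pages("https://example.com/login", {"username": "me", "password": "secret"}, ["https://example.com/page1.html"])`, with the real login URL and field names substituted. If the login form includes hidden fields (a CSRF token, for example), I assume those would need to be fetched and sent along too.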
P.S. I will also gladly accept "Hey dummy, just use program blah to crawl the site."