I need to retrieve a document from a website and parse it. The problem is that:

  1. The site uses both HTTP and HTTPS
  2. You need to log in to the site (I have a regular account)
  3. From the login page, there are at least 2 redirects before you are actually logged in

I managed to open an HTTPS connection and post my login and password, but I'm having trouble with cookie management and the redirects.

A: 

Using a library like HtmlUnit would probably help: it is a headless browser for Java that keeps cookies and follows redirects for you.

Damien
+1  A: 

Apache commons-httpclient would help; it keeps session cookies across requests.
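
To show the general idea (one client object that remembers cookies and follows the login redirects), here is a rough sketch. Note that it uses the JDK's built-in `java.net.http.HttpClient` (Java 11+) rather than commons-httpclient, only so that it runs without extra dependencies. The class name `LoginFetch`, the form field names `username` and `password`, and the URLs you pass in are all assumptions; the real login form may use different field names or need extra hidden fields.

```java
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any library.
class LoginFetch {

    // Builds a client that stores cookies in the given manager and follows redirects.
    // Redirect.ALWAYS also follows HTTPS -> HTTP redirects, which a site mixing both
    // protocols may need; Redirect.NORMAL is the safer choice if it turns out to be enough.
    static HttpClient newSessionClient(CookieManager cookies) {
        return HttpClient.newBuilder()
                .cookieHandler(cookies)
                .followRedirects(HttpClient.Redirect.ALWAYS)
                .build();
    }

    // URL-encodes the credentials as a form body.
    // "username" and "password" are assumed field names; check the real login form.
    static String formBody(String user, String pass) {
        return "username=" + URLEncoder.encode(user, StandardCharsets.UTF_8)
                + "&password=" + URLEncoder.encode(pass, StandardCharsets.UTF_8);
    }

    // Logs in, then fetches the document with the same client so the
    // session cookies collected during login are sent along.
    static String fetchAfterLogin(String loginUrl, String docUrl,
                                  String user, String pass) throws Exception {
        CookieManager cookies = new CookieManager(null, CookiePolicy.ACCEPT_ALL);
        HttpClient client = newSessionClient(cookies);

        HttpRequest login = HttpRequest.newBuilder(URI.create(loginUrl))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formBody(user, pass)))
                .build();
        // The response body of the login request is not needed here.
        client.send(login, HttpResponse.BodyHandlers.discarding());

        HttpRequest doc = HttpRequest.newBuilder(URI.create(docUrl)).GET().build();
        return client.send(doc, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```

You would call `LoginFetch.fetchAfterLogin(...)` with your own login URL, document URL, and credentials, then hand the returned string to whatever parser you use. The important part is reusing one client for both requests; opening a fresh connection for each step is usually what loses the cookies.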

bmargulies