Hi,

I've been through different tutorials and this website, but couldn't find a proper solution. On the other hand, I've seen apps logging into websites and requesting further information, so I'm sure there's a way to get this working, but maybe my approach is all wrong.

Here's what I'm trying to do: I want to log into a website that needs user authentication and then read and parse websites that are only accessible if the user is logged in. The problem: after POSTing the credentials to the website, I receive a cookie which doesn't seem to be preserved in my HttpClient, even though the docs suggest that exactly that should happen.

Here's some of my code:

DefaultHttpClient httpclient = new DefaultHttpClient();
HttpPost httpost = new HttpPost(LOGIN_URL);

List<NameValuePair> nvps = new ArrayList<NameValuePair>();
nvps.add(new BasicNameValuePair(USER_FIELD, login));
nvps.add(new BasicNameValuePair(PASS_FIELD, pw));
nvps.add(new BasicNameValuePair(REMEMBERME, "on"));

httpost.setEntity(new UrlEncodedFormEntity(nvps, HTTP.UTF_8));

HttpResponse response = httpclient.execute(httpost);
HttpEntity entity = response.getEntity();

if (entity != null) {
  entity.consumeContent();
}

List<Cookie> cookies = httpclient.getCookieStore().getCookies();

When I output the contents of "cookies", everything seems fine (I receive a session):

- [version: 0][name: ASP.NET_SessionId][value: xxx][domain: xxx][path: /][expiry: null]

As I understand it, the cookie/session will be preserved and reused by my HttpClient as long as I don't close it.

I then read the next page (which is restricted) using this code:

HttpGet httpget2 = new HttpGet(RESTRICTED_URL);
response = httpclient.execute(httpget2);
entity = response.getEntity();
// check for null before touching the entity, not after
if (entity != null) {
    InputStream data = entity.getContent();
    // data will be parsed here
    entity.consumeContent();
}
// connection will be closed afterwards

If I output the status of the GET request (using response.getStatusLine()), I get "200 OK", but parsing the returned page shows that the login is lost (I only get a login form).

Any help is appreciated.

+1  A: 

Assuming your httpclient object is the same in both cases, and assuming the RESTRICTED_URL is in the same domain as the LOGIN_URL, then I would think what you have should work.

You might wish to use Wireshark or a proxy or something to examine the HTTP requests you are making, to see if the cookie is actually being attached to the request. It may be that the cookie is being attached, in which case there is something else wrong that is causing your second request to fail.
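
As a quick offline check of the same-domain assumption above, here is a minimal sketch. It uses the JDK's java.net.CookieManager instead of Apache HttpClient, so the matching rules may differ in details from what your client does. The class name CookieCheck, the method cookiesFor, the example.com URLs and the cookie value are all made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URI;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the question's code.
final class CookieCheck {

    // Stores one Set-Cookie header as if it had been returned by loginUrl, then
    // returns the Cookie header values the JDK cookie manager would send to nextUrl.
    static List<String> cookiesFor(String loginUrl, String setCookie, String nextUrl) {
        CookieManager manager = new CookieManager(null, CookiePolicy.ACCEPT_ALL);
        Map<String, List<String>> responseHeaders = new HashMap<>();
        responseHeaders.put("Set-Cookie", Collections.singletonList(setCookie));
        try {
            manager.put(URI.create(loginUrl), responseHeaders);
            Map<String, List<String>> requestHeaders =
                    manager.get(URI.create(nextUrl), new HashMap<String, List<String>>());
            List<String> cookies = requestHeaders.get("Cookie");
            return cookies == null ? Collections.<String>emptyList() : cookies;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed cookie, shaped like the session cookie shown in the question.
        String setCookie = "ASP.NET_SessionId=abc123; path=/";
        // Same host as the login URL.
        System.out.println(cookiesFor("http://example.com/login", setCookie,
                "http://example.com/restricted"));
        // Different host.
        System.out.println(cookiesFor("http://example.com/login", setCookie,
                "http://other.example.org/restricted"));
    }
}
```

If the first call returns the session cookie and the second returns nothing, the domain is probably not the problem and the actual traffic is the next thing to look at.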

CommonsWare
`httpclient` is the same for all requests and the URLs are both on the same domain (both without SSL). I'll try Wireshark to find out what is being sent, thanks for the hint.
Select0r
I've tried it: the cookie is attached to the second (GET) request, and I receive a "302 Found" response that redirects to the login screen.
Select0r
@Select0r: sounds like something else is then wrong with that second request (e.g., server is expecting a `Referer:` header).
CommonsWare
Sounds reasonable, thanks. I'll use Wireshark to analyze the traffic when I log in to the website using a browser and get back here as soon as I find the differences.
Select0r
That didn't help, unfortunately, so I'll have to investigate further. It's possible that the domain I'm trying to log in to uses a more complicated mechanism, so I'll try my code against a test script on another server first.
Select0r
