I want to do a manual GET with cookies in order to download and parse a web page. I need to extract the security token so that I can post to the forum. I have completed the login, read the response, and extracted the cookies (3 name/value pairs). I then built the String containing the cookies like this:

 CookieString="name1=value1; name2=value2; name3=value3"

I then do the following:

HttpURLConnection connection;
connection = (HttpURLConnection)(new URL(Link).openConnection());
connection.setRequestMethod("GET");
connection.setRequestProperty("Connection", "Keep-Alive");
connection.setRequestProperty("Cookie", CookieString );
connection.connect();

I then read the page, but it shows that I am not logged in to the forum. What am I doing wrong?

Edit: I know that I must extract the security token if I want to make a post. My train of thought was that in order to extract it, I need to GET this particular page. But for the security token to appear as a hidden field I must be logged in, so I need the cookies. When I GET the page with the cookies set as shown above, I get the page as a guest: it shows that I am not logged in, and the value of the security token is "guest", which is not useful to me. I will check the link you gave me and hopefully find a solution.

A:

To be sure, you should be gathering the cookies from the response's Set-Cookie headers. To send them back in the subsequent requests, you should set them one by one using URLConnection#addRequestProperty().

Basically:

// ...

// Grab Set-Cookie headers:
List<String> cookies = connection.getHeaderFields().get("Set-Cookie");

// ...

// Send them back in subsequent requests:
for (String cookie : cookies) {
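    // Limit 2 splits at the first ';' only, so [0] is the bare name=value pair.
    connection.addRequestProperty("Cookie", cookie.split(";", 2)[0]);
}

// ...

The split(";", 2)[0] is there to get rid of cookie attributes which are irrelevant for the server side, like expires, path, etc. The limit must be 2: with a limit of 1, split returns the whole string unchanged.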
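As a rough sketch of the same idea, the name=value pairs can also be joined into one Cookie header value, which is the form shown in the question. The class and method names here (CookieHeaderBuilder, toCookieHeader) are made up for illustration and are not part of any library.

```java
import java.util.ArrayList;
import java.util.List;

class CookieHeaderBuilder {
    // Hypothetical helper: keeps only the leading "name=value" pair of each
    // Set-Cookie value and joins the pairs with "; " into one header value.
    static String toCookieHeader(List<String> setCookieHeaders) {
        List<String> pairs = new ArrayList<>();
        if (setCookieHeaders != null) { // getHeaderFields().get(...) may return null
            for (String header : setCookieHeaders) {
                // Limit 2 splits at the first ';' only; [0] is "name=value".
                pairs.add(header.split(";", 2)[0].trim());
            }
        }
        return String.join("; ", pairs);
    }
}
```

It could then be used as connection.setRequestProperty("Cookie", CookieHeaderBuilder.toCookieHeader(cookies)) before connecting.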

For a more convenient HTTP client, I'd suggest having a look at Apache HttpComponents Client. It can handle all the cookie stuff more transparently.

See also:


Update: as per the comments, this is not a cookie problem. A wrong request token means that the server has CSRF/bot prevention built in (to prevent people like you). You need to extract the token as a hidden input field from the requested page with the form and resend it as a request parameter. Jsoup may be useful to extract all (hidden) input fields. Don't forget to also pass the name-value pair of the button which you'd like to "press" programmatically. Also see the above-mentioned link for more hints.
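If a third-party library is not an option, the hidden field could be pulled out with plain java.util.regex, roughly as sketched below. This is only a minimal illustration: the class name HiddenFieldExtractor is invented, the field name "securitytoken" in the usage line is an assumption about the forum's form, and the pattern only handles double-quoted attributes, so a real HTML parser such as Jsoup is more robust.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HiddenFieldExtractor {
    // Hypothetical helper: returns the value attribute of the first <input>
    // tag whose name attribute equals fieldName, or null if none is found.
    static String extract(String html, String fieldName) {
        Pattern inputTag = Pattern.compile("<input\\b[^>]*>", Pattern.CASE_INSENSITIVE);
        Pattern nameAttr = Pattern.compile(
                "\\bname\\s*=\\s*\"" + Pattern.quote(fieldName) + "\"", Pattern.CASE_INSENSITIVE);
        Pattern valueAttr = Pattern.compile(
                "\\bvalue\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

        Matcher tags = inputTag.matcher(html);
        while (tags.find()) {
            String tag = tags.group();
            if (nameAttr.matcher(tag).find()) {
                Matcher value = valueAttr.matcher(tag);
                if (value.find()) {
                    return value.group(1);
                }
            }
        }
        return null;
    }
}
```

For example, String token = HiddenFieldExtractor.extract(pageHtml, "securitytoken"); the token would then be URL-encoded and sent along with the other form parameters in the POST body.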

In the future, you should really be clearer about the exact error you get instead of guessing wildly. Copy and paste the exact error message and so on.

BalusC
That's what I do and it doesn't work. I read the cookies from the response's Set-Cookie headers. I used to set them all together with setRequestProperty. I tried one by one with addRequestProperty, but the result was the same. I can't find what I am doing wrong. I would prefer to avoid using a third-party library if possible.
fysob
Then the problem lies somewhere else. What is the URL? What does an HTTP tracker tool like Fiddler say?
BalusC
The URL is a post reply page of a vBulletin forum. The strange thing is that when I use the cookies at the same link with a POST method to post a reply, the response is a page that says I do not have a valid security token, but it sees that I am logged in, therefore the cookies work. When I try to GET the same page with the same cookies in order to extract the security token, it asks me to log in. I will install Fiddler, see how it works, and get back to you. If you have any ideas, please tell me.
fysob
A: 

Assuming the cookie values are not hard-coded but obtained from a previous request, it's probably easiest to use the CookieHandler class.

CookieHandler.setDefault(new CookieManager());

Then your HttpURLConnection will automatically save any cookies it receives and send them back with the next request to the same host.
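To illustrate what the CookieManager does behind the scenes, here is a small sketch that feeds it a Set-Cookie header by hand and asks which cookies it would send to another page on the same host. The class name, the URIs, and the cookie name used in the usage line are invented for the example; in real use you would only call CookieHandler.setDefault(...) once, before opening the first connection.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URI;
import java.util.Collections;
import java.util.List;
import java.util.Map;

class CookieManagerDemo {
    // Hypothetical helper: stores one Set-Cookie header as if it came from
    // `from`, then returns the Cookie values the manager would send to `to`.
    static List<String> cookiesFor(String setCookie, URI from, URI to) {
        // null store means the default in-memory cookie store is used.
        CookieManager manager = new CookieManager(null, CookiePolicy.ACCEPT_ORIGINAL_SERVER);
        try {
            manager.put(from, Collections.singletonMap(
                    "Set-Cookie", Collections.singletonList(setCookie)));
            Map<String, List<String>> headers =
                    manager.get(to, Collections.<String, List<String>>emptyMap());
            return headers.get("Cookie");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For instance, CookieManagerDemo.cookiesFor("bbsessionhash=abc123; Path=/", URI.create("http://forum.example.com/login.php"), URI.create("http://forum.example.com/newreply.php")) returns the stored session cookie for the second URL, which is what HttpURLConnection does automatically once the default handler is installed.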

finnw