I am trying to make a request to a webpage that requires cookies. I'm using HttpURLConnection, but the response always comes back saying:

<div class="body"><p>Your browser's cookie functionality is turned off. Please turn it on.

How can I make the request so that the server thinks I have cookies turned on? My code looks something like this:

private String readPage(String page) throws MalformedURLException {
 StringBuilder sb = new StringBuilder();
 try {
  URL url = new URL(page);
  HttpURLConnection uc = (HttpURLConnection) url.openConnection();
  uc.connect();

  // Read the response body one byte at a time (assumes a single-byte encoding).
  InputStream in = uc.getInputStream();
  int v;
  while ((v = in.read()) != -1) {
   sb.append((char) v);
  }
  in.close();
  uc.disconnect();
 } catch (IOException e) {
  e.printStackTrace();
 }
 return sb.toString();
}
+1  A: 

    // Read the cookies the server sent in the Set-Cookie headers of the first response.
    List<String> cookies = uc.getHeaderFields().get("Set-Cookie");

    // Send them back in the Cookie header of the next request.
    URLConnection conn = url.openConnection();
    conn.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 6.0; pl; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2");
    conn.addRequestProperty("Referer", "http://xxxx");
    conn.addRequestProperty("Cookie", "...");
n00b32
+5  A: 

You need to install a CookieHandler so that cookies are handled for you. Before Java 6 there is no CookieHandler implementation in the JRE, so you have to write your own. If you are on Java 6, you can do this:

  CookieHandler.setDefault(new CookieManager());
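
To make that one-liner a bit more concrete, here is a minimal sketch, assuming Java 8 or later. The class name CookieManagerSketch, the methods install and roundTrip, and the example.com URL are made up for illustration. The ACCEPT_ALL policy is also an assumption (the default policy only accepts cookies from the original server). roundTrip hands the manager a response header directly, which is roughly what HttpURLConnection does through the default handler, so you can see the effect without a network call.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URI;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; only the java.net types are real JDK API.
class CookieManagerSketch {

    // Installs a JVM-wide cookie handler. Call this once, before the first
    // url.openConnection(), so every HttpURLConnection picks it up.
    static CookieManager install() {
        // null store = default in-memory store; ACCEPT_ALL is an assumption.
        CookieManager manager = new CookieManager(null, CookiePolicy.ACCEPT_ALL);
        CookieHandler.setDefault(manager);
        return manager;
    }

    // Offline illustration: store one Set-Cookie value as if it came from
    // `uri`, then ask which Cookie header a later request to `uri` would get.
    static String roundTrip(CookieManager manager, URI uri, String setCookie) {
        try {
            Map<String, List<String>> response = new HashMap<>();
            response.put("Set-Cookie", Collections.singletonList(setCookie));
            manager.put(uri, response);

            Map<String, List<String>> request =
                    manager.get(uri, new HashMap<String, List<String>>());
            List<String> cookies = request.get("Cookie");
            return cookies == null ? "" : String.join("; ", cookies);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In the question's readPage, calling CookieManagerSketch.install() once at startup should be enough; the connection code itself does not need to change.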

URLConnection's cookie handling is really weak. It barely works and doesn't apply all the cookie rules correctly. You should use Apache HttpClient if you are dealing with sensitive cookies, such as authentication cookies.

ZZ Coder
Now I need to figure out how to do HTTP proxy authentication. I had it working with URLConnection, but now I need to figure it out here. It's OK, Google will probably find something for me ;) Thanks!
dharga
A: 

If you're trying to scrape large volumes of data after a login, you may be better off with a scripted web scraper like WebHarvest (http://web-harvest.sourceforge.net/). I've used it with great success in some of my own projects.

Alex Marshall
+1  A: 

I think the server can't tell from the first request that a client doesn't support cookies, so it probably sets a cookie and sends a redirect to check whether the cookie comes back. Try disabling automatic redirects:

uc.setInstanceFollowRedirects(false);

Then you will be able to get cookies from response and use them (if you need) on the next request.
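
A rough sketch of that idea, assuming the redirects stay on the same site. The class ManualRedirectSketch and its methods are hypothetical names, not part of any library. The cookie handling is deliberately simplified: it keeps only the name=value part of each Set-Cookie value and ignores attributes such as Path, Domain and Expires.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; follows redirects by hand and replays cookies on each hop.
class ManualRedirectSketch {

    // Status codes that normally carry a Location header to follow.
    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303
                || status == 307 || status == 308;
    }

    // Stores the name=value part of every Set-Cookie header; attributes are dropped.
    static void remember(Map<String, String> jar, Map<String, List<String>> headers) {
        for (Map.Entry<String, List<String>> h : headers.entrySet()) {
            // The status line has a null key, so compare from the constant side.
            if (!"Set-Cookie".equalsIgnoreCase(h.getKey())) continue;
            for (String value : h.getValue()) {
                String pair = value.split(";", 2)[0].trim();
                int eq = pair.indexOf('=');
                if (eq > 0) {
                    jar.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
                }
            }
        }
    }

    // Builds the value of the Cookie request header from the stored pairs.
    static String cookieHeader(Map<String, String> jar) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : jar.entrySet()) {
            if (sb.length() > 0) sb.append("; ");
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    // Opens `page`, following at most maxHops redirects manually. The returned
    // connection has already received its response; read it with getInputStream().
    static HttpURLConnection open(String page, int maxHops) throws IOException {
        Map<String, String> jar = new LinkedHashMap<>();
        URL url = new URL(page);
        for (int hop = 0; hop <= maxHops; hop++) {
            HttpURLConnection uc = (HttpURLConnection) url.openConnection();
            uc.setInstanceFollowRedirects(false);
            if (!jar.isEmpty()) {
                uc.setRequestProperty("Cookie", cookieHeader(jar));
            }
            int status = uc.getResponseCode();
            remember(jar, uc.getHeaderFields());
            String location = uc.getHeaderField("Location");
            if (!isRedirect(status) || location == null) {
                return uc;
            }
            uc.disconnect();
            url = new URL(url, location); // also resolves relative Location values
        }
        throw new IOException("Too many redirects for " + page);
    }
}
```

In readPage from the question, the two lines that open and connect uc could be replaced by something like HttpURLConnection uc = ManualRedirectSketch.open(page, 5); where the hop limit of 5 is an arbitrary choice.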

serge_bg