I want to fetch the page source of the Orkut home page (http://www.ORKUT.com) in Java.
But I have to be logged in to Orkut before I can access any of its pages. How can I do this? It should not involve a browser in between.
You should have a look at Apache Commons HttpClient. With it you can send a POST request with your login data and then reuse the session ID (normally a cookie) it returns for all further requests.
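A rough sketch of that "POST the login, then reuse the session" flow. Note that it uses the JDK's built-in java.net.http.HttpClient (Java 11+) instead of Commons HttpClient, only so that it runs without an extra jar; the idea is the same. The class name LoginFetch, the form field names (Email, Passwd) and the URLs passed on the command line are placeholders I made up, not Orkut's real login form. You would have to read the actual field names and the form's action URL from the login page, and a login that relies on JavaScript or hidden tokens will need more than this.

```java
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class LoginFetch {

    // Encode form fields as application/x-www-form-urlencoded.
    static String formEncode(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    // Build the POST request that submits the login form.
    static HttpRequest loginRequest(String loginUrl, Map<String, String> fields) {
        return HttpRequest.newBuilder(URI.create(loginUrl))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formEncode(fields)))
                .build();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 4) {
            System.err.println("usage: LoginFetch <loginUrl> <homeUrl> <user> <password>");
            return;
        }

        // The CookieManager stores the session cookie set by the login
        // response and sends it back on every later request.
        HttpClient client = HttpClient.newBuilder()
                .cookieHandler(new CookieManager(null, CookiePolicy.ACCEPT_ALL))
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();

        // Hypothetical field names; take the real ones from the login form.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("Email", args[2]);
        fields.put("Passwd", args[3]);

        // Step 1: log in. The body is not needed, only the cookies.
        client.send(loginRequest(args[0], fields), HttpResponse.BodyHandlers.discarding());

        // Step 2: fetch the protected page with the same client (same cookies).
        HttpRequest homeRequest = HttpRequest.newBuilder(URI.create(args[1])).GET().build();
        HttpResponse<String> home = client.send(homeRequest, HttpResponse.BodyHandlers.ofString());
        System.out.println(home.body());
    }
}
```

The important part is that both requests go through the same client instance, so the cookie jar is shared between the login and the page fetch.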
There are two ways of doing that:
1) Buy Octazen, which will do it for you and which keeps its library updated every time Orkut changes something.
2) Use Watir to drive the browser.
Doing it with HTTP Client is like fixing a watch with boxing gloves under water. It does not support JavaScript, and you have to work your way through the cookies, the parsing, etc. yourself.