Is it possible to use the Java socket API to read the content of a web page, e.g. "www.yahoo.com"? Can somebody here show an example?

And how about reading the content of a page protected by a web app's login screen?

Thanks in advance, dara kok

+3  A: 

It's possible but not advisable. The web page is returned over HTTP, which is more than just a stream of bytes: each response carries a status line and headers ahead of the body. This means that in order to use a socket, your application would need to understand the instructions in the HTTP responses (redirects, for instance) and behave accordingly.
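For illustration only, here is a minimal sketch of what the raw-socket approach involves, assuming plain HTTP on port 80 and a server that honours "Connection: close". The class and method names (RawHttpGet, buildRequest, parseStatusCode, fetch) are made up for this example. It does not handle HTTPS, redirects, chunked transfer encoding or compression, which is exactly the work an HTTP library would otherwise do for you.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class RawHttpGet {

    // Builds a minimal HTTP/1.1 GET request. "Connection: close" asks the
    // server to close the socket after the response, so that reading until
    // end-of-stream terminates.
    static String buildRequest(String host, String path) {
        return "GET " + path + " HTTP/1.1\r\n"
             + "Host: " + host + "\r\n"
             + "Connection: close\r\n"
             + "\r\n";
    }

    // Extracts the numeric code from a status line such as "HTTP/1.1 200 OK".
    static int parseStatusCode(String statusLine) {
        String[] parts = statusLine.split(" ", 3);
        return Integer.parseInt(parts[1]);
    }

    // Sends the request over a plain TCP socket and returns the raw response
    // (status line, headers and body) as text. The body is decoded as
    // ISO-8859-1 for simplicity; a real client would honour Content-Type.
    static String fetch(String host, int port, String path) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            OutputStream out = socket.getOutputStream();
            out.write(buildRequest(host, path).getBytes(StandardCharsets.US_ASCII));
            out.flush();

            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.ISO_8859_1));
            StringBuilder response = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null) {
                response.append(line).append('\n');
            }
            return response.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        // Needs network access; the host is the one from the question.
        System.out.println(fetch("www.yahoo.com", 80, "/"));
    }
}
```

Many sites now answer plain HTTP with a redirect to HTTPS, so the response may well be a 3xx status with a Location header rather than the page itself. Your code would then have to parse that header and start over, this time over TLS.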

To access a web page programmatically, use Jakarta Commons HttpClient.

With regard to secured web pages, it will depend on how they are secured. However, given that HttpClient can maintain cookies between requests, you should be able to perform the login through code too.
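To show what "maintaining cookies" means in practice, here is a small sketch using the JDK's own java.net.CookieManager rather than HttpClient (a deliberate swap so the example needs no extra jar). The SessionCookies class, its method names, the example.com URLs and the JSESSIONID value are all made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.URI;
import java.util.Collections;
import java.util.List;
import java.util.Map;

class SessionCookies {

    // CookieManager is the JDK's in-memory cookie jar.
    private final CookieHandler manager = new CookieManager();

    // Stores the cookie carried by a Set-Cookie response header that was
    // received for the given URI (for example, the login response).
    void remember(String uri, String setCookieValue) {
        Map<String, List<String>> headers = Collections.singletonMap(
                "Set-Cookie", Collections.singletonList(setCookieValue));
        try {
            manager.put(URI.create(uri), headers);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the Cookie header values that should accompany a later
    // request to the given URI; empty if no stored cookie applies.
    List<String> cookiesFor(String uri) {
        try {
            Map<String, List<String>> empty = Collections.emptyMap();
            Map<String, List<String>> out = manager.get(URI.create(uri), empty);
            return out.getOrDefault("Cookie", Collections.<String>emptyList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        SessionCookies session = new SessionCookies();
        // Pretend the login response set a session cookie (hypothetical value).
        session.remember("http://example.com/login", "JSESSIONID=abc123; Path=/");
        // The same cookie is offered for later requests to that host.
        System.out.println(session.cookiesFor("http://example.com/account"));
    }
}
```

An HTTP client library does this bookkeeping for you on every request and response, which is why the session survives from the login request to the protected page. With the JDK alone, registering a CookieManager through CookieHandler.setDefault should give HttpURLConnection similar behaviour.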

Nick Holt
+1  A: 

Further to Nick's answer (i.e. use the Jakarta Commons HttpClient): how you log in depends on how the login page is implemented. If it is a site secured with an Apache .htaccess file, you will need to place the username/password information in the request header. Alternatively (and more commonly), if it is an HTML form, you will need to extract the form fields from the original HTML and send them as key/value parameters in the HTTP GET/POST request.
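As a rough sketch of the two cases, assuming the .htaccess site uses HTTP Basic authentication and the form is posted as application/x-www-form-urlencoded: the LoginRequests class and the field names "username" and "password" are hypothetical, so read the real names from the form's HTML. URLEncoder.encode with a Charset argument needs Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class LoginRequests {

    // Value for the "Authorization" request header under HTTP Basic
    // authentication: "Basic " + base64("user:password").
    static String basicAuthHeader(String user, String password) {
        String credentials = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    // Encodes form fields as key=value pairs joined by '&', the body format
    // for a POST with Content-Type application/x-www-form-urlencoded (the
    // same string can be appended after '?' for a GET).
    static String formBody(Map<String, String> fields) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            body.add(URLEncoder.encode(field.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(field.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        System.out.println("Authorization: " + basicAuthHeader("user", "pass"));

        // Hypothetical field names; a real form may also have hidden fields
        // (tokens etc.) that must be sent back as well.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("username", "dara");
        fields.put("password", "p@ss word");
        System.out.println(formBody(fields));
    }
}
```

Either string can then be handed to whichever HTTP client you use: the first as a request header, the second as the request body or query string.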

James B