I'm not a Java programmer at all. I try to avoid it at all costs actually, but it is required that I use it for a class (in the school sense). The teacher requires that we use Socket(), BufferedReader(), PrintWriter() and various other things including BufferedReader()'s readLine() method.

Basically, this is the problem I'm having. The documentation clearly states that readLine should return a null at the end of the input stream, but that's not what's happening.

Socket link       = new Socket(this.address, 80);
BufferedReader in = new BufferedReader(new InputStreamReader(link.getInputStream()));
PrintWriter   out = new PrintWriter(link.getOutputStream(), true);

out.print("GET blah blah blah"); // http request by hand
out.flush(); // send the get please

String s;
while( (s=in.readLine()) != null ) {

    // prints the html correctly, hooray!!
    System.out.println(s);
}

Instead of finishing at the end of the HTML, I get a blank line, a 0 and another blank line and then the next in.readLine() hangs forever. Why? Where's my null?

I tried out.close() to see if maybe Yahoo! was doing a persistent HTTP session or something (which I don't think it would do without a header saying we're willing to).

All the Java sockets examples I'm finding on the net seem to indicate the while loop is the correct form. I just don't know enough Java to debug this.

+3  A: 

So you're reading from a socket (you don't show that in your code, but that's what I gather from the text)?

As long as the other side is not closing the connection, Java doesn't know that it's at the end of the input, so readLine() is waiting for the other side to send more data and doesn't return null.
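
To make that concrete, here is a minimal sketch, assuming a throwaway loopback server instead of a real web server. The class and method names (ReadLineEofDemo, readUntilServerCloses) are made up for illustration. The server sends two lines and then closes its socket, and only that close lets the client's readLine() return null and end the loop.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class ReadLineEofDemo {

    // Hypothetical helper: runs a tiny loopback server that sends two lines and
    // closes the connection, then returns every line the client managed to read.
    static List<String> readUntilServerCloses() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread serverThread = new Thread(() -> {
                try (Socket peer = server.accept();
                     PrintWriter out = new PrintWriter(peer.getOutputStream())) {
                    out.print("first line\r\n");
                    out.print("second line\r\n");
                    out.flush();
                    // Leaving this block closes the socket. Until that happens,
                    // the client's readLine() would keep waiting for more data.
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            serverThread.start();

            List<String> lines = new ArrayList<>();
            try (Socket link = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort());
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(link.getInputStream(), StandardCharsets.US_ASCII))) {
                String s;
                while ((s = in.readLine()) != null) { // null only after the peer closes
                    lines.add(s);
                }
            }
            serverThread.join();
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readUntilServerCloses());
    }
}
```

If the server thread kept its socket open after flushing, the same loop would block on the third readLine() call, which is the hang described in the question.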

Jesper
... reading from a socket, yes. I added that to the question, thanks.
jettero
+2  A: 

Try GET url HTTP/1.0. The HTTP/1.0 tells the server that you can't handle more than a single document per connection. In this case, the server should close the connection after sending you the result.

Aaron Digulla
Yeah, that makes sense and it works. In other languages I would simply close my side of the socket so the webserver knows there's only going to be one request... but when I out.close() it seems to close both sides of the link. Is there a way to close exactly one side? Am I misinterpreting what happens?
jettero
You're right, calling close on `out` will also close the incoming stream: closing the stream returned by getOutputStream() closes the whole socket. To close only your sending side, call shutdownOutput() on the Socket instead.
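
A minimal sketch of closing exactly one side, assuming a toy loopback server rather than a real web server (HalfCloseDemo and sendThenHalfClose are made-up names). Socket.shutdownOutput() ends only the sending direction, so the server sees end of input while the client can still read the reply. Whether a particular web server responds well to a half-closed connection is not something this sketch demonstrates.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

final class HalfCloseDemo {

    // Hypothetical helper: sends one line, half-closes the socket, and returns
    // the single reply line produced by the toy server below.
    static String sendThenHalfClose(String request) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            // Toy server: reads until the client signals end of input,
            // then answers with the number of lines it received.
            Thread serverThread = new Thread(() -> {
                try (Socket peer = server.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(peer.getInputStream(), StandardCharsets.US_ASCII));
                     PrintWriter out = new PrintWriter(peer.getOutputStream())) {
                    int count = 0;
                    while (in.readLine() != null) { // ends once the client half-closes
                        count++;
                    }
                    out.print("received " + count + " line(s)\r\n");
                    out.flush();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            serverThread.start();

            try (Socket link = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort())) {
                PrintWriter out = new PrintWriter(link.getOutputStream());
                BufferedReader in = new BufferedReader(
                        new InputStreamReader(link.getInputStream(), StandardCharsets.US_ASCII));
                out.print(request + "\r\n");
                out.flush();
                link.shutdownOutput(); // closes only our sending side; reading still works
                String reply = in.readLine();
                serverThread.join();
                return reply;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sendThenHalfClose("hello"));
    }
}
```

Replacing link.shutdownOutput() with out.close() in this sketch would close the socket before the reply could be read.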
Aaron Digulla
+3  A: 

Your problem is the transfer encoding “chunked” (the Transfer-Encoding: chunked response header). This is used when the length of the content requested from the web server is not known at the time the response is started. Each chunk consists of the number of bytes being sent, written in hexadecimal, followed by CRLF, followed by the bytes and another CRLF. The end of a response is signalled by a chunk of size 0 followed by an empty line, which is the exact sequence you are seeing. The web server is now waiting for your next request on the same connection (this is called a “persistent connection” or “keep-alive”, and it is the default in HTTP/1.1).

You have several possibilities:

  • Use HTTP version 1.0. This will cause the webserver to automatically close the connection when a response has been sent completely.
  • Specify the “Connection: close” header when sending your request. This will also make the server close the connection after the response.
  • Parse transfer encoding “chunked” correctly and simply treat the zero-length chunk as the end of the response, which it is.
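
For the third option, a minimal sketch of a chunked-body decoder, assuming the response headers have already been read and the stream is positioned at the first chunk-size line. ChunkedBody, decode and readAsciiLine are made-up names, and the decoder works on bytes because chunk sizes count bytes, not characters. Chunk extensions and trailer lines are skipped rather than interpreted.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class ChunkedBody {

    // Hypothetical helper: decodes a chunked body and returns the payload bytes.
    static byte[] decode(InputStream in) {
        try {
            DataInputStream data = new DataInputStream(in);
            ByteArrayOutputStream body = new ByteArrayOutputStream();
            while (true) {
                String sizeLine = readAsciiLine(data);
                int semi = sizeLine.indexOf(';'); // drop optional chunk extensions
                if (semi >= 0) {
                    sizeLine = sizeLine.substring(0, semi);
                }
                int size = Integer.parseInt(sizeLine.trim(), 16); // size is hexadecimal
                if (size == 0) {
                    // Zero-length chunk: skip any trailer lines up to the empty
                    // line. The response is complete, so stop reading here.
                    while (!readAsciiLine(data).isEmpty()) {
                        // ignore trailer fields
                    }
                    return body.toByteArray();
                }
                byte[] chunk = new byte[size];
                data.readFully(chunk);   // exactly `size` bytes of payload
                body.write(chunk, 0, size);
                readAsciiLine(data);     // consume the CRLF that follows the chunk data
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads one line terminated by LF and returns it without CR or LF.
    private static String readAsciiLine(InputStream in) throws IOException {
        StringBuilder line = new StringBuilder();
        int b;
        while ((b = in.read()) != '\n') {
            if (b == -1) {
                throw new EOFException("stream ended inside a chunked body");
            }
            if (b != '\r') {
                line.append((char) b);
            }
        }
        return line.toString();
    }

    public static void main(String[] args) {
        byte[] wire = "5\r\nhello\r\n7\r\n, world\r\n0\r\n\r\n"
                .getBytes(StandardCharsets.US_ASCII);
        byte[] decoded = decode(new ByteArrayInputStream(wire));
        System.out.println(new String(decoded, StandardCharsets.US_ASCII)); // prints "hello, world"
    }
}
```

Read line by line, the tail of that sample input is an empty line, a 0 and another empty line, which matches what the readLine() loop in the question prints just before it hangs.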
Bombe
This makes sense. I needed to check the headers. I'm just going to use HTTP/1.0.
jettero
Yes, for a school exercise that is the most sensible thing to do.
Bombe
A: 

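Your HTTP request is not complete until the header section ends with an empty line, so the request line (plus any headers) must be followed by 2 carriage return + linefeed pairs in total. Call close only after you have read the response, because closing out closes the whole socket:

out.print("GET /index.html HTTP/1.0\r\n");
// maybe print optional headers here
// empty line ends the request
out.print("\r\n");
out.flush();
// ... read the response from in here ...
out.close(); // only after reading: this closes the whole socket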
Jörn Horstmann