views: 1786
answers: 3
When working with sockets in Java, how can you tell whether the client has finished sending all of its (binary) data before you start processing it? Consider for example:

istream = new BufferedInputStream (socket.getInputStream());
ostream = new BufferedOutputStream(socket.getOutputStream());

byte[] buffer = new byte[BUFFER_SIZE];

int count;
while(istream.available() > 0 && (count = istream.read(buffer)) != -1)
{
    // do something..
}

// assuming all input has been read
ostream.write(getResponse());    
ostream.flush();

I've read similar posts on SO, such as this, but couldn't find a conclusive answer. While my solution above works, my understanding is that you can never really tell whether the client has finished sending all of its data. If, for instance, the client sends a few chunks of data and then blocks waiting on another data source before it can send more, the code above may well conclude that the client has finished, since istream.available() will return 0 for the current stream of bytes.

+5  A: 

I think this is more a task for the protocol, assuming you are the one writing both the transmitting and receiving sides of the application. For example, you could implement some simple logical protocol and divide your data into packets, then divide each packet into two parts: the head and the body. The head would consist of a predefined starting sequence plus the number of bytes in the body. Or you could forget about the starting sequence and simply transfer the number of bytes in the body as the first bytes of the packet. That would solve your problem.

Nikita Borodulin
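The head-and-body framing described above can be sketched as follows. This is a minimal illustration, not the answerer's actual code; the helper names `writeMessage` and `readMessage` are made up, and an in-memory stream stands in for a real socket so the example is self-contained:

```java
import java.io.*;

public class LengthPrefixed {
    // Head: a 4-byte length. Body: the payload bytes themselves.
    static void writeMessage(DataOutputStream out, byte[] body) throws IOException {
        out.writeInt(body.length); // head: number of bytes in the body
        out.write(body);           // body: the payload
        out.flush();
    }

    static byte[] readMessage(DataInputStream in) throws IOException {
        int length = in.readInt();  // read the declared length first
        byte[] body = new byte[length];
        in.readFully(body);         // blocks until the whole body has arrived
        return body;
    }

    public static void main(String[] args) throws IOException {
        // Demonstrated over an in-memory stream; with sockets you would wrap
        // socket.getOutputStream() / socket.getInputStream() the same way.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        writeMessage(new DataOutputStream(buf), "hello".getBytes("UTF-8"));

        DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(buf.toByteArray()));
        System.out.println(new String(readMessage(in), "UTF-8"));
    }
}
```

Because the reader knows the exact body length up front, it never has to guess whether more data is coming: `readFully` blocks until exactly that many bytes arrive.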
+1 to Nikita's answer. If the sender does not close the stream after sending a single message (HTTP keep-alive, to give a well-known example), there is no way you can tell whether all data was transmitted. You _have_ to define the boundaries of your message in the protocol, or state (see next comment)
Juris
(cont'd) (again, in the protocol) that all data must be sent in a single chunk. The latter is a very unreliable solution, though.
Juris
+1  A: 

As Nikita said, this is more a task for the protocol. You can either go with the head-and-body approach, or send a special character or symbol marking the end of the stream to break the processing loop: something like sending '[[END]]' on the socket to denote the end of a message.

serioys sam
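The delimiter approach above could look something like this. A minimal sketch, assuming line-oriented text and the '[[END]]' sentinel from the answer (the class and method names are illustrative); note it only works if the sentinel can never occur inside the payload itself:

```java
import java.io.*;

public class DelimiterFramed {
    // Read lines until the sentinel line "[[END]]" (or true end of stream).
    static String readUntilEnd(BufferedReader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        String line;
        while ((line = in.readLine()) != null && !line.equals("[[END]]")) {
            sb.append(line).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Simulated wire data: one message, its sentinel, then further traffic.
        String wire = "some payload\nmore payload\n[[END]]\nnext message...";
        BufferedReader in = new BufferedReader(new StringReader(wire));
        System.out.print(readUntilEnd(in)); // stops at the sentinel
    }
}
```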
+5  A: 

Yes, you're right - using available() like this is unreliable. Personally I very rarely use available(). If you want to read until you reach the end of the stream (as per the question title), keep calling read() until it returns -1. That's the easy bit. The hard bit is if you don't want the end of the stream, but the end of "what the server wants to send you at the moment."

As the others have said, if you need to have a conversation over a socket, you must make the protocol explain where the data finishes. Personally I prefer the "length prefix" solution to the "end of message token" solution where it's possible - it generally makes the reading code a lot simpler. However, it can make the writing code harder, as you need to work out the length before you send anything. This is a pain if you could be sending a lot of data.

Of course, you can mix and match solutions - in particular, if your protocol deals with both text and binary data, I would strongly recommend length-prefixing strings rather than null-terminating them (or anything similar). Decoding string data tends to be a lot easier if you can pass the decoder a complete array of bytes and just get a string back - you don't need to worry about reading to half way through a character, for example. You could use this as part of your protocol but still have overall "records" (or whatever you're transmitting) with an "end of data" record to let the reader process the data and respond.

Of course, all of this protocol design stuff is moot if you're not in control of the protocol :(

Jon Skeet
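On the point about length-prefixing strings rather than null-terminating them: Java's `DataOutputStream.writeUTF` already does exactly this (a 2-byte length followed by the string in modified UTF-8), and `DataInputStream.readUTF` reads one complete string back, so the decoder never sees half a character. A small self-contained demonstration over an in-memory stream:

```java
import java.io.*;

public class PrefixedStrings {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        // writeUTF length-prefixes each string, so records don't need terminators.
        out.writeUTF("first record");
        out.writeUTF("second record");

        DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(buf.toByteArray()));
        // readUTF consumes exactly one complete string per call.
        System.out.println(in.readUTF());
        System.out.println(in.readUTF());
    }
}
```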
The length-prefix as suggested by you/others sounds like a valid solution for conversations using the same socket conn. But assuming we are dealing with a one-time send/receive cycle, would the same approach be ideal here? The reason I ask follows your note "keep calling read() until it (contd.)
Mystic
returns -1" What does end-of-stream actually mean, and what does it mean to keep reading until read() returns -1? Closing the client socket's output-stream would certainly be an end of stream to the server socket, but that isn't really useful if we intend to send back something over the same conn.
Mystic
So it seems to me that checking for end-of-stream (i.e -1) is only useful in a one-way message between a client and server, where the client socket is subsequently closed after sending a message.
Mystic
Well, it can be two ways like HTTP 1.0 - client makes a request, server recognises the end of the request, responds, and closes the socket when it finishes the response. But yes, it's for "one shot" messaging rather than a conversation.
Jon Skeet
What I'm not immediately sure of (which is somewhat shocking) is whether the two streams are effectively independent - whether you can close the client to server output stream, let the server's *input* stream see "end of data" and then the server respond with data. Worth checking.
Jon Skeet
The trouble with closing the client to server output stream is that it closes the client socket as well, not just the stream. So while the server can potentially send back data, we can't receive it.
Mystic
So I wonder, when does an "end of stream" really occur apart from the client closing the socket? I can't think of a proper example where I'd use "keep calling read() until it returns -1" except for a one-way message, where a client only needs to send data and not receive, ie:open->send data->close.
Mystic
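On the half-close question debated above: Java's `Socket.shutdownOutput()` closes only the sending direction (a TCP half-close), so the server's `read()` sees -1 while the client socket remains open for reading the response. This differs from closing the stream returned by `getOutputStream()`, which does close the whole socket. A minimal loopback sketch (class name and port choice are illustrative):

```java
import java.io.*;
import java.net.*;

public class HalfClose {
    public static void main(String[] args) throws Exception {
        ServerSocket server = new ServerSocket(0); // ephemeral port

        Thread serverThread = new Thread(() -> {
            try (Socket s = server.accept()) {
                // Read the request until end of stream (client's half-close).
                InputStream in = s.getInputStream();
                ByteArrayOutputStream req = new ByteArrayOutputStream();
                byte[] buf = new byte[256];
                int n;
                while ((n = in.read(buf)) != -1) {
                    req.write(buf, 0, n);
                }
                // The connection is still open for writing in this direction.
                OutputStream out = s.getOutputStream();
                out.write(("echo:" + req.toString("UTF-8")).getBytes("UTF-8"));
                out.flush();
            } catch (IOException e) {
                e.printStackTrace();
            }
        });
        serverThread.start();

        try (Socket client = new Socket("localhost", server.getLocalPort())) {
            client.getOutputStream().write("ping".getBytes("UTF-8"));
            client.shutdownOutput(); // half-close: server's read() returns -1
            // We can still receive the server's reply on the same connection.
            BufferedReader r = new BufferedReader(
                    new InputStreamReader(client.getInputStream(), "UTF-8"));
            System.out.println(r.readLine());
        }
        serverThread.join();
        server.close();
    }
}
```

So "read until -1" can work for a one-shot request/response exchange after all, provided the client half-closes with `shutdownOutput()` rather than closing the stream.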