While working on a WebSocket server in Java I came across a strange bug. I've reduced it to two small Java files: one is the server, the other is the client. The client simply sends 0x00, the string Hello, and then 0xFF (per the WebSocket specification).

On my Windows machine, the server prints the following:

Listening
byte: 0
72 101 108 108 111 recieved: 'Hello'

On my Linux box, the same code prints the following:

Listening
byte: 0
72 101 108 108 111 -3

Instead of receiving 0xFF it gets -3, never breaks out of the loop and never prints what it has received.

The important part of the code looks like this:

byte b = (byte)in.read();
System.out.println("byte: "+b);

StringBuilder input = new StringBuilder();
b = (byte)in.read();
while((b & 0xFF) != 0xFF){
 input.append((char)b);
 System.out.print(b+" ");
 b = (byte)in.read();
}
inputLine = input.toString();

System.out.println("recieved: '" + inputLine+"'");
if(inputLine.equals("bye")){
 break;
}

I've also uploaded the two files to my server:

My Windows machine is running Windows 7 and my Linux machine is running Debian.

Edit:
When b is an int, it still acts strange. I send 0xFF (255) but receive 65533 (not 65535 or 255).

+8  A: 

The problem isn't in the code you've shown. It's here:

in = new BufferedReader(new InputStreamReader(socket.getInputStream()));

You're dealing with binary data so you should be using the raw stream - don't turn it into a Reader, which is meant for reading characters.

You're receiving 65533 because that's the code point (U+FFFD) of the Unicode replacement character, which the decoder substitutes when the incoming bytes can't be decoded as a valid character. The exact behaviour of your current code will depend on the default character encoding on your system, which again isn't something you should rely on.
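To make the effect concrete, here is a minimal sketch that pushes a lone 0xFF byte through an InputStreamReader under two explicitly chosen charsets. The class and method names are made up for illustration, and the two charsets are assumptions: the question doesn't say what the default encoding is on either machine. Under UTF-8 the byte is malformed and comes back as 65533, which becomes -3 when cast to byte; under ISO-8859-1 it comes back as 255.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ReplacementCharDemo {
    // Hypothetical helper: decodes the bytes through a Reader and returns the first char read.
    static int readOneChar(byte[] data, Charset charset) throws IOException {
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(data), charset)) {
            return reader.read();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = { (byte) 0xFF };

        // 0xFF is never valid in UTF-8, so the decoder substitutes U+FFFD.
        int utf8 = readOneChar(data, StandardCharsets.UTF_8);
        System.out.println(utf8);        // 65533
        System.out.println((byte) utf8); // -3 (only the low byte 0xFD survives the cast)

        // ISO-8859-1 maps every byte to the code point with the same value.
        int latin1 = readOneChar(data, StandardCharsets.ISO_8859_1);
        System.out.println(latin1);      // 255
    }
}
```

If the Linux box defaults to UTF-8 and the Windows box to a single-byte encoding, that would be consistent with the two outputs in the question, but that is a guess about the machines, not something the question confirms.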

Further, you're assuming each byte should translate to a single character - essentially you're assuming ISO-8859-1. I haven't checked the spec, but I doubt that that's what you should be using.

Finally, you're not checking for b being -1 - which is used to indicate that the client has closed the stream.
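Putting these points together, a frame reader over the raw stream might look roughly like the sketch below. It is not taken from the question's files: the class and method names are hypothetical, and decoding the payload as UTF-8 is an assumption that should be checked against the spec. In the server, the InputStream would come straight from socket.getInputStream() with no Reader in between.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class FrameReader {
    /**
     * Reads one frame of the form 0x00, payload bytes, 0xFF from a raw stream.
     * Returns null if the stream ends before a complete frame has arrived.
     */
    static String readFrame(InputStream in) throws IOException {
        int b = in.read();                 // keep the result as an int so -1 stays distinct
        if (b == -1) {
            return null;                   // peer closed the stream
        }
        if (b != 0x00) {
            throw new IOException("Unexpected frame start byte: " + b);
        }
        ByteArrayOutputStream payload = new ByteArrayOutputStream();
        while ((b = in.read()) != 0xFF) {
            if (b == -1) {
                return null;               // stream ended in the middle of a frame
            }
            payload.write(b);
        }
        // Assumption: the payload is UTF-8 text; decode all bytes at once, not one per char.
        return new String(payload.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = { 0x00, 'H', 'e', 'l', 'l', 'o', (byte) 0xFF };
        System.out.println(readFrame(new ByteArrayInputStream(wire))); // Hello
    }
}
```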

Jon Skeet
interesting....
aioobe
Ah, thanks, the reader was the problem :D
Marius
A: 

A byte with the value of -3 has a bit pattern of 11111101. An int with the value of -3 has the bit pattern of 11111111111111111111111111111101.
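The two bit patterns can be printed with a short sketch like the following (class and method names are made up for illustration). It only shows how a byte is sign-extended when widened to an int; it does not explain where the -3 comes from.

```java
public class BitPatternDemo {
    // Masks to the low 8 bits so the byte is shown without sign extension.
    // Note: Integer.toBinaryString drops leading zeros, which does not matter for -3.
    static String byteBits(byte b) {
        return Integer.toBinaryString(b & 0xFF);
    }

    // Shows the two's-complement bits of an int (again without leading zeros).
    static String intBits(int i) {
        return Integer.toBinaryString(i);
    }

    public static void main(String[] args) {
        byte b = -3;
        int i = b; // widening a byte to an int copies the sign bit into the upper 24 bits

        System.out.println(byteBits(b)); // 11111101
        System.out.println(intBits(i));  // 32 bits, ending in 11111101
    }
}
```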

So, you are getting essentially the same value. I wish I understood why you are getting -3.

Jon Strayer
@Jon: See my answer... it's because it's *actually* the Unicode replacement character.
Jon Skeet
A: 

And your EOS check is incorrect. You should read into an int and compare it to -1. If true, you have reached the end of the stream so close the socket, or more probably the output stream, and proceed accordingly. Otherwise cast it to a byte. At the moment you are unable to transmit 0xff because it will get treated the same as EOS.
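A small sketch of why the cast has to come after the -1 test (names here are hypothetical, not from the question's code): once the result of read() is cast to byte, a real 0xFF data byte (255) and end of stream (-1) both become the byte value -1, so the question's loop condition can't tell them apart.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class EosCheckDemo {
    /** The check from the question: the read() result is cast to byte before testing. */
    static boolean castThenCheck(int readResult) {
        byte b = (byte) readResult;
        return (b & 0xFF) == 0xFF;
    }

    /** Counts 0xFF data bytes, testing the int against -1 before any cast. */
    static int countFfBytes(InputStream in) throws IOException {
        int count = 0;
        int value;
        while ((value = in.read()) != -1) { // -1 means end of stream, never a data byte
            byte b = (byte) value;          // safe to cast now
            if (b == (byte) 0xFF) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(castThenCheck(255)); // true (a real 0xFF byte)
        System.out.println(castThenCheck(-1));  // true (end of stream looks identical)

        byte[] data = { 0x41, (byte) 0xFF, (byte) 0xFF };
        System.out.println(countFfBytes(new ByteArrayInputStream(data))); // 2
    }
}
```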

EJP