Hi,

I read a char array from a socket:

char[] cbuf = new char[3];
inputStream.read(cbuf, 0, 3); // read 3 chars in buffer "cbuf", offset = 0

Then, when I print it:

System.out.println("r:"+(int)cbuf[0]+" g:"+(int)cbuf[1]+" b:"+(int)cbuf[2]);

At some point I get:

...
r:82 g:232 b:250
r:82 g:232 b:250
r:66 g:233 b:8224

The value 8224 is far greater than 255. How can a char contain this value?

Thank you

+4  A: 

The char type in Java is 16-bit.

If you are looking for an 8 bit datatype consider using byte.
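
To illustrate the size difference, here is a minimal sketch (the class name `CharVsByte` is just for this example). It shows that a char holds 8224 without trouble, while a byte is 8 bits and signed, so a value such as 250 wraps to a negative number.

```java
public class CharVsByte {
    // Largest value a char can hold, widened to int.
    static int maxCharValue() {
        return Character.MAX_VALUE;
    }

    // What a byte holds after narrowing an int such as 250.
    static int narrowedToByte(int value) {
        return (byte) value;
    }

    public static void main(String[] args) {
        char c = (char) 8224;                         // fits: char is an unsigned 16-bit type
        System.out.println((int) c);                  // 8224
        System.out.println(maxCharValue());           // 65535
        System.out.println(narrowedToByte(250));      // -6, because byte is signed 8-bit
        System.out.println(Character.SIZE + " bits vs " + Byte.SIZE + " bits");
    }
}
```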

linuxuser27
And it contains Unicode code points.
Thilo
Not if you use characters outside the BMP. What `char` contains is UTF-16 code units.
dan04
+1  A: 

Java uses UTF (not ASCII) to store chars; UTF is 16 bits long, so a char can contain values up to 65,535.

Edgar Sánchez
Not entirely true. UTF-16 is 16 bits 'long'. UTF-8 is 8 bits 'long'. Although this is a flexible use of the word 'long' as each can have longer sequences for certain code points.
sje397
So I guess Java uses UTF-16, righto?
Edgar Sánchez
JLS 3.1: "...The Java programming language represents text in sequences of 16-bit code units, using the UTF-16 encoding. ..." http://java.sun.com/docs/books/jls/third_edition/html/lexical.html#248597
Carlos Heuberger
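
A small sketch of the code unit versus code point distinction raised in these comments (the class name `CodeUnits` and the choice of U+1D11E, the musical G clef, are only for illustration). A character outside the BMP takes two chars in a Java String, which is why a char is a UTF-16 code unit rather than a full code point.

```java
public class CodeUnits {
    // Builds a String holding the single code point U+1D11E, which lies outside the BMP.
    static String clef() {
        return new String(Character.toChars(0x1D11E));
    }

    public static void main(String[] args) {
        String s = clef();
        System.out.println(s.length());                         // 2 chars (UTF-16 code units)
        System.out.println(s.codePointCount(0, s.length()));    // 1 code point
        System.out.println(Character.isHighSurrogate(s.charAt(0)));
        System.out.println(Character.isLowSurrogate(s.charAt(1)));
    }
}
```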
+7  A: 

The char primitive in Java is 16 bits wide, to accommodate characters outside the standard ASCII range, using Unicode.

It looks like you're trying to store RGB values in a char[3]. May I suggest a byte[3], or java.awt.Color?

Color c = new Color(255, 255, 240);
Michael Petrotta
Don't you mean UTF-16?
sje397
@sje397: No, I mean [Modified UTF-8 and CESU-8](http://en.wikipedia.org/wiki/UTF-8#UTF-8_derivations).
Michael Petrotta
@Michael Petrotta: No, you mean UTF-16. It's true that Java also uses modified UTF-8, but not for the char type: it simply can't because that would require more than 16 bits for any character >= U+0800. Java 1.4 and earlier uses UCS-2, Java 1.5 switched to UTF-16.
Michael Madsen
@Stephen, @Michael: of course you're right - logically, it has to be so. I've edited my answer.
Michael Petrotta
The hint to use bytes instead of chars is right, however I don't see how `java.awt.Color` is supposed to help him with the task of actually reading the RGB values from the stream.
Grodriguez
@Grodriguez: I imagine that the OP is printing the color channel values just to see what they are, or as a debugging step. He may want to do something with the channels later, and that something may be easier if he's handling the values using idiomatic Java.
Michael Petrotta
That's right, however the way you phrased it is somewhat misleading -- "`byte[3]` or `java.awt.Color`". One thing is directly related to this problem, the other a 'suggestion that may be useful later.'
Grodriguez
@Grodriguez: Something wrong with "a suggestion that may be useful later" (especially when I explicitly call it out as a suggestion)?
Michael Petrotta
Nothing more than what I already explained in my comment above.
Grodriguez
Oops, it seems that byte in Java is signed, so when I try to create the Color object I get values < 0 and this throws an exception. How can I force the values to be unsigned?
Matthieu
int red = 0xFF
KarlP
(and don't use a Reader...)
KarlP
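
The `0xFF` in the comment above presumably refers to masking the byte. A minimal sketch of that idea, assuming the three channel bytes are already in a `byte[]` (the class and method names here are hypothetical):

```java
public class UnsignedBytes {
    // Masks off the sign extension so the result is always in the range 0..255.
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte[] rgb = { (byte) 66, (byte) 233, (byte) 134 };
        System.out.println(rgb[1]);                 // -23: the signed view of the byte
        System.out.println(toUnsigned(rgb[1]));     // 233: the unsigned view
        System.out.println("r:" + toUnsigned(rgb[0])
                + " g:" + toUnsigned(rgb[1])
                + " b:" + toUnsigned(rgb[2]));
    }
}
```

The masked ints fall in the 0 to 255 range that the `Color(int, int, int)` constructor accepts. On Java 8 and later, `Byte.toUnsignedInt(b)` does the same thing.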
+6  A: 

There is no read(char[], int, int) method on InputStream. You must be calling that on a Reader subclass (such as InputStreamReader). Unless you specify a charset, InputStreamReader converts bytes to characters using the platform default character encoding, which in your case looks like windows-1252.

The character you received, 8224, is Unicode character U+2020 Dagger '†'. This was probably translated from byte 0x86 (134) using the windows-1252 character encoding.

If you're reading data that isn't text, you need to make sure that you don't read it with a subclass of Reader but use a subclass of InputStream instead. Alternatively, you can use an InputStreamReader and specify a character encoding like ISO-8859-1 that will map every byte to a char with the same numeric value.
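
A quick sketch to check this explanation, assuming the byte on the wire really was 0x86 (that is a guess from the output, and the class name `DaggerDemo` is only for the example). It decodes the same byte with both encodings:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DaggerDemo {
    // Decodes a single byte with the given charset and returns the resulting char as an int.
    static int decode(byte b, Charset cs) {
        return new String(new byte[] { b }, cs).charAt(0);
    }

    static int asWindows1252() {
        return decode((byte) 0x86, Charset.forName("windows-1252"));
    }

    static int asLatin1() {
        return decode((byte) 0x86, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println(asWindows1252());  // 8224, i.e. U+2020 DAGGER
        System.out.println(asLatin1());       // 134, same numeric value as the byte
    }
}
```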

Adrian Pronk
To clarify: Find where variable "inputStream" [sic] is created; most likely it's created using socket.getInputStream(). Remove the wrapping InputStreamReader and make sure that the raw InputStream is used. Read into a byte[] instead. Make sure that the protocol actually sends 3 bytes in the order you think. And finally, consider using Color.
KarlP
Thanks +1. (I had no idea char was 16 bits, so I accepted the other answer)
Matthieu
+2  A: 

As people have already pointed out, you want to read bytes, not chars (chars are 16 bits in Java), and make sure you are actually using an InputStream and not a Reader.

I also want to point out something that is not directly related to your question: When calling InputStream.read(byte[]) or InputStream.read(byte[], int, int) to read several bytes, do not assume that all requested bytes have been read upon return. The call to read may return as soon as some bytes are available. You should always check the return value to find out how many bytes have actually been read.

The same applies to the read methods in Reader.
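
A minimal sketch of such a loop, under the assumption that each pixel arrives as exactly 3 bytes. The `trickle` helper is a hypothetical stand-in for a socket stream that hands out one byte per call, just to show a short read.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class ReadFullyDemo {
    // Keeps calling read() until buf is full; a single read() may return fewer bytes.
    static void readFully(InputStream in, byte[] buf) throws IOException {
        int off = 0;
        while (off < buf.length) {
            int n = in.read(buf, off, buf.length - off);
            if (n < 0) {
                throw new EOFException("stream ended after " + off + " bytes");
            }
            off += n;
        }
    }

    // Hypothetical stand-in for a socket stream: returns at most one byte per read call.
    static InputStream trickle(byte[] data) {
        return new ByteArrayInputStream(data) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, 1));
            }
        };
    }

    // Reads one 3-byte pixel and returns the channels as unsigned ints.
    static int[] readRgb(InputStream in) {
        byte[] buf = new byte[3];
        try {
            readFully(in, buf);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new int[] { buf[0] & 0xFF, buf[1] & 0xFF, buf[2] & 0xFF };
    }

    public static void main(String[] args) {
        int[] rgb = readRgb(trickle(new byte[] { 66, (byte) 233, (byte) 134 }));
        System.out.println("r:" + rgb[0] + " g:" + rgb[1] + " b:" + rgb[2]);
    }
}
```

If you would rather not write the loop yourself, `DataInputStream.readFully(byte[])` in the standard library behaves the same way.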

Grodriguez