Internally, write(int) will just cast its parameter to char, so write(i) is equivalent to write((char)i).
Now in Java, internally char is just an unsigned 16-bit integer type, with the range 0-65535. The cast int -> char is a "narrowing primitive conversion" (Java Language Specification, 5.1.3), and int is a signed integer, hence:
A narrowing conversion of a signed
integer to an integral type T simply
discards all but the n lowest order
bits, where n is the number of bits
used to represent type T. In addition
to a possible loss of information
about the magnitude of the numeric
value, this may cause the sign of the
resulting value to differ from the
sign of the input value.
That's why the Javadoc says that only the lower two bytes are written.
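To make the truncation concrete, here is a minimal sketch. The class name NarrowingDemo and the helper written() are made up for illustration, and StringWriter is used only because it lets us inspect what was written:

```java
import java.io.StringWriter;

class NarrowingDemo {
    // Hypothetical helper: returns what a StringWriter receives from write(int).
    static String written(int value) {
        StringWriter w = new StringWriter();
        w.write(value); // only the 16 low-order bits survive
        return w.toString();
    }

    public static void main(String[] args) {
        int big = 0x12345;   // needs 17 bits
        char c = (char) big; // the high bits are discarded
        System.out.println(Integer.toHexString(c));      // 2345
        System.out.println(written(big).charAt(0) == c); // true
        System.out.println((int) (char) -1);             // 65535
    }
}
```

Note how -1 turns into 65535: the sign is lost, exactly as the quoted paragraph warns.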
Now, what this means in terms of characters depends on how you want to interpret the int values. A char in Java represents a UTF-16 code unit. For characters in the Basic Multilingual Plane (BMP), the 16-bit number held by the char is simply the number of the Unicode code point, so if each of your int values is a code point in the BMP, you're fine. Characters in the supplementary planes are encoded as two chars (a surrogate pair), so a single write(int) cannot write them. If the value is anything else (including a code point with more than 16 bits, or a negative number, or something else entirely), you'll get garbage.
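If your int values might be code points outside the BMP, one way to handle them is to expand each code point with Character.toChars before writing. This is only a sketch, assuming your ints really are Unicode code points; the class and method names are invented for the example:

```java
import java.io.StringWriter;

class CodePointDemo {
    // Hypothetical helper: passes the code point straight to write(int).
    static String viaWriteInt(int codePoint) {
        StringWriter w = new StringWriter();
        w.write(codePoint); // truncated to 16 bits
        return w.toString();
    }

    // Hypothetical helper: expands the code point to one or two chars first.
    static String viaToChars(int codePoint) {
        StringWriter w = new StringWriter();
        char[] units = Character.toChars(codePoint); // surrogate pair if needed
        w.write(units, 0, units.length);
        return w.toString();
    }

    public static void main(String[] args) {
        int cp = 0x1F600; // a supplementary-plane code point

        String naive = viaWriteInt(cp);
        System.out.println(naive.length());                       // 1
        System.out.println(Integer.toHexString(naive.charAt(0))); // f600

        String full = viaToChars(cp);
        System.out.println(full.length());                        // 2
        System.out.println(full.codePointAt(0) == cp);            // true
    }
}
```

Character.toChars throws an IllegalArgumentException for values that are not valid code points, so negative numbers and other garbage are at least detected instead of being silently truncated.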
What effect, if any, does this have on
writing non-utf8 chars which have been
cast to an int?
There is no such thing as a "non-utf8 char". UTF-8 is an encoding, that is, a way to represent Unicode code points as bytes, so the question as posed is meaningless. Maybe you could explain what your code does?
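To illustrate the difference between a char and its encoding, here is a small sketch (names are made up for the example) that writes the same int through an OutputStreamWriter with two different charsets. The char is the same in both cases; only the bytes differ:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingDemo {
    // Hypothetical helper: writes one int via write(int) and returns the bytes produced.
    static byte[] encode(int value, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, charset)) {
            w.write(value); // cast to char, then encoded by the writer's charset
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        int eAcute = 0xE9; // U+00E9, LATIN SMALL LETTER E WITH ACUTE
        System.out.println(encode(eAcute, StandardCharsets.UTF_8).length);      // 2
        System.out.println(encode(eAcute, StandardCharsets.ISO_8859_1).length); // 1
    }
}
```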