The String class has a constructor:
new String(byte[] bytes, Charset charset)
and a method:
byte[] getBytes(Charset charset)
Given that I define my charset as follows:
Charset charset = Charset.forName("UTF-8");
What kind of encoding will I in fact be using? More specifically, is it standard UTF-8 (as described in RFC 3629), CESU-8, or Modified UTF-8? (See also the corresponding Wikipedia article.)
If it is not standard UTF-8, is there a library that allows String operations in UTF-8?
A converter between these UTF-8-derived encodings would be more than welcome!
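For reference, the three encodings can be told apart by looking at the bytes produced for two probe inputs: a character outside the BMP (a surrogate pair in a Java String) and U+0000. Below is a minimal sketch of such a check; the class name `Utf8Probe` and its helper methods are made up for illustration, and only `Charset.forName` and `String.getBytes(Charset)` come from the question itself.

```java
import java.nio.charset.Charset;

public class Utf8Probe {
    // Encode with the charset defined exactly as in the question.
    static byte[] encode(String s) {
        Charset charset = Charset.forName("UTF-8");
        return s.getBytes(charset);
    }

    // Render bytes as space-separated uppercase hex, e.g. "F0 9F".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+1F600 is stored as the surrogate pair D83D DE00.
        //   standard UTF-8 (RFC 3629): one 4-byte sequence, F0 9F 98 80
        //   CESU-8 and Modified UTF-8: each surrogate encoded separately,
        //                              ED A0 BD ED B8 80 (6 bytes)
        System.out.println(hex(encode("\uD83D\uDE00")));

        // U+0000:
        //   standard UTF-8 and CESU-8: the single byte 00
        //   Modified UTF-8:            the two bytes C0 80
        System.out.println(hex(encode("\0")));
    }
}
```

Comparing the two printed lines against the byte patterns in the comments identifies which of the three encodings is in use.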