Hi, I got a Unicode string from an external server like this:

005400610020007400650020007400ED0020007400FA0020003F0020003A0029

and I have to decode it using Java. I know that the '\u' prefix does the magic (e.g. '\u0054' -> 'T'), but I don't know how to transform it so that I can use it as an ordinary string.

Thanks in advance.

Edit: Thanks to everybody. All the answers work, but I had to choose only one :(

Again, thanks.

+2  A: 

You can simply split the String into substrings of length 4 and then use Integer.parseInt(s, 16) to get the numeric value of each one. Cast each value to a char and build a String out of the results. For the example above you will get:

Ta te tí tú ? :)
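
The steps described above could be sketched roughly as follows. This is only a minimal sketch, assuming the input consists of 4-digit hex groups, each holding one UTF-16 code unit; the names `HexChunks` and `fromHex` are made up for illustration.

```java
class HexChunks {

    // Decodes a string of 4-digit hex groups, one UTF-16 code unit per group.
    static String fromHex(String hex) {
        if (hex.length() % 4 != 0) {
            throw new IllegalArgumentException("Length must be a multiple of 4");
        }
        StringBuilder sb = new StringBuilder(hex.length() / 4);
        for (int i = 0; i < hex.length(); i += 4) {
            // Take the next 4 hex digits and parse them as a base-16 number.
            String chunk = hex.substring(i, i + 4);
            sb.append((char) Integer.parseInt(chunk, 16));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String hex = "005400610020007400650020007400ED0020007400FA0020003F0020003A0029";
        System.out.println(fromHex(hex));
    }
}
```

The length check is an addition of this sketch; without it, a truncated input would fail inside `substring` with a less helpful exception.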

Moritz
This will only work for UTF-16. Text encoded as UTF-8 will return garbage.
Eyal Schneider
The question suggests that the source is definitely UTF-16 encoded.
Philipp
@Philipp: you are right, it is implicitly mentioned when he says that '\u' does the magic. But I believe the title deserves a more generic answer :)
Eyal Schneider
+4  A: 

It looks like a UTF-16 encoding. Here is a method to transform it:

import java.io.UnsupportedEncodingException;

// The class name is arbitrary and chosen only for illustration.
public class HexDecoder {

    // Converts each pair of hex digits to one byte, then decodes the bytes with the given encoding.
    public static String decode(String hexCodes, String encoding) throws UnsupportedEncodingException {
        if (hexCodes.length() % 2 != 0) {
            throw new IllegalArgumentException("Illegal input length");
        }
        byte[] bytes = new byte[hexCodes.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) Integer.parseInt(hexCodes.substring(2 * i, 2 * i + 2), 16);
        }
        return new String(bytes, encoding);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String hexCodes = "005400610020007400650020007400ED0020007400FA0020003F0020003A0029";
        System.out.println(decode(hexCodes, "UTF-16"));
    }

}

For your example this prints "Ta te tí tú ? :)"
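
Because the byte-based approach takes the encoding as a parameter, the same idea should carry over to other encodings. The following is a sketch of a variation, not the method above verbatim: it assumes Java 7 or later and uses `java.nio.charset.StandardCharsets` constants instead of an encoding name, which avoids the checked exception. The class name `HexBytes` is made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class HexBytes {

    // Converts each pair of hex digits to a byte, then decodes with the given charset.
    static String decode(String hexCodes, Charset charset) {
        if (hexCodes.length() % 2 != 0) {
            throw new IllegalArgumentException("Illegal input length");
        }
        byte[] bytes = new byte[hexCodes.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) Integer.parseInt(hexCodes.substring(2 * i, 2 * i + 2), 16);
        }
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "tí" as big-endian UTF-16: 0074 00ED
        System.out.println(decode("007400ED", StandardCharsets.UTF_16BE));
        // "tí" as UTF-8: 74 C3 AD
        System.out.println(decode("74C3AD", StandardCharsets.UTF_8));
    }
}
```

Both calls in `main` decode to the same two-character string, even though the hex input differs per encoding.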

Eyal Schneider
+1  A: 

It can be interpreted as UTF-16 or as UCS-2 (a sequence of code points, each encoded in 2 bytes and written here in hexadecimal). The two are equivalent as long as the text does not fall outside the BMP (Basic Multilingual Plane). An alternative parsing method:

public static String mydecode(String hexCode) {
    StringBuilder sb = new StringBuilder();
    // Each group of 4 hex digits is one 16-bit code unit.
    // Note: the length must be a multiple of 4, otherwise substring() throws.
    for (int i = 0; i < hexCode.length(); i += 4) {
        sb.append((char) Integer.parseInt(hexCode.substring(i, i + 4), 16));
    }
    return sb.toString();
}

public static void main(String[] args) {
    String hexCodes = "005400610020007400650020007400ED0020007400FA0020003F0020003A0029";
    System.out.println(mydecode(hexCodes));
}
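
To illustrate the BMP remark: a Java char is itself a UTF-16 code unit, so if the input really is UTF-16, the 4-digit approach should also reassemble characters outside the BMP, since the two halves of a surrogate pair simply end up next to each other in the String. The sketch below assumes the hex input D83DDE00, which is the UTF-16 surrogate pair for U+1F600; the class and method names are made up for illustration.

```java
class SurrogateDemo {

    // Same 4-digit parsing as above: one UTF-16 code unit per group.
    static String decodeUnits(String hex) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < hex.length(); i += 4) {
            sb.append((char) Integer.parseInt(hex.substring(i, i + 4), 16));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = decodeUnits("D83DDE00");
        System.out.println(s.length());                            // number of chars (code units)
        System.out.println(s.codePointCount(0, s.length()));       // number of code points
        System.out.println(Integer.toHexString(s.codePointAt(0))); // the code point in hex
    }
}
```

Here the String has two chars but only one code point. If the source were strict UCS-2, surrogate values would have no such meaning, which is where the two interpretations diverge.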
leonbloy