I want to write a function

String getName(int codePoint) {
    // ????
}

which will return the standard name given to the character that the given code point represents. For example

getName(0);

would return the String "NULL" and

getName(33);

would return the String "EXCLAMATION POINT".

Is there anything in the JDK for this?

+1  A: 

Can this be of any help?

Ashalynd
A: 

No, the standard JDK does not come with that information. It can decode and encode in various schemes, but including all the character names in the distribution would require a huge download.

As pointed out, the SPECIALIST Lexical Tools might help. Beware: it is an 830 MB download.

Marcelo Morales
+2  A: 

The Lexical Tools (Java 6.0, UTF-8, 2009 release) mentioned by Ashalynd looks like it has a Get Unicode Name feature. However, behind the scenes it uses ICU4J from the ICU Project. ICU4J has a UCharacter.getName() function that may be of use.

Kevin Hakanson
+2  A: 

Part of the Unicode standard is a file, UnicodeData-<Version>.txt (the download from unicode.org does not work), that contains each character's name (along with reading direction, lowercase and uppercase mappings, etc.):

0021;EXCLAMATION MARK;Po;0;ON;;;;;N;;;;;

So 0x21 is called EXCLAMATION MARK by the Unicode standard.
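As a rough illustration of reading that format, here is a minimal sketch. It assumes each record is semicolon-separated with the hexadecimal code point in the first field and the name in the second, as in the line above. The class name `UnicodeNames` and its methods are made up for this example, and it does not handle special entries such as `<control>` or the `First`/`Last` range markers.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: builds a code point -> name table from UnicodeData-style records.
class UnicodeNames {
    private final Map<Integer, String> names = new HashMap<>();

    // Parses one record; field 0 is the hex code point, field 1 is the name.
    void addRecord(String line) {
        String[] fields = line.split(";", -1);
        if (fields.length < 2 || fields[0].isEmpty()) {
            return; // skip blank or malformed lines
        }
        names.put(Integer.parseInt(fields[0], 16), fields[1]);
    }

    // Reads every line from the given source (for example a FileReader on a local copy).
    void load(Reader source) throws IOException {
        BufferedReader in = new BufferedReader(source);
        String line;
        while ((line = in.readLine()) != null) {
            addRecord(line);
        }
    }

    // Returns the recorded name, or null if the code point was not in the data.
    String getName(int codePoint) {
        return names.get(codePoint);
    }

    public static void main(String[] args) throws IOException {
        UnicodeNames table = new UnicodeNames();
        table.load(new StringReader("0021;EXCLAMATION MARK;Po;0;ON;;;;;N;;;;;\n"));
        System.out.println(table.getName(0x21));
    }
}
```

In real use the `StringReader` would be replaced by a reader over a local copy of the file.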

This could be buried somewhere in the JRE in some form, as it is needed to convert characters to lower and upper case.
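On newer JDKs the names are in fact exposed directly: since Java 7 there is `Character.getName(int)`. A minimal sketch, assuming Java 7 or later (the wrapper class `CharacterNameDemo` is only for illustration):

```java
class CharacterNameDemo {
    // Delegates to the JDK; returns null for unassigned code points and
    // throws IllegalArgumentException for invalid ones.
    static String getName(int codePoint) {
        return Character.getName(codePoint);
    }

    public static void main(String[] args) {
        System.out.println(getName(0x21)); // EXCLAMATION MARK
        System.out.println(getName('A'));  // LATIN CAPITAL LETTER A
    }
}
```

On older JDKs this method does not exist, so ICU4J or a hand-built table remains the fallback.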

Thomas Jung
Or you could use that file to roll your own getName function.
Kevin Hakanson