The localization saga continues...

So I'm trying to support collation of Chinese text in my iPhone app, and after talking to a native Chinese speaker, I think I understand how the Chinese do it...

Let's say you had the string 巴拉克·奥巴马 and you wanted to figure out which section of the Chinese phonebook to put it in (in this example I'm ignoring first name/last name and just using the first character of the string)...

First you transliterate it into pinyin, which gives you "balake aobama". Then you collate based on the first character of that string: "b".

So the question is, how can I go from 巴拉克·奥巴马 to balake aobama using the iPhone SDK? It looks like the ICU library, which ships on the phone, can do this kind of transliteration, but I'm not sure if I can use it easily from my code, and even if I can, I don't know if the transliteration stuff is included in the build of ICU that comes on the phone.

If ICU is a no-go, does anyone have any better ideas?

+1  A: 

You pretty much have to do it with a lookup table. Remember that each hanzi has (at least) one reading, but the connection between the character and the way it's pronounced is irregular, and sometimes arbitrary.
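As a rough illustration of the lookup-table idea, here is a minimal Swift sketch. The names `pinyinTable`, `transliterate`, and `sectionKey` are made up for this example, and the table only holds the handful of characters from the question; a real table would have to be generated from a dictionary and would need some policy for characters with more than one reading.

```swift
// Tiny illustrative table: hanzi -> toneless pinyin reading.
// Assumption: one reading per character, which is not true in general.
let pinyinTable: [Character: String] = [
    "巴": "ba",
    "拉": "la",
    "克": "ke",
    "奥": "ao",
    "马": "ma",
    "·": " ",   // name separator, mapped to a space for readability
]

/// Replaces every character found in the table with its reading.
/// Characters that are not in the table pass through unchanged.
func transliterate(_ text: String) -> String {
    return text.map { pinyinTable[$0] ?? String($0) }.joined()
}

/// Returns the index section for a name: the first letter of its
/// transliteration, uppercased, or "#" when that is not an ASCII letter
/// (empty string, digits, or a hanzi missing from the table).
func sectionKey(for name: String) -> String {
    guard let first = transliterate(name).first,
          first.isASCII, first.isLetter else {
        return "#"
    }
    return String(first).uppercased()
}

print(transliterate("巴拉克·奥巴马"))
print(sectionKey(for: "巴拉克·奥巴马"))
```

With this table the example string from the question lands in section "B". Anything the table doesn't know about falls into a catch-all "#" bucket instead of being guessed at.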

Charlie Martin
Can you point me to any references I can use to make this lookup table? I'm pretty new to this pinyin stuff...
Mike Akers
Well, to some extent that's a Chinese dictionary. Have a look at CEDICT: http://www.mdbg.net/chindict/chindict.php?page=cedict
Charlie Martin
A: 

Pardon my ignorance, but doesn't -localizedCompare: or -compare:options:range:locale: between two NSStrings do this collation ordering work for you?

Ashley Clark
-localizedCompare: isn't doing the Right Thing here. I'll check out -compare:options:range:locale:
Mike Akers
A: 

In recent versions of the iPhone SDK (3.0 and later), the UILocalizedIndexedCollation class can do collation for Chinese and all other languages supported on the iPhone.

Mike Akers