"Better" primarily means accuracy, but I am also interested in any other criteria in which other systems excel. I sampled the Perl binding Text::Kakasi for correctness in an admittedly limited fashion, and it works just fine for our needs.

use utf8;
use Encode;
use Text::Kakasi;
use Unicode::Collate;

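# -iutf8/-outf8: UTF-8 input and output; -JH: convert kanji to hiragana.
my $k = Text::Kakasi->new(qw(-iutf8 -outf8 -JH));
my $c = Unicode::Collate->new;

# Schwartzian transform: pair each line with its kana reading,
# sort on the reading, then print the original line.
print encode_utf8 $_ for
    map  { $_->[0] }
    sort { $c->cmp($a->[1], $b->[1]) }
    map  { [$_, $k->get($_)] }
    <DATA>;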

__DATA__
アメリカ合衆国
アラブ首長国連邦
ロシア連邦
中国
南アフリカ共和国
日本
北京(ペキン)
大阪
東京

Authoritative answers only, please.

A: 

I am not sure what you mean by 'authoritative'.

But I can say that KAKASI is a well-known, freely available library and is not obsolete today.

If you can convert kanji strings to hiragana (or katakana) strings with KAKASI, the resulting sort order should be fine.
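To illustrate the idea of sorting on a kana reading rather than on the kanji itself, here is a minimal sketch. It does not call KAKASI: the `%reading` table is a hypothetical, hand-written stand-in for whatever a kanji-to-kana converter would return, and `sort_by_reading` is a helper name made up for this example. Only the core module Unicode::Collate is used.

```perl
use strict;
use warnings;
use utf8;
use Unicode::Collate;

# Hypothetical readings, hard-coded for illustration only.
# In practice these would come from a kanji-to-kana converter.
my %reading = (
    '日本' => 'にほん',
    '中国' => 'ちゅうごく',
    '東京' => 'とうきょう',
    '大阪' => 'おおさか',
);

my $collator = Unicode::Collate->new;

# Sort words by their kana reading; a word without a known
# reading falls back to being compared as-is.
sub sort_by_reading {
    my ($readings, @words) = @_;
    return map  { $_->[0] }
           sort { $collator->cmp($a->[1], $b->[1]) }
           map  { [$_, $readings->{$_} // $_] }
           @words;
}

binmode STDOUT, ':encoding(UTF-8)';
print "$_\n" for sort_by_reading(\%reading, keys %reading);
```

With these four entries the words come out in the order 大阪, 中国, 東京, 日本, which follows the readings おおさか, ちゅうごく, とうきょう, にほん. The quality of the result depends entirely on how accurate the readings are, which is the part a converter would have to supply.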

http://www.utf8-chartable.de/unicode-utf8-table.pl

kmugitani
I was not asking whether the kakasi library is obsolete, but whether there is something better.
daxim