I find getting Unicode support in my cross-platform apps to be a real pain in the butt.

I need strings that can go from C code, to a database, to a Java application, and into a Perl module. Each of these uses a different Unicode encoding (UTF-8, UTF-16) or some other code page. The biggest thing I need is a cross-platform way of doing conversions.

What kind of tools, libraries or techniques do people use to make handling these things easier?

+3  A: 

Have a look at this: http://www.icu-project.org/
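
ICU ships converters for all of the encodings you mention. As a minimal sketch (not an official ICU example), here is a UTF-16LE to UTF-8 conversion using ICU's C API, ucnv_convert; the sample input and buffer size are illustrative only:

    #include <stdio.h>
    #include <unicode/ucnv.h>

    int main(void) {
        UErrorCode status = U_ZERO_ERROR;
        const char utf16[] = "h\0\xe9\0l\0l\0o\0"; /* "héllo" in UTF-16LE */
        char utf8[32];

        /* One call converts between any two named encodings. */
        int32_t len = ucnv_convert("UTF-8", "UTF-16LE",
                                   utf8, sizeof utf8,
                                   utf16, sizeof utf16 - 1,
                                   &status);
        if (U_FAILURE(status)) {
            fprintf(stderr, "conversion failed: %s\n", u_errorName(status));
            return 1;
        }
        printf("%.*s\n", (int)len, utf8);
        return 0;
    }

Link against libicuuc (e.g. gcc conv.c -licuuc); the exact flags vary by platform.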

Sebastian
A: 

How are you doing the cross-platform calls? Is it all called from Java?

http://java.sun.com/docs/books/tutorial/i18n/text/string.html might be useful.

I'm a bit confused about exactly what you are trying to do. Is the database essentially the interface between all the code? Then it should be easy - just make the DB UTF-8, and each of the clients will need to do their own conversions.

Sounds like an interesting problem, could you share some more details?

Adrian Mouat
+1  A: 

Perl has Encode as a standard library. It can be used to read/write any encoding you want, so that's not going to be a problem.

Leon Timmermans
A: 

Well, I guess iconv is sufficient for your needs (a minimal sketch follows the list below). iconv should be available by default on any POSIX system (that includes (GNU/)Linux, *BSD, Mac OS X...). On Windows, AFAIK, it requires a separate library, but:

  1. You may just install it, bundle it with your software, or statically compile it (libiconv for Windows); I'd recommend bundling it.
  2. You may use native Windows calls (e.g. MultiByteToWideChar/WideCharToMultiByte) as a special case.
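
Here is the sketch mentioned above, assuming a POSIX system: a UTF-16LE to UTF-8 conversion with iconv(3). The sample input is made up, and a production version needs a loop to handle E2BIG and partial sequences:

    #include <iconv.h>
    #include <stdio.h>

    int main(void) {
        /* tocode first, fromcode second */
        iconv_t cd = iconv_open("UTF-8", "UTF-16LE");
        if (cd == (iconv_t)-1) {
            perror("iconv_open");
            return 1;
        }

        char in[] = "h\0i\0";          /* "hi" in UTF-16LE */
        char out[16];
        char *inp = in, *outp = out;
        size_t inleft = sizeof in - 1; /* drop the literal's trailing NUL */
        size_t outleft = sizeof out;

        if (iconv(cd, &inp, &inleft, &outp, &outleft) == (size_t)-1) {
            perror("iconv");
            iconv_close(cd);
            return 1;
        }
        printf("%.*s\n", (int)(sizeof out - outleft), out);
        iconv_close(cd);
        return 0;
    }

On glibc systems iconv is part of libc; elsewhere (Mac OS X, the Windows port) link with -liconv.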

Of course, if you are using Java, it has this built in - but I can see that it may not be what you want (JNI calls are expensive).

PS. Can't you set Perl to use a specific encoding?

Maciej Piechotka