ansaurus

Question

Answer 1

+6 A:

Assuming your target system is an IBM mainframe or midrange, its JVM has full built-in support for all of the EBCDIC encodings, exposed as encodings named CPxxxx that correspond to the IBM CCSIDs (CP stands for code page). You will need to do your translations on the host side, since the client side may not have the necessary encoding support.

Since Unicode goes beyond DBCS and supports every known character, you will likely be targeting multiple EBCDIC encodings, so you will probably need to make those encodings configurable in some way. Try to keep your client Unicode-only (UTF-8, UTF-16, etc.), with the translations done as data arrives on and/or leaves the host system.

Other than needing to do translations host-side, the mechanics are the same as any Java translation, e.g. new String(bytes, encoding) and String.getBytes(encoding), plus the various NIO and writer classes. There's really no magic - it's no different from translating between, say, ISO 8859-x and Unicode, or any other SBCS (or limited DBCS).

For example:

byte[] ebcdta="Hello World".getBytes("CP037");  // get bytes for EBCDIC codepage 37
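To tie this back to the original UTF-8 question, here is a minimal sketch of the full path: decode the UTF-8 bytes into a String, then encode that String as EBCDIC. The class and method names are made up for illustration, and it assumes the runtime ships the IBM037 charset (full JDKs normally do, trimmed-down runtimes may not).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class Utf8ToEbcdic {
    // "IBM037" is the canonical Java name for code page 37 ("Cp037" is an alias).
    // Assumption: the extended charsets are present in this runtime.
    static final Charset EBCDIC_037 = Charset.forName("IBM037");

    /** Decodes UTF-8 bytes to a String, then encodes that String as EBCDIC. */
    static byte[] utf8ToEbcdic(byte[] utf8Bytes) {
        String text = new String(utf8Bytes, StandardCharsets.UTF_8);
        // Note: getBytes(Charset) silently substitutes a replacement byte
        // for characters the code page cannot represent.
        return text.getBytes(EBCDIC_037);
    }

    /** Decodes EBCDIC bytes back into a Java String. */
    static String ebcdicToString(byte[] ebcdicBytes) {
        return new String(ebcdicBytes, EBCDIC_037);
    }

    public static void main(String[] args) {
        byte[] utf8 = "Hello World".getBytes(StandardCharsets.UTF_8);
        byte[] ebcdic = utf8ToEbcdic(utf8);
        // 'H' is 0xC8 in code page 37 (it is 0x48 in ASCII/UTF-8).
        System.out.printf("first EBCDIC byte: 0x%02X%n", ebcdic[0] & 0xFF);
        System.out.println(ebcdicToString(ebcdic));
    }
}
```

Using Charset objects instead of encoding names also avoids the checked UnsupportedEncodingException that the String-name overloads throw.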

You can find more information on IBM's documentation website.

Software Monkey 2009-04-21 06:49:06

Answer 2

+2 A:

EBCDIC has many 8-bit code pages, and many of them are supported by the VM. Have a look at Charset.availableCharsets().keySet(); the EBCDIC pages are named IBM... (there are aliases such as cp500 for IBM500, as you can see from Charset.forName("IBM500").aliases()).
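A quick way to see what a given VM offers is to dump the charset names and their aliases. This is only a sketch: the class name is invented, and the filter is a plain name match, so it also picks up IBM code pages that are not EBCDIC (for example the PC code pages IBM437 and IBM850).

```java
import java.nio.charset.Charset;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

final class ListIbmCharsets {
    /** Canonical names of installed charsets whose name contains "IBM". */
    static List<String> ibmCharsetNames() {
        return Charset.availableCharsets().keySet().stream()
                // Some canonical names carry a prefix such as "x-IBM930",
                // so match anywhere in the name, not only at the start.
                .filter(name -> name.toUpperCase(Locale.ROOT).contains("IBM"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        for (String name : ibmCharsetNames()) {
            // aliases() lists alternative names such as "cp500" for IBM500.
            System.out.println(name + " aliases: " + Charset.forName(name).aliases());
        }
    }
}
```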

There are two problems:

1. If your text contains characters that belong to different EBCDIC code pages, this will not help.
2. I am not sure whether these charsets are available in any VM outside Windows.

For the first, have a look at this approach. For the second, try it on the desired target runtime ;-)
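Both problems can at least be detected at runtime. The sketch below is not the approach linked above, just one possible check: Charset.isSupported tells you whether the VM has the code page at all, and a CharsetEncoder set to REPORT fails loudly instead of silently substituting characters the code page cannot hold. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;

final class StrictEbcdicEncoder {
    /** True if the charset exists in this VM and can represent every character. */
    static boolean fitsCodePage(String text, String charsetName) {
        return Charset.isSupported(charsetName)
                && Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    /** Encodes text, throwing instead of substituting unmappable characters. */
    static byte[] encodeStrict(String text, String charsetName) {
        CharsetEncoder encoder = Charset.forName(charsetName).newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            ByteBuffer buf = encoder.encode(CharBuffer.wrap(text));
            byte[] out = new byte[buf.remaining()];
            buf.get(out);
            return out;
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException(
                    "Text cannot be represented in " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fitsCodePage("Hello", "IBM037"));
        // Chinese text does not fit the single-byte Latin code page 37.
        System.out.println(fitsCodePage("\u4F60\u597D", "IBM037"));
    }
}
```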

Arne Burmeister 2009-04-21 07:23:32

Answer 3

A:

For the midrange AS/400 (IBM i these days), the best bet is the IBM Toolbox for Java (jt400.jar), which does all of this transparently (perhaps with a little hinting).

Please note that inside Java a character is a 16-bit value (a UTF-16 code unit), not UTF-8; UTF-8 is an encoding that only applies once the text is turned into bytes.
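A small illustration of that distinction, with made-up class and method names: the same one-character String has a different byte length depending on which encoding you ask for, while its in-memory length stays at one char. It assumes IBM037 is available in the runtime.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharVsEncoding {
    /** Number of 16-bit chars (UTF-16 code units) in the String. */
    static int charCount(String s) {
        return s.length();
    }

    /** Number of bytes the String occupies once encoded with the given charset. */
    static int byteCount(String s, Charset charset) {
        return s.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String s = "\u00E9"; // e with acute accent, written as an escape
        System.out.println(charCount(s));                            // 1
        System.out.println(byteCount(s, StandardCharsets.UTF_8));    // 2
        System.out.println(byteCount(s, StandardCharsets.UTF_16BE)); // 2
        // Code page 37 is single-byte (assumes IBM037 is installed).
        System.out.println(byteCount(s, Charset.forName("IBM037"))); // 1
    }
}
```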

Thorbjørn Ravn Andersen 2009-04-21 14:27:39

Answer 4

+1 A:

You can always make use of the IBM Toolbox for Java (JTOpen), specifically the com.ibm.as400.access.AS400Text class in the jt400.jar.

It goes as follows:

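import com.ibm.as400.access.AS400Text;

int codePageNumber = 420;          // CCSID passed to the Toolbox converter
String codePage = "CP420";         // Java charset name for the same code page
String sourceUtfText = "Ahmad Yousef Saleh";

// The first argument is the field length in bytes; for a single-byte
// CCSID such as 420 that equals the number of characters.
AS400Text converter = new AS400Text(sourceUtfText.length(), codePageNumber);
byte[] bytesData = converter.toBytes(sourceUtfText);   // the EBCDIC bytes

// Decoding the bytes with the matching charset gives back a Java String.
String decodedText = new String(bytesData, codePage);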

I used code page 420 and its corresponding Java encoding name, CP420. This code page is used for Arabic text, so you should pick the code page suitable for your Chinese text.

Ahmad 2009-08-17 07:11:54

UTF-8 to EBCDIC in Java
