We have a Japanese client that has COBOL source code on a mainframe. He claims the code on the mainframe is represented in Shift-JIS2 (and we think we understand that pretty well). When that code is transferred to a PC, what is the most common encoding used? We've sent him a program to process that COBOL code and it seems to choke. The customer won't give us the code directly, so experiments are hard. His experiments seem to indicate UTF-8; I assume the Japanese characters encodable in Shift-JIS2 are converted to their Unicode equivalents. Does anybody have any experience here?

EDIT: I think we solved our mystery. The client is (duh!) using CP-932 ("ShiftJIS") on the PC, but his COBOL program has Japanese characters in the identifiers, and that's why our tool is choking.
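For anyone in the same position (unable to see the file, but able to ask the client to run something), a rough check like the following can distinguish UTF-8 from CP-932. This is only a sketch: the function name `guess_encoding` is made up for illustration, and it relies on the assumption that Shift-JIS text with Japanese characters is almost never valid UTF-8, which holds in practice but is not a guarantee.

```python
def guess_encoding(data: bytes) -> str:
    """Return 'ascii', 'utf-8', 'cp932', or 'unknown' for a byte string.

    Heuristic only: tries strict decoding in a fixed order and reports
    the first codec that accepts every byte.
    """
    # Pure 7-bit data is identical in ASCII, UTF-8, and CP-932.
    if all(b < 0x80 for b in data):
        return "ascii"
    # UTF-8 is tried first because it is the stricter format; CP-932
    # accepts many byte sequences that are not valid UTF-8.
    for name in ("utf-8", "cp932"):
        try:
            data.decode(name)
            return name
        except UnicodeDecodeError:
            continue
    return "unknown"


if __name__ == "__main__":
    # Hypothetical sample; a real check would read the file in binary mode.
    sample = "日本語".encode("cp932")
    print(guess_encoding(sample))
```

The result says nothing about how the mainframe stored the text, only about the bytes that arrived on the PC.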

EDIT: Followup: A bit more of a surprise. SHIFT-JIS text often encodes what we think of as ASCII text as so-called "FULLWIDTH" characters, which take the same screen space as an East Asian ideograph; conventional ASCII characters act as half-width. So, there's a FULLWIDTH "A", "B", ... "Z" as well as a FULLWIDTH "-". Apparently, to process Japanese COBOL, our COBOL parser has to accept not only Western ASCII, but also the FULLWIDTH equivalents, especially the FULLWIDTH letters and, surprisingly, a FULLWIDTH HYPHEN used to separate "letters" in a COBOL identifier.
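To make the FULLWIDTH relationship concrete: once the text is decoded to Unicode, the fullwidth forms U+FF01 through U+FF5E mirror ASCII U+0021 through U+007E at a fixed offset, so they can be folded with a simple translation table. The sketch below is an illustration, not what our parser does; `fold_fullwidth` is a hypothetical helper, and it assumes the hyphen arrives as U+FF0D FULLWIDTH HYPHEN-MINUS, which may depend on the conversion table used to decode the file.

```python
# Fullwidth forms U+FF01..U+FF5E correspond to ASCII U+0021..U+007E,
# each exactly 0xFEE0 code points higher than its ASCII counterpart.
FULLWIDTH_TO_ASCII = {0xFF01 + i: 0x21 + i for i in range(0x5E)}
# IDEOGRAPHIC SPACE (U+3000) is the fullwidth counterpart of the space.
FULLWIDTH_TO_ASCII[0x3000] = 0x20


def fold_fullwidth(text: str) -> str:
    """Replace fullwidth ASCII variants with their ASCII equivalents.

    Other characters (kanji, kana, ordinary ASCII) pass through unchanged.
    """
    return text.translate(FULLWIDTH_TO_ASCII)


if __name__ == "__main__":
    # FULLWIDTH "ABC-X" written with escapes so the code points are explicit.
    name = "\uff21\uff22\uff23\uff0d\uff38"
    print(fold_fullwidth(name))
```

Folding the whole source file this way would also alter string literals, so a real tool would more likely fold only while classifying characters in the lexer and keep the original text intact.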

EDIT: IBM Enterprise COBOL allows DBCS characters in identifiers. Yikes!

+2  A: 

There are three encodings that are all still very much in use in Japan: EUC-JP, ISO-2022-JP, and Shift-JIS.

ISO-2022-JP is usually used for e-mail, while you'll see EUC-JP on Unix machines. I personally haven't dealt with anything other than Shift-JIS though. (Nor mainframes.)

wm_eddie
You get the nod, for saying the obvious which I guess we didn't believe :-{
Ira Baxter
See edits on my original question for complications involving FULLWIDTH characters.
Ira Baxter
A: 

Source code is a text file. How is the text file copied? Is the mainframe running Linux with a Samba server? An FTP server? A web server? nfs?

What is the client running? Windows explorer? An FTP command? A web browser? nfs?

Windows programmer
I'm asking pointed questions about precisely how the file was transferred. More when I get some data...
Ira Baxter
... it appears that the customer is using Windows code page 932 (Shift-JIS) on the PC.
Ira Baxter