ansaurus

Question

Converting EBCDIC Char to Hex values (AFP EBCDIC data)

Answer 1

+2 A:

Yes, when you read the text data in as strings, it's storing it internally as Unicode. If you care about the binary values (i.e. the raw bytes) then don't decode it in the first place.

If you really need to do anything with a custom EBCDIC encoding, you can use my open source EBCDIC implementation - but I think you really just need to make up your mind as to whether you're treating this as binary data or text.
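
As a rough illustration of the binary-versus-text distinction, here is a minimal Python sketch. It is not the .NET library mentioned above: it uses Python's built-in `cp037` codec as a stand-in, and the code page of your actual data is an assumption you would need to confirm. The helper name `hex_values` is made up for this example.

```python
from typing import List


def hex_values(raw: bytes) -> List[str]:
    """Format each raw byte as two hex digits, with no decoding step involved."""
    return [f"{b:02X}" for b in raw]


# Stand-in for bytes read from a file opened in binary mode ("rb").
raw = bytes([0xC1, 0xC6, 0xD7])

# Binary view: the values are exactly what is stored in the file.
print(hex_values(raw))

# Text view: decode only when you actually want characters.
# cp037 is an assumption; AFP data may use a different EBCDIC code page.
print(raw.decode("cp037"))
```

The point of the sketch is only that the hex values come from the bytes themselves, so decoding to a string first and then asking for character codes gives Unicode values rather than the EBCDIC ones.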

Jon Skeet 2009-04-13 16:59:24

Answer 2

+2 A:

Be careful reading AFP data that way. It is big-endian in both byte and bit order. You will need to account for that if you are treating it as binary data, such as parsing through the Structured Fields in a document.
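
A small Python sketch of what that can look like in practice. It assumes "big-endian bit order" means the IBM convention in which bit 0 is the most significant bit of a byte; the helper names `read_u16_be` and `bit_is_set` are made up for this example.

```python
import struct


def read_u16_be(buf: bytes, offset: int = 0) -> int:
    """Read an unsigned 16-bit big-endian integer starting at offset."""
    return struct.unpack_from(">H", buf, offset)[0]


def bit_is_set(value: int, bit: int) -> bool:
    """Test one bit of a byte, counting bit 0 as the MOST significant bit."""
    if not 0 <= bit <= 7:
        raise ValueError("bit must be in the range 0..7")
    return bool(value & (0x80 >> bit))


print(read_u16_be(b"\x01\x02"))  # high byte first: 0x01 * 256 + 0x02
print(bit_is_set(0x80, 0))       # bit 0 is the top bit under this numbering
```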

R Ubben 2009-04-13 20:21:48

The structured fields data is what I'm trying to get. Thanks for the input

Tom Alderman 2009-04-13 21:46:24

Answer 3

+1 A:

You can do it like this:

Open the AFP file. Read the first 9 bytes.
Byte 0 should be 0xD3 or 0x5A. Bytes 1 and 2 hold the length of the SFI, including 8 of the 9 bytes you just read. It is big-endian, so length = byte1 * 256 + byte2.
Bytes 3, 4, and 5 are the Structured Field Identifier. If you're looking for printable text, look for PTX (Presentation Text Data), 0xD3 0xEE 0x9B. If you didn't find it, skip ahead length-8 bytes and read the next 9 bytes.
If you did find a PTX, read length-8 bytes. Parsing through the control sequences to get to the text is a little tricky. The first will start with 0x2B 0xD3, followed by a byte for the length and a byte for the kind of control sequence it is. If this last byte is an odd number, the next control sequence will omit the 0x2B 0xD3 header, starting with the length byte instead. This is called "chaining" and was apparently introduced to drive programmers trying to parse this stuff insane.
Skip ahead length-1 bytes from the length byte and press on, or just look for the next 0x2B 0xD3. The last control sequence will not be chained, and everything following it to the end of the PTX will be EBCDIC. Use Jon Skeet's library (thanks, Jon) to decode it, then look for the next PTX element.
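
The steps above can be sketched roughly as follows in Python. This is a sketch under stated assumptions, not a full AFP parser: it assumes every structured field starts with a 0x5A byte, it follows only the chaining rule as described here, and it decodes the trailing text with Python's built-in `cp037` codec as a stand-in for the EBCDIC library mentioned above. The function names, the `build_field` helper, and the sample bytes are all made up for the demo.

```python
from typing import Iterator, List, Tuple

PTX_ID = bytes([0xD3, 0xEE, 0x9B])  # identifier quoted in the answer for PTX
CS_PREFIX = bytes([0x2B, 0xD3])     # header of an unchained control sequence


def iter_structured_fields(data: bytes) -> Iterator[Tuple[bytes, bytes]]:
    """Yield (identifier, payload) pairs, assuming each field starts with 0x5A."""
    pos = 0
    while pos + 9 <= len(data):
        if data[pos] != 0x5A:
            raise ValueError(f"unexpected lead byte at offset {pos}")
        length = int.from_bytes(data[pos + 1:pos + 3], "big")  # byte1 * 256 + byte2
        if length < 8:
            raise ValueError(f"implausible length {length} at offset {pos}")
        # The length covers 8 of the 9 header bytes, so length - 8 payload bytes follow.
        yield data[pos + 3:pos + 6], data[pos + 9:pos + 1 + length]
        pos += 1 + length


def split_ptx(payload: bytes) -> Tuple[List[Tuple[int, bytes]], bytes]:
    """Return ([(type, data), ...], trailing_bytes) using the chaining rule."""
    sequences: List[Tuple[int, bytes]] = []
    pos = 0
    chained = False
    while pos < len(payload):
        if not chained:
            if payload[pos:pos + 2] != CS_PREFIX:
                break  # no further control sequence; the rest is treated as text
            pos += 2
        if pos + 2 > len(payload) or payload[pos] < 2:
            raise ValueError(f"malformed control sequence at offset {pos}")
        length, kind = payload[pos], payload[pos + 1]
        sequences.append((kind, payload[pos + 2:pos + length]))
        pos += length            # length is counted from the length byte itself
        chained = kind % 2 == 1  # an odd type byte means the next one is chained
    return sequences, payload[pos:]


def build_field(sf_id: bytes, payload: bytes) -> bytes:
    """Hypothetical helper: build a synthetic structured field for the demo."""
    length = 8 + len(payload)
    return b"\x5A" + length.to_bytes(2, "big") + sf_id + b"\x00\x00\x00" + payload


# Synthetic data: one non-PTX field (arbitrary identifier), then one PTX field
# holding a chained pair of control sequences followed by EBCDIC text.
ptx_payload = CS_PREFIX + bytes([0x03, 0xF1, 0xAA, 0x02, 0xF8]) + "HELLO".encode("cp037")
sample = build_field(bytes([0xD3, 0xA8, 0xA8]), b"") + build_field(PTX_ID, ptx_payload)

for sf_id, payload in iter_structured_fields(sample):
    if sf_id == PTX_ID:
        sequences, trailing = split_ptx(payload)
        print(sequences, trailing.decode("cp037"))
```

Real files are likely to need more care than this, for example when text bytes happen to contain 0x2B 0xD3 or when the data uses a code page other than cp037.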

Sorry I was long-winded. It is doable, but not simple.

R Ubben 2009-04-14 14:30:47
