Hi,

I have an incoming file that will pass through a BizTalk mapper. I need to identify whether one of the fields in the file (the file is XML) contains a 3-byte Chinese character. I already have an idea of how to find the 3-byte character. However, how can I convert it into its hex value? The hex value is what I will send to the output schema and then on to a DB2 server.

Thanks.

A: 

I'm assuming you are dealing with UTF-8. Is that true?

If so, you want something like:

((c0 & 0x0F) << 12) | ((c1 & 0x3F) << 6) | (c2 & 0x3F)
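
As a rough illustration, here is a minimal Java sketch of that decoding step, assuming the three bytes really are one valid 3-byte UTF-8 sequence. The masks 0x0F and 0x3F keep only the payload bits of the lead byte and of each continuation byte. The class and method names (Utf8ThreeByte, decode3, toHex) are made up for this example, and whether DB2 expects the code point or the raw bytes in hex is something to confirm on your side.

```java
public class Utf8ThreeByte {
    // Combine the payload bits of a 3-byte UTF-8 sequence into one code point.
    // c0 is the lead byte (1110xxxx); c1 and c2 are continuation bytes (10xxxxxx).
    static int decode3(int c0, int c1, int c2) {
        return ((c0 & 0x0F) << 12) | ((c1 & 0x3F) << 6) | (c2 & 0x3F);
    }

    // Format a code point as an uppercase hex string of at least 4 digits.
    static String toHex(int codePoint) {
        return String.format("%04X", codePoint);
    }

    public static void main(String[] args) {
        // U+4E2D is encoded in UTF-8 as the bytes E4 B8 AD.
        int codePoint = decode3(0xE4, 0xB8, 0xAD);
        System.out.println(toHex(codePoint)); // prints 4E2D
    }
}
```

The same expression should carry over to C# almost unchanged if the conversion lives in a BizTalk scripting functoid.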

Scott Wisniewski
Thank you so much, Scott. I have some code that converts the character length of a char array into DBCS length. Can you help me identify where the figures came from? if (c[length] == 32) { c[length] = (char)12288; } if (c[length] < 127) { c[length] = (char)(c[i] + 65248); } How can I use the same approach to convert MBCS to DBCS?
lightyearsaway
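On the figures in the snippet above: 12288 is 0x3000, the Unicode ideographic (full-width) space, and 65248 is 0xFEE0, the offset between the printable ASCII characters 0x21 to 0x7E and their full-width forms U+FF01 to U+FF5E. The sketch below shows that mapping for a single char, assuming the intent is to widen ASCII to full-width. The names FullwidthSketch and toFullwidth are hypothetical, and unlike the original snippet it leaves control characters below 0x20 untouched.

```java
public class FullwidthSketch {
    // Map one half-width ASCII char to its full-width Unicode counterpart.
    static char toFullwidth(char ch) {
        if (ch == 0x20) {
            return (char) 0x3000;        // 12288: ideographic space
        }
        if (ch >= 0x21 && ch <= 0x7E) {
            return (char) (ch + 0xFEE0); // 65248: offset to U+FF01..U+FF5E
        }
        return ch;                       // anything else is left unchanged
    }

    public static void main(String[] args) {
        // 'A' is 0x41; 0x41 + 0xFEE0 = 0xFF21, the full-width 'A'.
        System.out.println(Integer.toHexString(toFullwidth('A')).toUpperCase());
    }
}
```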
The formula I gave is based on UTF-8, which is a particular type of multibyte character encoding. Is that the encoding scheme your characters are in? One way to check is to look at the 3 bytes of the multibyte character. The binary representation of the first one should start with 1110, and the binary representation of the next 2 should each start with 10. If that's not true, then you don't have UTF-8, and the code I showed you won't work. If it is true, then the code I showed you "removes" the UTF-8 "control bits" (1110, 10, 10) from the 3 bytes and combines the remaining bits into an int.
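The prefix check described above could be sketched in Java roughly as follows. The names Utf8LeadCheck and isThreeByteSequence are invented for this example, and it only tests the 1110 / 10 / 10 bit prefixes; it does not reject overlong encodings or surrogate code points, so treat it as a quick sanity check and not a full validator.

```java
public class Utf8LeadCheck {
    // True when the three bytes carry the bit prefixes 1110, 10 and 10.
    static boolean isThreeByteSequence(int c0, int c1, int c2) {
        return (c0 & 0xF0) == 0xE0   // lead byte:   1110xxxx
            && (c1 & 0xC0) == 0x80   // second byte: 10xxxxxx
            && (c2 & 0xC0) == 0x80;  // third byte:  10xxxxxx
    }

    public static void main(String[] args) {
        // E4 B8 AD is the UTF-8 encoding of U+4E2D.
        System.out.println(isThreeByteSequence(0xE4, 0xB8, 0xAD)); // prints true
        // 0x41 ('A') is a single-byte character, not a 3-byte lead byte.
        System.out.println(isThreeByteSequence(0x41, 0xB8, 0xAD)); // prints false
    }
}
```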
Scott Wisniewski
Thanks so much :)
lightyearsaway